K-Relations and Beyond

Although the theory of relational databases is highly developed and proves its usefulness in practice every day Garcia-Molina et al. (2008), there are situations where the relational model fails to offer adequate formal support, for instance when querying approximate data Hjaltason & Brooks (2003); Minker (1998) or data within a given range of distance or similarity Hjaltason & Brooks (2003); Patella & Ciaccia (2009). Examples of such similarity-search applications are databases storing images, fingerprints, audio clips or time sequences, text databases with typographical or spelling errors, and text databases where we look for documents that are similar to a given document. A core component of such cooperative systems is a treatment of imprecise data Hajdinjak & Mihelič (2006); Minker (1998).

At the heart of a cooperative database system is a database where the data domains come equipped with a similarity relation, to denote degrees of similarity rather than simply 'equal' and 'not equal'. This notion of similarity leads to an extension of the relational model where data can be annotated with, for instance, boolean formulas (as in incomplete databases) Calì et al. (2003); Van der Meyden (1998), membership degrees (as in fuzzy databases) Bordogna & Psaila (2006); Yazici & George (1999), event tables (as in probabilistic databases) Suciu (2008), timestamps (as in temporal databases) Jae & Elmasri (2001), sets of contributing tuples (as in the context of data warehouses and the computation of lineages or why-provenance) Cui et al. (2000); Green et al. (2007), or numbers representing the multiplicity of tuples (as in the context of bag semantics) Montagna & Sebastiani (2001). Querying such annotated or tagged relations involves the generalization of the classical relational algebra to perform corresponding operations on the annotations (tags).

There have been many attempts to define extensions of the relational model to deal with similarity querying. Most utilize fuzzy logic Zadeh (1965), and the annotations are typically modelled by a membership function to the unit interval, $[0, 1]$ Ma (2006); Penzo (2005); Rosado et al. (2006); Schmitt & Schulz (2004), although there are generalizations where the membership function instead maps to an algebraic structure of some kind (typically poset or lattice based) Belohlávek & Vychodil (2006); Peeva & Kyosev (2004); Shenoi & Melton (1989). Green et al. (2007) proposed a general data model (referred to as the K-relation model) for annotated relations. In this model tuples in a relation are annotated with a value taken from a commutative semiring, $\mathcal{K}$. The resulting positive relational algebra, $RA^+_K$, generalizes Codd's classic relational algebra Codd (1970), the bag algebra Montagna & Sebastiani (2001), the relational algebra on c-tables Imielinski & Lipski (1984), the probabilistic algebra on event tables Suciu (2008), and the provenance algebra Buneman et al. (2001); Cui et al. (2000). With relatively little work, the K-relation model is also suitable as a basis for similarity querying Hajdinjak & Bierman (2011).

Following Geerts & Poggi (2010); Green et al. (2007); Hajdinjak & Bierman (2011), we adopt the named-attribute approach, so a schema, $U = \{a_1 : \tau_1, \ldots, a_n : \tau_n\}$, is a finite map from attribute names $a_i$ to their types or domains $\tau_i$. We represent a U-tuple as a map $t = \{a_1 : v_1, \ldots, a_n : v_n\}$ from attribute names $a_i$ to values $v_i$ of the corresponding domain, i.e., where $v_i \in \tau_i$ for $i = 1, \ldots, n$. We denote the set of all U-tuples by $U\text{-Tup}$.

The K-relation model
Consider generalized relations in which the tuples are annotated (tagged) with information of various kinds. A notationally convenient way of working with annotated relations is to model tagging by a function on all possible tuples. Green et al. (2007) argue that the generalization of the positive relational algebra to annotated relations requires that the set of tags is a commutative semiring, $\mathcal{K} = (K, \oplus, \odot, 0, 1)$. A K-relation over a schema $U$ is then a function $A : U\text{-Tup} \to K$ with finite support, that is, $A(t) \neq 0$ for only finitely many tuples $t$.
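Taking this extension of relations, Green et al. proposed a natural lifting of the classical relational operators over K-relations. The tuples considered to be 'in' the relation are tagged with 1 and the tuples considered to be 'out of' the relation are tagged with 0. The binary operation $\oplus$ is used to deal with union and projection and therefore to combine different tags of the same tuple into one tag. The binary operation $\odot$ is used to deal with natural join and therefore to combine the tags of joinable tuples.

Definition 2.2 (Positive relational algebra on K-relations Green et al. (2007)). The operations of $RA^+_K$ are defined as follows.

Empty relation: For any set of attributes $U$, there is $\emptyset_U : U\text{-Tup} \to K$ such that $\emptyset_U(t) = 0$ for all U-tuples $t$.$^2$

Union: If $A, B : U\text{-Tup} \to K$, then $A \cup B : U\text{-Tup} \to K$ is defined by
$$(A \cup B)(t) = A(t) \oplus B(t).$$

Projection: If $A : U\text{-Tup} \to K$ and $V \subset U$, we write $f \downarrow V$ to be the restriction of the map $f$ to the domain $V$. The projection $\pi_V A : V\text{-Tup} \to K$ is defined by
$$(\pi_V A)(t) = \bigoplus_{t' \downarrow V = t,\ A(t') \neq 0} A(t').$$

Selection: If $A : U\text{-Tup} \to K$ and the selection predicate $P$ maps each U-tuple to either 0 or 1, then $\sigma_P A : U\text{-Tup} \to K$ is defined by
$$(\sigma_P A)(t) = A(t) \odot P(t).$$

Join: If $A : U_1\text{-Tup} \to K$ and $B : U_2\text{-Tup} \to K$, then $A \bowtie B$ is the K-relation over $U_1 \cup U_2$ defined by
$$(A \bowtie B)(t) = A(t \downarrow U_1) \odot B(t \downarrow U_2).$$

Note that in the case of projection, the sum is finite since $A$ has finite support.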
The power of this definition is that it generalizes a number of proposals for annotated relations and associated query algebras.
1. The classical relational algebra with set semantics Codd (1970) is given by the K-relational algebra on the boolean semiring $K_B = (\mathbb{B}, \vee, \wedge, \text{false}, \text{true})$.
2. The relational algebra with bag semantics Green et al. (2007); Montagna & Sebastiani (2001) is given by the K-relational algebra on the semiring of counting numbers $K_N = (\mathbb{N}, +, \cdot, 0, 1)$.
3. The Fuhr-Rölleke-Zimányi probabilistic relational algebra on event tables Suciu (2008) is given by the K-relational algebra on the semiring $K_{prob} = (\mathcal{P}(\Omega), \cup, \cap, \emptyset, \Omega)$, where $\Omega$ is a finite set of events and $\mathcal{P}(\Omega)$ is the powerset of $\Omega$.
4. The Imielinski-Lipski algebra on c-tables Imielinski & Lipski (1984) is given by the K-relational algebra on the semiring $K_{c\text{-}table} = (\text{PosBool}(X), \vee, \wedge, \text{false}, \text{true})$, where $\text{PosBool}(X)$ is the set of all positive boolean expressions over a finite set of variables $X$ in which any two equivalent expressions are identified.
5. The provenance algebra of polynomials with variables from $X$ and coefficients from $\mathbb{N}$ Cui et al. (2000); Green et al. (2007) is given by the K-relational algebra on the provenance semiring $K_{prov} = (\mathbb{N}[X], +, \cdot, 0, 1)$.

$^2$ As is standard, we drop the subscript on the empty relation where it can be inferred by context.
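The operators of the positive algebra translate almost directly into code. The following is a minimal Python sketch, not taken from Green et al.: it assumes that a K-relation is stored as a dictionary from tuples to their non-zero tags (so finite support is automatic), and the names `Semiring`, `tup`, `restrict`, `union`, `project`, `select` and `join` are hypothetical helpers introduced only for illustration.

```python
from collections import namedtuple

# A commutative semiring (K, plus, times, zero, one). Illustrative names only.
Semiring = namedtuple("Semiring", "plus times zero one")

BOOL = Semiring(lambda x, y: x or y, lambda x, y: x and y, False, True)  # set semantics
NAT = Semiring(lambda x, y: x + y, lambda x, y: x * y, 0, 1)             # bag semantics


def tup(**attrs):
    """Build a hashable U-tuple from attribute/value pairs."""
    return frozenset(attrs.items())


def restrict(t, attrs):
    """Restriction of the tuple t to the attributes in attrs."""
    return frozenset((a, v) for a, v in t if a in attrs)


def union(K, A, B):
    """Union: add the tags of A and B tuple by tuple."""
    out = dict(A)
    for t, tag in B.items():
        out[t] = K.plus(out.get(t, K.zero), tag)
    return out


def project(K, A, attrs):
    """Projection: sum the tags of all supported tuples with the same restriction."""
    out = {}
    for t, tag in A.items():
        key = restrict(t, attrs)
        out[key] = K.plus(out.get(key, K.zero), tag)
    return out


def select(K, A, pred):
    """Selection: multiply each tag by pred(t), which returns K.zero or K.one."""
    out = {t: K.times(tag, pred(t)) for t, tag in A.items()}
    return {t: tag for t, tag in out.items() if tag != K.zero}


def join(K, A, B):
    """Join: multiply the tags of every pair of tuples agreeing on shared attributes."""
    out = {}
    for ta, x in A.items():
        for tb, y in B.items():
            da, db = dict(ta), dict(tb)
            if all(da[a] == db[a] for a in da.keys() & db.keys()):
                out[frozenset({**da, **db}.items())] = K.times(x, y)
    return out


# Hypothetical example data with multiplicities as tags.
R = {tup(name="ann", city="Oslo"): 2, tup(name="bob", city="Oslo"): 3}
S = {tup(city="Oslo", country="NO"): 1}
cities = project(NAT, R, {"city"})   # both Oslo rows collapse into one tuple
doubled = union(NAT, R, R)           # union is not idempotent for bags
joined = join(NAT, R, S)
```

Under bag semantics `cities` tags the single tuple `{city: Oslo}` with 2 + 3 = 5, and `doubled` tags every tuple of `R` with twice its multiplicity; passing `BOOL` instead of `NAT` to the same functions gives ordinary set semantics.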
The positive relational algebra $RA^+_K$ satisfies many of the familiar relational equalities Ullman (1988).

Proposition 2.1 (Identities of K-relations Green et al. (2007); Hajdinjak & Bierman (2011)). The following identities hold for the positive relational algebra on K-relations:
• union is associative, commutative, and has identity $\emptyset$;
• selection distributes over union and product;
• join is associative, commutative and distributive over union;
• projection distributes over union and join;
• selections and projections commute with each other;
• selection with boolean predicates gives all or nothing, $\sigma_{false}(A) = \emptyset$ and $\sigma_{true}(A) = A$;
• join with an empty relation gives an empty relation, $A \bowtie \emptyset_U = \emptyset_U$ where $A$ is a K-relation over a schema $U$;
• projection of an empty relation gives an empty relation, $\pi_V(\emptyset) = \emptyset$.
It is important to note that the properties of idempotence of union, $A \cup A = A$, and self-join, $A \bowtie A = A$, are missing from this list. These properties fail for the bag semantics and provenance, so they fail to hold for the more general model.

Geerts & Poggi (2010) recently proposed extending the K-relation model by a difference operator following a standard approach for introducing a monus operator into an additive commutative monoid Amer (1984). First, they restricted the class of commutative semirings by requiring that every semiring additionally satisfy the following pair of conditions.

Definition 2.3 (GP-conditions Geerts & Poggi (2010)). A commutative semiring $\mathcal{K} = (K, \oplus, \odot, 0, 1)$ is said to satisfy the GP conditions if the following two conditions hold.
1. The preorder $\preceq$ on $K$, defined as $x \preceq y$ iff there exists a $z \in K$ such that $x \oplus z = y$, is a partial order.
2. For each pair of elements $x, y \in K$, the set $\{z \in K;\ x \preceq y \oplus z\}$ has a smallest element. (As $\preceq$ defines a partial order, this smallest element must be unique, if it exists.)

Definition 2.4 (Monus Geerts & Poggi (2010)). Let $\mathcal{K} = (K, \oplus, \odot, 0, 1)$ be a commutative semiring that satisfies the GP conditions. For any $x, y \in K$, we define $x \ominus y$ to be the smallest element $z$ such that $x \preceq y \oplus z$. A (commutative) semiring $\mathcal{K}$ that can be equipped with a monus operator $\ominus$ is called a semiring with monus or m-semiring.

Examples include (1) m-semirings that are a boolean algebra (i.e., a complemented distributive lattice with distinguished elements 0 and 1), for which the monus behaves like set difference, and (2) m-semirings that are the positive cone of a lattice-ordered commutative ring, for which the monus behaves like the truncated minus of the natural numbers.

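Lemma 2.2 (Example m-semirings Geerts & Poggi (2010)).
1. The boolean semiring, $K_B = (\mathbb{B}, \vee, \wedge, \text{false}, \text{true})$, is a boolean algebra. We have $x \ominus y = x \wedge \neg y$.
2. The semiring of counting numbers, $K_N = (\mathbb{N}, +, \cdot, 0, 1)$, is the positive cone of the ring of integers, $\mathbb{Z}$. The monus corresponds to the truncated minus, $x \ominus y = x \mathbin{\dot{-}} y = \max\{0, x - y\}$.
3. The probabilistic semiring, $K_{prob} = (\mathcal{P}(\Omega), \cup, \cap, \emptyset, \Omega)$, is a boolean algebra. The monus corresponds to set difference, $X \ominus Y = X \setminus Y$.
4. In the case of the semiring of c-tables, $K_{c\text{-}table} = (\text{PosBool}(X), \vee, \wedge, \text{false}, \text{true})$, the monus cannot be defined unless negated literals are added to the base set, in which case we get a boolean algebra. For any two expressions $\varphi_1, \varphi_2 \in \text{Bool}(X)$ we then have $\varphi_1 \ominus \varphi_2 = \varphi_1 \wedge \neg\varphi_2$, where negation $\neg$ over boolean expressions takes truth to falsity, and vice versa, and it interchanges the meet and the join operation.
5. The provenance semiring, $K_{prov} = (\mathbb{N}[X], +, \cdot, 0, 1)$, is the positive cone of the ring of polynomials from $\mathbb{Z}[X]$. The monus of two polynomials $f[X] = \sum_{\alpha \in I} f_\alpha x^\alpha$ and $g[X] = \sum_{\alpha \in I} g_\alpha x^\alpha$, where $I$ is a finite subset of $\mathbb{N}^n$, corresponds to $(f \ominus g)[X] = \sum_{\alpha \in I} (f_\alpha \mathbin{\dot{-}} g_\alpha)\, x^\alpha$, where $\dot{-}$ denotes the truncated minus on $\mathbb{N}$.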
Given an m-semiring, the positive relational algebra $RA^+_K$ can be extended with the missing difference operator as follows.

Definition 2.5 (Relational algebra on K-relations Geerts & Poggi (2010)). Let $\mathcal{K}$ be an m-semiring. The algebra $RA^+_K(\setminus)$ is obtained by extending $RA^+_K$ with the following operator.

Difference: If $A, B : U\text{-Tup} \to K$, then the difference $A \setminus B : U\text{-Tup} \to K$ is defined by
$$(A \setminus B)(t) = A(t) \ominus B(t).$$

Geerts and Poggi show that their resulting algebra coincides with the classical relational algebra, the bag algebra with the monus operator, the probabilistic relational algebra on event tables, the relational algebra on c-tables, and the provenance algebra.
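The monus-based difference can be sketched in a few lines of Python. This is an illustrative sketch under the assumption that relations are dictionaries from tuples to non-zero tags; `monus_nat`, `monus_bool`, `monus_set` and `difference` are hypothetical names, and the three monus functions follow the closed forms for the counting, boolean and probabilistic semirings.

```python
def monus_nat(x, y):
    """Truncated minus on the counting numbers."""
    return max(0, x - y)


def monus_bool(x, y):
    """Monus in the boolean semiring: x and not y."""
    return x and not y


def monus_set(x, y):
    """Monus in the powerset semiring of events: set difference."""
    return x - y


def difference(monus, zero, A, B):
    """Tag every tuple t with A(t) monus B(t) and drop tuples tagged zero."""
    out = {t: monus(tag, B.get(t, zero)) for t, tag in A.items()}
    return {t: tag for t, tag in out.items() if tag != zero}


# Hypothetical example relations, keyed by short tuple identifiers.
bags = difference(monus_nat, 0, {"a": 3, "b": 1}, {"a": 1, "b": 2})
sets = difference(monus_bool, False, {"a": True, "b": True}, {"b": True})
events = difference(monus_set, frozenset(),
                    {"a": frozenset({1, 2})}, {"a": frozenset({2})})
```

In the bag case tuple `a` keeps multiplicity 3 - 1 = 2 while `b` disappears, because its multiplicity in the second relation exceeds that in the first.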

The L-relation model
In this section we recall the definition of the L-relation model, the aim of which was to include similarity relations into the general K-relation framework of annotated relations.

Domain similarities
In a similarity context it is typically assumed that all data domains come equipped with a similarity relation or similarity measure.

Definition 3.1 (Similarity measures Hajdinjak & Bierman (2011)). Given a type $\tau$ and a commutative semiring $\mathcal{K} = (K, \oplus, \odot, 0, 1)$, a similarity measure is a function $\rho : \tau \times \tau \to K$ such that $\rho$ is reflexive, i.e., $\rho(x, x) = 1$.
Following earlier work Shenoi & Melton (1989), only reflexivity of the similarity measure was required. Other properties don't hold in general Hajdinjak & Bauer (2009). For example, symmetry does not hold when similarity denotes driving distance between two points in a town because of one-way streets. Another property is transitivity, but there are a number of non-transitive similarity measures, e.g. when similarity denotes likeness between two colours.
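The one-way-street example can be made concrete with a small computation. The sketch below assumes a hypothetical town with three junctions joined by a one-way ring road and computes driving distances with the Floyd-Warshall algorithm; the function name `driving_distance` and the data are illustrative only. Reflexivity here means distance 0 from a point to itself, since 0 is the unit element of the distance semiring.

```python
import math


def driving_distance(nodes, one_way_streets):
    """All-pairs shortest driving distances (Floyd-Warshall) on a directed graph."""
    dist = {(u, v): (0 if u == v else math.inf) for u in nodes for v in nodes}
    for (u, v), length in one_way_streets.items():
        dist[u, v] = min(dist[u, v], length)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                dist[i, j] = min(dist[i, j], dist[i, k] + dist[k, j])
    return dist


# Hypothetical town: junctions a, b, c connected by a one-way ring road.
nodes = ["a", "b", "c"]
rho = driving_distance(nodes, {("a", "b"): 1, ("b", "c"): 1, ("c", "a"): 1})
```

Here `rho["a", "b"]` is 1 but `rho["b", "a"]` is 2, because the return trip has to go around the ring, so the measure is reflexive without being symmetric.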
Allowing only K-valued similarity relations, Hajdinjak & Bierman (2011) modeled an answer to a query as a K-relation in which each tuple is tagged by the similarity value between the tuple and the ideal tuple. (An ideal tuple is a tuple that perfectly fits the requirements of the similarity query.) Prior to any querying, it is assumed that each U-tuple $t$ has desirability either $A(t) = 1$ or $A(t) = 0$, depending on whether it is in or out of $A$.
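Example 3.1 (Common similarity measures).
1. An equality measure $\rho : \tau \times \tau \to \mathbb{B}$. Here, $\mathbb{B}$ is the underlying set of the commutative semiring $K_B = (\mathbb{B}, \vee, \wedge, \text{false}, \text{true})$, called the boolean semiring.
2. A fuzzy equality measure $\rho : \tau \times \tau \to [0, 1]$, where $\rho(x, y)$ expresses the degree of equality of $x$ and $y$; the closer $x$ and $y$ are to each other, the closer $\rho(x, y)$ is to 1. Here, the unit interval $[0, 1]$ is the underlying set of the commutative semiring $K_{[0,1]} = ([0, 1], \max, \min, 0, 1)$, called the fuzzy semiring.
3. A distance measure $\rho : \tau \times \tau \to [0, d_{\max}]$. Here, the closed interval $[0, d_{\max}]$ is the underlying set of the commutative semiring $K_{[0,d_{\max}]} = ([0, d_{\max}], \min, \max, d_{\max}, 0)$, called the distance semiring.

Because of their use, the commutative semirings from this example were called similarity semirings.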
A predefined environment of similarity measures that can be used for building queries is assumed: for every domain $\mathcal{K} = (K, \oplus, \odot, 0, 1)$ and every K-relation over a schema $U = \{a_1 : \tau_1, \ldots, a_n : \tau_n\}$ there are similarity measures $\rho_{a_i} : \tau_i \times \tau_i \to K$, $1 \le i \le n$.

The selection predicate
In the original Green et al. model (Definition 2.2) the selection predicate maps U-tuples to either the zero or the unit element of the semiring. In a similarity context, however, we expect the selection predicate to reflect the relevance or the degree of membership of a particular tuple in the answer relation, not just the two possibilities of full membership (1) or non-membership (0). The following generalization of the original definition was therefore proposed Hajdinjak & Bierman (2011).

Selection:
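If $A : U\text{-Tup} \to K$ and the selection predicate $P : U\text{-Tup} \to K$ maps each U-tuple to an element of $K$ (instead of mapping to either 0 or 1), then $\sigma_P A : U\text{-Tup} \to K$ is defined by
$$(\sigma_P A)(t) = A(t) \odot P(t).$$

Selection queries can now be classified on whether they are based on the attribute values (as is normal in non-similarity queries) or whether they use the similarity measures. Selection queries can also use constant values.

Primitive predicates are defined as follows Hajdinjak & Bierman (2011). Suppose in a schema $U = \{a_1 : \tau_1, \ldots, a_n : \tau_n\}$ the types of attributes $a_i$ and $a_j$ coincide. Then given a commutative semiring $\mathcal{K} = (K, \oplus, \odot, 0, 1)$, for a given binary predicate $\theta$, the primitive predicate $[a_i\,\theta\,a_j] : U\text{-Tup} \to K$ is defined as follows.
$$[a_i\,\theta\,a_j](t) = \begin{cases} 1 & \text{if } t(a_i)\ \theta\ t(a_j), \\ 0 & \text{otherwise.} \end{cases}$$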

Relational difference
Whilst the similarity semirings support a monus operation in the sense of Geerts & Poggi (2010), the induced difference operator in the relational algebra does not behave as desired.

• The fuzzy semiring, $K_{[0,1]} = ([0, 1], \max, \min, 0, 1)$, satisfies the GP conditions, and the monus operator is as follows.
$$x \ominus y = \begin{cases} 0 & \text{if } x \le y, \\ x & \text{if } x > y. \end{cases}$$
This induces the following difference operator in the relational algebra.
$$(A \setminus B)(t) = \begin{cases} 0 & \text{if } A(t) \le B(t), \\ A(t) & \text{otherwise.} \end{cases}$$
Hajdinjak & Bierman (2011) point out that this is not the expected definition. First, fuzzy set difference is universally defined as $\min\{A(t), 1 - B(t)\}$ Rosado et al. (2006). Secondly, in similarity settings only totally irrelevant tuples should be annotated with 0 and excluded as a possible answer Hajdinjak & Mihelič (2006). In the case of the fuzzy set difference $A \setminus B$, these are exclusively those tuples $t$ where $A(t) = 0$ or $B(t) = 1$, and certainly not all those where $A(t) \le B(t)$.

• The distance semiring, $K_{[0,d_{\max}]} = ([0, d_{\max}], \min, \max, d_{\max}, 0)$, satisfies the GP conditions, and the monus operator is as follows.
$$x \ominus y = \begin{cases} d_{\max} & \text{if } x \ge y, \\ x & \text{if } x < y. \end{cases}$$
This induces the following difference operator in the relational algebra.
$$(A \setminus B)(t) = \begin{cases} d_{\max} & \text{if } A(t) \ge B(t), \\ A(t) & \text{otherwise.} \end{cases}$$
Again, in the distance setting, we would expect the difference operator to be defined as
$$(A \setminus B)(t) = \max\{A(t), d_{\max} - B(t)\}.$$
Moreover, this is a continuous function, in contrast to the step-function behaviour of the operator above resulting from the monus definition.
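The step behaviour of the monus on the similarity semirings is easy to see numerically. The sketch below is only an illustration: the function names are hypothetical, `D_MAX = 100` is an assumed maximal distance, and exact fractions are used to avoid floating-point noise.

```python
from fractions import Fraction

D_MAX = 100  # hypothetical maximal distance


def fuzzy_monus(x, y):
    """Monus of the fuzzy semiring: least z with x <= max(y, z)."""
    return 0 if x <= y else x


def fuzzy_expected(x, y):
    """Standard fuzzy difference."""
    return min(x, 1 - y)


def distance_monus(x, y):
    """Monus of the distance semiring, whose natural order is the reversed order."""
    return D_MAX if x >= y else x


def distance_expected(x, y):
    """Difference expected in the distance setting."""
    return max(x, D_MAX - y)


half = Fraction(1, 2)
print(fuzzy_monus(half, half), fuzzy_monus(half, Fraction(49, 100)))  # 0 1/2
print(fuzzy_expected(half, half))                                      # 1/2
print(distance_monus(40, 40), distance_monus(40, 41))                  # 100 40
print(distance_expected(40, 40))                                       # 60
```

A change of one hundredth in the second argument flips the fuzzy monus from 0 to 1/2, whereas the expected difference varies continuously with both arguments.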
Rather than using a monus-like operator, Hajdinjak & Bierman (2011) proposed a different approach using negation.

Definition 3.5 (Negation). Given a set $L$ equipped with a preorder $\le$, a negation is an operation $\neg : L \to L$ that reverses order, $x \le y \implies \neg y \le \neg x$, and is involutive, $\neg\neg x = x$.

Provided that $\mathcal{K} = (K, \oplus, \odot, 0, 1, \neg)$ is a commutative n-semiring, i.e., a commutative semiring equipped with a negation, the difference of K-relations $A, B : U\text{-Tup} \to K$ may be defined by
$$(A \setminus B)(t) = A(t) \odot \neg B(t).$$
Each of the similarity semirings has a negation operation that, in contrast to the monus, gives the expected notion of relational difference.
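Example 3.2 (Relational difference over common similarity measures).
• In the boolean semiring, $K_B = (\mathbb{B}, \vee, \wedge, \text{false}, \text{true})$, negation is boolean complement, so $(A \setminus B)(t) = A(t) \wedge \neg B(t)$. From the above we get exactly the monus-based difference of $K_B$-relations.
• In the fuzzy semiring, $K_{[0,1]} = ([0, 1], \max, \min, 0, 1)$, ordered by relation $\le$, we can define a negation operator as $\neg x \overset{\text{def}}{=} 1 - x$. In the generalized fuzzy semiring $K_{[a,b]} = ([a, b], \max, \min, a, b)$, we can define $\neg x \overset{\text{def}}{=} a + b - x$. In the fuzzy semiring we thus get $A(t) \odot \neg B(t) = \min\{A(t), 1 - B(t)\}$, and in the generalized fuzzy semiring we get $A(t) \odot \neg B(t) = \min\{A(t), a + b - B(t)\}$. These coincide with the fuzzy notions of difference on $[0, 1]$ and $[a, b]$, respectively Rosado et al. (2006).
• In the distance semiring, $K_{[0,d_{\max}]} = ([0, d_{\max}], \min, \max, d_{\max}, 0)$, ordered by relation $\ge$, we can define a negation operator as $\neg x \overset{\text{def}}{=} d_{\max} - x$. We again get the expected notion of difference, $A(t) \odot \neg B(t) = \max\{A(t), d_{\max} - B(t)\}$. This is a continuous function of $A(t)$ and $B(t)$, and it calculates the greatest distance $d_{\max}$ only if $A(t) = d_{\max}$ or $B(t) = 0$.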
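The negation-based difference can be sketched at the level of whole relations. The code below is a hedged illustration, assuming relations are dictionaries from tuple identifiers to tags and that tuples missing from the second relation carry the zero element of the semiring; `n_difference`, `fuzzy_neg`, `distance_neg` and the example data are hypothetical.

```python
from fractions import Fraction


def n_difference(times, neg, zero, A, B):
    """Negation-based difference: tag t with A(t) times neg(B(t))."""
    return {t: times(tag, neg(B.get(t, zero))) for t, tag in A.items()}


# Fuzzy semiring ([0, 1], max, min, 0, 1) with negation 1 - x.
def fuzzy_neg(x):
    return 1 - x


A = {"t1": Fraction(4, 5), "t2": Fraction(1, 2), "t3": Fraction(1, 2)}
B = {"t1": Fraction(1, 4), "t2": Fraction(1)}
fuzzy = n_difference(min, fuzzy_neg, Fraction(0), A, B)

# Distance semiring ([0, d_max], min, max, d_max, 0) with negation d_max - x.
D_MAX = 100  # hypothetical maximal distance


def distance_neg(x):
    return D_MAX - x


distance = n_difference(max, distance_neg, D_MAX, {"t1": 40, "t2": 40}, {"t1": 0})
```

With these data `fuzzy` is `{t1: 3/4, t2: 0, t3: 1/2}`: only `t2`, which belongs fully to `B`, is annotated with 0. In the distance case `t1` receives the greatest distance 100 because `B(t1) = 0`, while `t2`, absent from `B`, keeps its distance 40.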
Moreover, the negation operation gives the same result as the monus when $\mathcal{K}$ is the boolean semiring, $K_B$, the probabilistic semiring, $K_{prob}$, or the semiring on c-tables, $K_{c\text{-}table}$.

Unfortunately, while the provenance semiring, $K_{prov}$, and the semiring of counting numbers, $K_N$, both have a monus, neither has a negation operation. In general, not all m-semirings are n-semirings, and conversely, not all n-semirings are m-semirings Hajdinjak & Bierman (2011).


Relational algebra on L-relations
We have seen that the K-relational algebra does not satisfy the properties of idempotence of union and self-join because, in general, the sum and product operators of a semiring are not idempotent. In order to satisfy all the classical relational identities (including idempotence of union and self-join) and to allow a comparison and ordering of tags, Hajdinjak & Bierman (2011) restricted commutative n-semirings to De Morgan frames, i.e., complete lattices in which finite meets distribute over arbitrary joins, equipped with a negation (with the lattice join defined as sum and the lattice meet as product). Recall that the lattice supremum $\vee$ and infimum $\wedge$ operators are always idempotent.

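Proposition 3.1 (De Morgan laws Salii). Given a De Morgan frame $\mathcal{L} = (L, \vee, \wedge, 0, 1, \neg)$, the following laws hold.
$$\neg 0 = 1, \qquad \neg 1 = 0, \qquad \neg(x \vee y) = \neg x \wedge \neg y, \qquad \neg(x \wedge y) = \neg x \vee \neg y.$$

The similarity semirings from Example 3.1 are De Morgan frames; the same holds for the probabilistic semiring and the semiring on c-tables.

Definition 3.9 (Relational algebra on L-relations Hajdinjak & Bierman (2011)). Let $\mathcal{L} = (L, \vee, \wedge, 0, 1, \neg)$ be a De Morgan frame. The operations of the relational algebra on L-relations, $RA_L$, are defined as follows.

Empty relation: For any set of attributes $U$ there is $\emptyset_U : U\text{-Tup} \to L$ such that $\emptyset_U(t) = 0$ for all U-tuples $t$.

Union: If $A, B : U\text{-Tup} \to L$, then $A \cup B : U\text{-Tup} \to L$ is defined by
$$(A \cup B)(t) = A(t) \vee B(t).$$

Projection: If $A : U\text{-Tup} \to L$ and $V \subset U$, the projection of $A$ on attributes $V$ is defined by
$$(\pi_V A)(t) = \bigvee_{t' \downarrow V = t} A(t').$$

Selection: If $A : U\text{-Tup} \to L$ and the selection predicate $P : U\text{-Tup} \to L$ maps each U-tuple to an element of $L$, then $\sigma_P A : U\text{-Tup} \to L$ is defined by
$$(\sigma_P A)(t) = A(t) \wedge P(t).$$

Join: If $A : U_1\text{-Tup} \to L$ and $B : U_2\text{-Tup} \to L$, then $A \bowtie B$ is the L-relation over $U_1 \cup U_2$ defined by
$$(A \bowtie B)(t) = A(t \downarrow U_1) \wedge B(t \downarrow U_2).$$

Renaming: If $A : U\text{-Tup} \to L$ and $\beta : U \to U'$ is a bijection, then $\rho_\beta A : U'\text{-Tup} \to L$ is defined by
$$(\rho_\beta A)(t) = A(t \circ \beta).$$

Unlike for K-relations, we need not require that L-relations have finite support, since De Morgan frames are complete lattices, which guarantees the existence of the join in the definition of projection.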
It is important to note that since $RA_L$ satisfies all the main positive relational algebra identities, in terms of query optimization, all algebraic rewrites familiar from the classical (positive) relational algebra apply to $RA_L$ without restriction. Matters are a little different for the negative identities Hajdinjak & Bierman (2011). In fuzzy relations Rosado et al. (2006) many of the familiar laws concerning difference do not hold. For example, $A \setminus A = \emptyset$ fails for fuzzy relations, and so it fails in general for the L-relational algebra. Consequently, some (negative) identities from the classical relational algebra no longer hold.
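Both observations can be checked on a tiny fuzzy example. The sketch below assumes the fuzzy De Morgan frame $([0, 1], \max, \min, 0, 1, 1 - x)$ and stores an L-relation as a dictionary from tuple identifiers to tags; the function names and the data are hypothetical.

```python
from fractions import Fraction

ZERO = Fraction(0)


def l_union(A, B):
    """Union of L-relations: pointwise join (max)."""
    return {t: max(A.get(t, ZERO), B.get(t, ZERO)) for t in A.keys() | B.keys()}


def l_join_same_schema(A, B):
    """Join of two L-relations over the same schema: pointwise meet (min)."""
    return {t: min(A.get(t, ZERO), B.get(t, ZERO)) for t in A.keys() | B.keys()}


def l_difference(A, B):
    """Negation-based difference: min(A(t), 1 - B(t))."""
    return {t: min(tag, 1 - B.get(t, ZERO)) for t, tag in A.items()}


A = {"t1": Fraction(1, 2), "t2": Fraction(1)}
self_union = l_union(A, A)
self_join = l_join_same_schema(A, A)
self_difference = l_difference(A, A)
```

Union and self-join return `A` unchanged, as idempotence requires, but `self_difference` still tags `t1` with 1/2, so `A \ A` is not the empty relation.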

The D-relation model
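Notice that all tuples across all the K-relations or the L-relations in the database and intermediate relations in queries must be annotated with a value from the same commutative semiring $\mathcal{K}$ or De Morgan frame $\mathcal{L}$. To support simultaneously several different similarity measures (e.g., similarity of strings, driving distance between cities, likelihood of objects to be equal), and to use these different measures in queries (even within the same query), Hajdinjak & Bierman (2011) proposed to move from a tuple-annotated model to an attribute-annotated model. They associated every attribute with its own De Morgan frame. They generalized an L-relation, which is a map from a tuple to an annotation value from a De Morgan frame, to a D-relation, which is a map from a tuple to a corresponding tuple containing an annotation value for every element in the source tuple, referred to as a De Morgan frame tuple.

Definition 4.1 (Hajdinjak & Bierman (2011)).
• A De Morgan frame schema, $\mathcal{D} = \{a_1 : \mathcal{L}_1, \ldots, a_n : \mathcal{L}_n\}$, maps an attribute name, $a_i$, to a De Morgan frame, $\mathcal{L}_i = (L_{a_i}, \vee_{a_i}, \wedge_{a_i}, 0_{a_i}, 1_{a_i}, \neg_{a_i})$.
• A De Morgan frame tuple, $s = \{a_1 : l_1, \ldots, a_n : l_n\}$, maps an attribute name, $a_i$, to an element $l_i$ of the De Morgan frame $\mathcal{L}_i$. The set of all De Morgan frame tuples matching $\mathcal{D}$ over $U$ is denoted $\mathcal{D}(U)\text{-Tup}$.
• A D-relation over $U$ is a map from $U\text{-Tup}$ to $\mathcal{D}(U)\text{-Tup}$.

Definition 4.2 (Hajdinjak & Bierman (2011)). The operations of the relational algebra with similarities, $RA_D$, are defined as follows. Throughout, $\mathcal{D}(a) = (L_a, \vee_a, \wedge_a, 0_a, 1_a, \neg_a)$.

Empty relation: For any set of attributes $U$ and corresponding De Morgan frame schema, $\mathcal{D}$, the empty D-relation over $U$, $\emptyset_U$, is defined such that $\emptyset_U(t)(a) = 0_a$, where $t$ is a U-tuple.

Union: If $A, B : U\text{-Tup} \to \mathcal{D}(U)\text{-Tup}$, then $A \cup B : U\text{-Tup} \to \mathcal{D}(U)\text{-Tup}$ is defined by
$$(A \cup B)(t)(a) = A(t)(a) \vee_a B(t)(a).$$

Projection: If $A : U\text{-Tup} \to \mathcal{D}(U)\text{-Tup}$ and $V \subset U$, the projection of $A$ on attributes $V$ is defined by
$$(\pi_V A)(t)(a) = \bigvee\nolimits_a \{A(t')(a) ;\ t' \downarrow V = t\}.$$

Selection: If $A : U\text{-Tup} \to \mathcal{D}(U)\text{-Tup}$ and the selection predicate $P : U\text{-Tup} \to \mathcal{D}(U)\text{-Tup}$ maps each U-tuple to an element of $\mathcal{D}(U)\text{-Tup}$, then $\sigma_P A : U\text{-Tup} \to \mathcal{D}(U)\text{-Tup}$ is defined by
$$(\sigma_P A)(t)(a) = A(t)(a) \wedge_a P(t)(a).$$

Join: Let $\mathcal{D}_1 = \{a_1 : \mathcal{L}_1, \ldots, a_n : \mathcal{L}_n\}$ and $\mathcal{D}_2 = \{b_1 : \mathcal{L}'_1, \ldots, b_m : \mathcal{L}'_m\}$ be De Morgan frame schemata. Let their union, $\mathcal{D}_1 \cup \mathcal{D}_2$, contain an attribute, $c_i : \mathcal{L}_i$, as soon as $c_i : \mathcal{L}_i$ is in $\mathcal{D}_1$ or $\mathcal{D}_2$ or both. (If there is an attribute with different corresponding De Morgan frames in $\mathcal{D}_1$ and $\mathcal{D}_2$, a renaming of attributes is needed.) If $A : U_1\text{-Tup} \to \mathcal{D}_1(U_1)\text{-Tup}$ and $B : U_2\text{-Tup} \to \mathcal{D}_2(U_2)\text{-Tup}$, then $A \bowtie B$ is the $(\mathcal{D}_1 \cup \mathcal{D}_2)$-relation over $U_1 \cup U_2$ defined as follows.
$$(A \bowtie B)(t)(a) = \begin{cases} A(t \downarrow U_1)(a) & \text{if } a \in U_1 \setminus U_2, \\ B(t \downarrow U_2)(a) & \text{if } a \in U_2 \setminus U_1, \\ A(t \downarrow U_1)(a) \wedge_a B(t \downarrow U_2)(a) & \text{otherwise.} \end{cases}$$

Difference: If $A, B : U\text{-Tup} \to \mathcal{D}(U)\text{-Tup}$, then $A \setminus B : U\text{-Tup} \to \mathcal{D}(U)\text{-Tup}$ is defined by
$$(A \setminus B)(t)(a) = A(t)(a) \wedge_a \neg_a B(t)(a).$$

Renaming: If $A : U\text{-Tup} \to \mathcal{D}(U)\text{-Tup}$ and $\beta : U \to U'$ is a bijection, then $\rho_\beta A$ is the D-relation over $U'$ obtained by renaming the attributes of $A$ according to $\beta$.

As in the case of L-relations, it is required that every tuple outside of a similarity database is ranked with the minimal De Morgan frame tuple, $\{a_1 : 0_1, \ldots, a_n : 0_n\}$, and every other tuple is ranked either with the maximal De Morgan frame tuple, $\{a_1 : 1_1, \ldots, a_n : 1_n\}$, or a smaller De Morgan frame tuple expressing a lower degree of containment of the tuple in the database.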

Proposition 4.1 (Identities of D-relations Hajdinjak & Bierman (2011)). The following identities hold for the relational algebra on D-relations:
• union is associative, commutative, idempotent, and has identity $\emptyset$;
• selection distributes over union and difference;
• join is associative and commutative, and distributes over union;
• projection distributes over union and join;
• selections and projections commute with each other;
• difference has identity $\emptyset$ and distributes over union and intersection;
• selection with boolean predicates gives all or nothing, $\sigma_{false}(A) = \emptyset$ and $\sigma_{true}(A) = A$, where $false(t)(a) = 0_a$ and $true(t)(a) = 1_a$ for $\mathcal{D}(a) = (L_a, \vee_a, \wedge_a, 0_a, 1_a, \neg_a)$;
• join with an empty relation gives an empty relation, $A \bowtie \emptyset_U = \emptyset_U$ where $A$ is a D-relation over a schema $U$;
• projection of an empty relation gives an empty relation, $\pi_V(\emptyset) = \emptyset$.
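A few of these identities can be checked on a toy attribute-annotated relation. The sketch below is illustrative only: it assumes a hypothetical two-attribute schema in which `name` is annotated in the fuzzy frame and `route` in the distance frame, represents a D-relation as a dictionary from tuples to annotation dictionaries, and implements union and difference component-wise. All names (`Frame`, `SCHEMA`, `BOTTOM`, `d_union`, `d_difference`) are assumptions.

```python
from collections import namedtuple
from fractions import Fraction

# De Morgan frame (L, join, meet, 0, 1, neg); illustrative representation only.
Frame = namedtuple("Frame", "join meet zero one neg")

FUZZY = Frame(max, min, Fraction(0), Fraction(1), lambda x: 1 - x)
D_MAX = Fraction(100)  # hypothetical maximal distance
# The distance frame is ordered by >=, so its join is the numeric minimum.
DISTANCE = Frame(min, max, D_MAX, Fraction(0), lambda x: D_MAX - x)

# Hypothetical De Morgan frame schema: one frame per attribute.
SCHEMA = {"name": FUZZY, "route": DISTANCE}
BOTTOM = {a: f.zero for a, f in SCHEMA.items()}  # annotation of absent tuples


def d_union(A, B):
    """Component-wise join of the annotation tuples."""
    out = {}
    for t in A.keys() | B.keys():
        x, y = A.get(t, BOTTOM), B.get(t, BOTTOM)
        out[t] = {a: f.join(x[a], y[a]) for a, f in SCHEMA.items()}
    return out


def d_difference(A, B):
    """Component-wise meet with the negated annotation of B."""
    out = {}
    for t, x in A.items():
        y = B.get(t, BOTTOM)
        out[t] = {a: f.meet(x[a], f.neg(y[a])) for a, f in SCHEMA.items()}
    return out


t = ("Ann", "Oslo")
A = {t: {"name": Fraction(1, 2), "route": Fraction(30)}}
B = {t: {"name": Fraction(3, 4), "route": Fraction(10)}}
u = d_union(A, B)
d = d_difference(A, B)
```

With these values `u[t]` is `{name: 3/4, route: 10}` and `d[t]` is `{name: 1/4, route: 90}`; `d_union(A, A)` returns `A` (idempotence) and `d_difference(A, {})` returns `A` (the empty relation is an identity for difference).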
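Each of the similarity measures associated with the attributes maps to its own De Morgan frame. Again, a predefined environment of similarity measures that can be used for building queries is assumed: for every D-relation over $U$, where $\mathcal{D} = \{a_1 : \mathcal{L}_1, \ldots, a_n : \mathcal{L}_n\}$, $\mathcal{L}_i = (L_i, \vee_i, \wedge_i, 0_i, 1_i, \neg_i)$ and $U = \{a_1 : \tau_1, \ldots, a_n : \tau_n\}$, there is a similarity measure $\rho_{a_i} : \tau_i \times \tau_i \to L_i$ for every $1 \le i \le n$.

In the D-relation model, primitive and similarity predicates need to be redefined.

Definition 4.3 (Primitive predicates Hajdinjak & Bierman (2011)). Suppose in a schema $U = \{a_1 : \tau_1, \ldots, a_n : \tau_n\}$ the types of attributes $a_i$ and $a_j$ coincide. Then for a given binary predicate $\theta$, the primitive predicate $[a_i\,\theta\,a_j] : U\text{-Tup} \to \mathcal{D}(U)\text{-Tup}$ is defined as follows.
$$[a_i\,\theta\,a_j](t)(a_k) = \begin{cases} \chi_{a_i \theta a_j}(t) & \text{if } k = i \text{ or } k = j, \\ 1_k & \text{otherwise.} \end{cases}$$
In words, $[a_i\,\theta\,a_j]$ has value 1 in every attribute except $a_i$ and $a_j$, where it behaves as the characteristic map of $\theta$ defined as follows.
$$\chi_{a_i \theta a_j}(t) = \begin{cases} 1_k & \text{if } t(a_i)\ \theta\ t(a_j), \\ 0_k & \text{otherwise.} \end{cases}$$

Similarity predicates annotate tuples based on the similarity measures Hajdinjak & Bierman (2011). Suppose in a schema $U = \{a_1 : \tau_1, \ldots, a_n : \tau_n\}$ the types of attributes $a_i$ and $a_j$ coincide. The similarity predicate $[a_i \text{ like } a_j] : U\text{-Tup} \to \mathcal{D}(U)\text{-Tup}$ is defined as follows.
$$[a_i \text{ like } a_j](t)(a_k) = \begin{cases} \rho_{a_k}(t(a_i), t(a_j)) & \text{if } k = i \text{ or } k = j, \\ 1_k & \text{otherwise.} \end{cases}$$
In words, $[a_i \text{ like } a_j]$ measures similarity of attributes $a_i$ and $a_j$, each with its own similarity measure. The symmetric version is defined as follows.
$$[a_i \sim a_j] = [a_i \text{ like } a_j] \cap [a_j \text{ like } a_i].$$
Now union and intersection of selection predicates are computed component-wise.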
Given the similarity measures associated with attributes, it is possible to define similarity-based variants of other familiar relational operators, such as similarity-based joins Hajdinjak & Bierman (2011). Such an operator joins two rows not only when their join-attributes have equal associated values, but when the values are similar.
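As a hedged illustration of how a similarity predicate might drive a selection, the sketch below follows the verbal description of `[a_i like a_j]`: the annotation is 1 in every attribute except the two compared ones, where the attribute's own similarity measure is applied. It assumes that all attributes are annotated in the fuzzy frame, and the measure `rho_year`, the helper names and the data are hypothetical.

```python
from fractions import Fraction

ONE = Fraction(1)


def rho_year(x, y):
    """Hypothetical fuzzy similarity of two years: 1 if equal, 0 if 10 or more years apart."""
    return max(Fraction(0), 1 - Fraction(abs(x - y), 10))


def like(t, ai, aj, measures):
    """Annotation tuple of [ai like aj] on the tuple t (a dict attribute -> value)."""
    ann = {a: ONE for a in t}
    ann[ai] = measures[ai](t[ai], t[aj])
    ann[aj] = measures[aj](t[ai], t[aj])
    return ann


def select(rows, pred):
    """Component-wise meet (here: min) of each row's annotation with pred's annotation."""
    out = []
    for t, ann in rows:
        p = pred(t)
        out.append((t, {a: min(ann[a], p[a]) for a in ann}))
    return out


measures = {"founded": rho_year, "listed": rho_year}
full = {"firm": ONE, "founded": ONE, "listed": ONE}
rows = [
    ({"firm": "Acme", "founded": 1990, "listed": 1995}, dict(full)),
    ({"firm": "Bolt", "founded": 1980, "listed": 2000}, dict(full)),
]
result = select(rows, lambda t: like(t, "founded", "listed", measures))
```

The first row ends up annotated with 1/2 in `founded` and `listed`, since the two years are five apart, while the second row drops to 0 in both; the `firm` attribute keeps annotation 1 in both rows. A similarity-based join could be sketched along the same lines by applying the measure to the join attributes of each pair of rows.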

A common framework
In this section we explore whether there is a common domain of annotations suitable for all kinds of annotated relations, and we define a general model of K-, L- and D-relations.

A common annotation domain
We have recalled two notions of difference on annotated relations: the monus-based difference proposed by Geerts & Poggi (2010) and the negation-based difference proposed by Hajdinjak & Bierman (2011). We have seen in §3.3 that the monus-based difference does not have the qualities expected in a fuzzy context. The negation-based difference, on the other hand, does agree with the standard fuzzy difference, but it is not defined for bag semantics (and provenance). More precisely, the semiring of counting numbers, $K_N = (\mathbb{N}, +, \cdot, 0, 1)$, cannot be extended with a negation operation. (The same holds for the provenance semiring.) We could try to modify the semiring of counting numbers in such a way that negation can be defined. For instance, if we replace $\mathbb{N}$ by $\mathbb{Z}$, we get the ring of integers, $(\mathbb{Z}, +, \cdot, 0, 1)$, where negation can be defined as $\neg x \overset{\text{def}}{=} -x$. This implies $(A \setminus B)(t) = -A(t) \cdot B(t)$, which is not equal to the standard difference of relations annotated with the tuples' multiplicities Montagna & Sebastiani (2001). Some other modifications would give the so-called tropical semirings Aceto et al. (2001), whose underlying carrier set is some subset of the set of real numbers $\mathbb{R}$ equipped with binary operations of minimum or maximum as sum, and addition as product.
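A small worked example, with multiplicities chosen only for illustration, makes the mismatch concrete. Take a tuple $t$ with $A(t) = 3$ and $B(t) = 1$.

```latex
\begin{align*}
\text{negation-based, } \neg x = -x: \quad
  & (A \setminus B)(t) = A(t) \cdot \neg B(t) = 3 \cdot (-1) = -3, \\
\text{bag difference (truncated minus):} \quad
  & (A \setminus B)(t) = \max\{0,\ 3 - 1\} = 2.
\end{align*}
```

The negation-based value is not even a multiplicity, whereas the truncated minus removes one of the three copies as expected.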
Let us now study the properties of the annotation structures of both approaches.
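Proposition 5.1 (Identities in an m-semiring Bosbach (1965)). The notion of an m-semiring is characterized by the properties of commutative semirings and the following identities involving $\ominus$.
$$x \oplus (y \ominus x) = y \oplus (x \ominus y), \qquad (x \ominus y) \ominus z = x \ominus (y \oplus z), \qquad x \ominus x = 0, \qquad 0 \ominus x = 0.$$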

Notice that even in a De Morgan frame a difference-like operation may be defined, $x \div y \overset{\text{def}}{=} x \wedge \neg y$. Clearly, negation is then expressed as $\neg x = 1 \div x$.
Proposition 5.2 (Identities in a De Morgan frame). In a De Morgan frame the following identities involving ÷ hold.
$$x \div (1 \div y) = x \wedge y.$$

Proof. The first four identities are exactly the De Morgan laws from Proposition 3.1. The rest holds by simple expansion of definitions and/or is implied by the De Morgan laws.
Notice the differences between the properties of the monus-based difference ⊖ in an m-semiring and the properties of the negation-based difference ÷ in a De Morgan frame. For instance, in a De Morgan frame we do not have x ÷ x = 0 in general.
However, since neither of the proposed notions of difference gives the expected result for all kinds of annotated relations, an annotation structure different from m-semirings and De Morgan frames is needed. Observe that by its definition, a complete (even bounded) distributive lattice, $\mathcal{L} = (L, \vee, \wedge, 0, 1)$, is a commutative semiring with the natural order being the lattice order, $a \oplus b = a \vee b$ and $a \odot b = a \wedge b$ for every $a, b$ in $L$. Because lattice completeness assures the existence of a smallest element in every set and hence the existence of the monus (see Definition 2.3 on GP-conditions), a complete distributive lattice is an m-semiring. On the other hand, if a commutative semiring, $\mathcal{K} = (K, \oplus, \odot, 0, 1)$, is partially ordered by $\preceq$ and any two elements from $K$ have an infimum and a supremum, it is a lattice, not necessarily bounded Davey & Priestley (1990). The lattice meet and join are then determined by the partial order $\preceq$, and they are, in general, different from $\oplus$ and $\odot$. Since $0 \oplus a = a$, we have $0 \preceq a$ for any $a \in K$, and 0 is the least element of the lattice. In general, a similar observation does not hold for 1, which is hence not necessarily the greatest element of the lattice.

The underlying carrier sets of all the semirings considered are partially ordered sets, even distributive lattices. The unbounded lattices among them (i.e., $K_N$ and $K_{prov}$) can be converted into bounded (even complete) lattices by adding a greatest element. To achieve this we just need to replace $\mathbb{N}$ by $\mathbb{N} \cup \{\infty\}$ and define appropriate calculation rules for $\infty$.

To summarize, a complete distributive lattice is an m-semiring. If the lattice also has a negation, we have two difference-like operations: the monus $\ominus$ and the operation $\div$ induced by negation. There is a class of annotated relations for which only one of them ($\ominus$ for bag semantics and provenance, $\div$ for fuzzy semantics) gives the standard notion of relational difference, and there is a class of annotated relations for which they coincide (e.g., classical set semantics, probabilistic relations, and relations on c-tables).
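The two difference-like operations can be compared directly on small lattices. The sketch below is an illustration under the assumption that the lattice is a chain, so that join is `max`, meet is `min`, and the monus has the closed form used earlier for the fuzzy semiring; the names `monus`, `div`, `GRID` and `BOOLS` are hypothetical, and the two-element chain {0, 1} stands in for the boolean semiring.

```python
from fractions import Fraction
from itertools import product


def monus(x, y):
    """Monus in a chain with join = max: the least z such that x <= max(y, z)."""
    return 0 if x <= y else x


def div(x, y, neg):
    """Negation-induced difference: x meet (not y), with meet = min."""
    return min(x, neg(y))


def complement(x):
    """Negation 1 - x, used for both the two-element chain and the fuzzy chain."""
    return 1 - x


BOOLS = [0, 1]
GRID = [Fraction(i, 4) for i in range(5)]

coincide_on_bools = all(monus(x, y) == div(x, y, complement)
                        for x, y in product(BOOLS, repeat=2))
differ_on_fuzzy = [(x, y) for x, y in product(GRID, repeat=2)
                   if monus(x, y) != div(x, y, complement)]
```

On the two-element chain the operations agree everywhere, whereas on the fuzzy grid they differ, for example at (1/2, 1/2), where the monus gives 0 and the negation-based operation gives 1/2; in particular $x \div x$ need not be 0.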
Proposition 5.3 (General annotation structure). Complete distributive lattices with finite meets distributing over arbitrary joins are suitable codomains for all considered annotated relations.
Proof. The boolean semiring, the probabilistic semiring, the semiring on c-tables, the similarity semirings as well as the semiring of counting numbers and the provenance semiring (see Lemma 5.1) can all be extended to a complete distributive lattice in which finite meets distribute over arbitrary joins. The latter property allows us to model infinite relations satisfying all the desired relational identities from Proposition 4.1, including commuting selections and projections. Relational difference may be modeled with the existing monus, $\ominus$, or with $\div$ if the lattice is a De Morgan frame where a negation exists. The other (positive) relational operations are modeled using lattice meet, $\wedge$, and join, $\vee$, or semiring sum, $\oplus$, and product, $\odot$.

A common model
Recall that Green et al. (2007) defined a K-relation over $U = \{a_1 : \tau_1, \ldots, a_n : \tau_n\}$ as a function $A : U\text{-Tup} \to K$ with finite support. The finite-support requirement was made to ensure the existence of the sum in the definition of relational projection. When the commutative semiring $\mathcal{K} = (K, \oplus, \odot, 0, 1)$ was replaced by a De Morgan frame, $\mathcal{L} = (L, \vee, \wedge, 0, 1, \neg)$, the finite-support requirement became unnecessary; the existence of the join in the definition of projection was guaranteed by the completeness of the codomain.

To model similarity relations more efficiently, Hajdinjak & Bierman (2011) introduced a D-relation over $U$ as a function from $U\text{-Tup}$ to $\mathcal{D}(U)\text{-Tup}$ assigning every element of $U\text{-Tup}$ (row of a table) a tuple of different annotation values. We adapt Definition 4.1 to the proposed general annotation structure, and show that a tuple-annotated model may be injectively mapped to an attribute-annotated model.
•A n annotation schema, C = {a 1 : L 1 , ..., a n : L n }, over U = {a 1 : τ 1 , ..., a n : τ n } maps an attribute name, a i , to a complete distributive lattice in which finite meets distribute over arbitrary joins, L i =(L a i , a i , ∧ a i , 0 a i , 1 a i ). •A n annotation tuple,s = {a 1 : l 1 , ..., a n : l n }, maps an attribute name, a i , to an element of a complete distributive lattice in which finite meets distribute over arbitrary joins, l i . The set of all annotation tuples matching C over U is denoted C(U)-Tup. •A n C-relation over U is a finite map from U-Tup to C(U)-Tup.
Proposition 5.4 (Injection of a tuple-annotated model to an attribute-annotated model). Let $\mathcal{A}$ be the class of all functions $A : U\text{-Tup} \to L$ where U is any relational schema and $L = (L, \vee, \wedge, 0, 1)$ is any complete distributive lattice with finite meets distributing over arbitrary joins. Let $\mathcal{B}$ be the class of all C-relations over U, $B : U\text{-Tup} \to C(U)\text{-Tup}$, where C is an annotation schema. There is an injective function $F : \mathcal{A} \to \mathcal{B}$ defined by

$$F(A)(t)(a_i) = A(t)$$

for all attributes $a_i$ in U and tuples $t \in U\text{-Tup}$.
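The idea behind the injection can be illustrated with a minimal Python sketch. It assumes that the tuple-level annotation is copied to every attribute, and it uses the unit interval as the lattice; the names `make_tuple`, `inject`, and `recover`, as well as the sample data, are hypothetical and serve only the illustration.

```python
# Hypothetical illustration of a tuple-annotated relation being embedded
# into an attribute-annotated one. Annotations are taken from [0, 1].

def make_tuple(**attrs):
    """Freeze a row as sorted (attribute, value) pairs so it can be a dict key."""
    return tuple(sorted(attrs.items()))

def inject(relation, attributes):
    """Assumed form of F: copy the single tuple annotation to each attribute."""
    return {t: {a: ann for a in attributes} for t, ann in relation.items()}

def recover(c_relation, attributes):
    """Left inverse of inject: read the annotation back from any one attribute."""
    return {t: anns[attributes[0]] for t, anns in c_relation.items()}

attributes = ["name", "city"]
A = {
    make_tuple(name="Ann", city="Oslo"): 0.8,
    make_tuple(name="Bob", city="Rome"): 0.3,
}

FA = inject(A, attributes)
print(FA[make_tuple(name="Ann", city="Oslo")])
print(recover(FA, attributes) == A)
```

Because `recover` undoes `inject`, two different tuple-annotated relations cannot be mapped to the same attribute-annotated relation, which is the injectivity the proposition asserts.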
Proposition 5.4 says that moving from tuple-annotated relations to attribute-annotated relations does not prevent us from correctly modeling the examples covered by the K-relation model, in which each tuple is annotated with a single value from K. The annotation value simply appears several times, once for each attribute. We thus propose a model of C-relations, a common model of K-, L-, and D-relations, that is attribute annotated. The definitions of union, projection, selection, and join of C-relations may be based on the lattice join and meet operations (as in Definitions 3.9 and 4.2) or, if there exist semiring sum and product operations different from lattice join and meet, the positive relational operations may be defined using these additional semiring operations (as in Definition 2.2). The definition of relational difference may be based on the monus or, when dealing with De Morgan frames where a negation exists, the derived ÷ operation.
Definition 5.2 (Relational algebra on C-relations). Consider C-relations where all the lattices $L_i = (L_{a_i}, \vee_{a_i}, \wedge_{a_i}, 0_{a_i}, 1_{a_i})$ from the annotation schema $C = \{a_1 : L_1, \ldots, a_n : L_n\}$ are complete distributive lattices in which finite meets distribute over arbitrary joins. Let $\bigtriangledown_{a_i}$ and $\bigtriangleup_{a_i}$ stand either for the lattice operations $\vee_{a_i}$ and $\wedge_{a_i}$ or for some other semiring operations $\oplus_{a_i}$ and $\odot_{a_i}$ defined on the carrier set $L_{a_i}$ of $L_i$, respectively. Let $-_{a_i}$ stand for either the monus $\ominus_{a_i}$ or a $\div_{a_i}$ operation defined on $L_{a_i}$. The operations of the relational algebra on C, denoted $RA_C$, are defined as follows.
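Empty relation: For any set of attributes U and corresponding annotation schema, C, the empty C-relation over U, $\emptyset_U$, is defined by

$$\emptyset_U(t)(a) = 0_a.$$

Projection: If $A : U\text{-Tup} \to C(U)\text{-Tup}$ and $V \subset U$, the projection of A on attributes V is defined by

$$(\pi_V A)(t)(a) = \bigtriangledown_a \{\, A(t')(a) \mid t' \in U\text{-Tup and } t' \text{ agrees with } t \text{ on } V \,\}.$$

Selection: If $A : U\text{-Tup} \to C(U)\text{-Tup}$ and the selection predicate $P : U\text{-Tup} \to C(U)\text{-Tup}$ maps each U-tuple to an element of C(U)-Tup, then $\sigma_P A : U\text{-Tup} \to C(U)\text{-Tup}$ is defined by

$$(\sigma_P A)(t)(a) = A(t)(a) \bigtriangleup_a P(t)(a).$$

Join: Let $C_1 = \{a_1 : L_1, \ldots, a_n : L_n\}$ and $C_2 = \{b_1 : L'_1, \ldots, b_m : L'_m\}$ be annotation schemata. If $A : U_1\text{-Tup} \to C_1(U_1)\text{-Tup}$ and $B : U_2\text{-Tup} \to C_2(U_2)\text{-Tup}$, then $A \bowtie B$ is the $(C_1 \cup C_2)$-relation over $U_1 \cup U_2$ defined as follows, where $t_1$ and $t_2$ denote the restrictions of t to $U_1$ and $U_2$:

$$(A \bowtie B)(t)(a) = \begin{cases} A(t_1)(a) & \text{if } a \in U_1 \setminus U_2, \\ B(t_2)(a) & \text{if } a \in U_2 \setminus U_1, \\ A(t_1)(a) \bigtriangleup_a B(t_2)(a) & \text{otherwise.} \end{cases}$$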
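To make the attribute-wise character of these operations concrete, here is a minimal Python sketch. It rests on several assumptions that are not taken from the text: every lattice is the unit interval with max as join and min as meet, a C-relation is stored as a dictionary whose missing tuples count as annotated with 0 on every attribute, and the operations act pointwise per attribute. The function names and sample data are hypothetical.

```python
# Sketch only: every lattice is assumed to be ([0, 1], max, min, 0, 1).
# A C-relation is a dict: frozen tuple -> {attribute: annotation}.

def freeze(row):
    """Turn {attribute: value} into a hashable, order-independent key."""
    return tuple(sorted(row.items()))

def select(rel, pred):
    """Selection: meet (min) each annotation with the predicate's annotation."""
    return {t: {a: min(x, pred(dict(t))[a]) for a, x in ann.items()}
            for t, ann in rel.items()}

def project(rel, attrs):
    """Projection: join (max) the annotations of all tuples that agree on attrs."""
    out = {}
    for t, ann in rel.items():
        key = freeze({a: v for a, v in t if a in attrs})
        acc = out.setdefault(key, {a: 0.0 for a in attrs})
        for a in attrs:
            acc[a] = max(acc[a], ann[a])
    return out

def join(r1, r2):
    """Natural join: shared attributes are combined with meet (min); the
    other attributes keep the annotation of the relation they come from."""
    out = {}
    for t1, ann1 in r1.items():
        for t2, ann2 in r2.items():
            row1, row2 = dict(t1), dict(t2)
            shared = row1.keys() & row2.keys()
            if any(row1[a] != row2[a] for a in shared):
                continue  # the tuples disagree on a shared attribute
            ann = {**ann1, **ann2}
            for a in shared:
                ann[a] = min(ann1[a], ann2[a])
            out[freeze({**row1, **row2})] = ann
    return out

R = {
    freeze({"name": "Ann", "city": "Oslo"}): {"name": 1.0, "city": 0.9},
    freeze({"name": "Ann", "city": "Rome"}): {"name": 0.6, "city": 0.4},
}
S = {freeze({"city": "Oslo", "country": "Norway"}): {"city": 0.5, "country": 1.0}}

def near_oslo(row):
    """Hypothetical similarity predicate on the city attribute."""
    return {"name": 1.0, "city": 1.0 if row["city"] == "Oslo" else 0.2}

names = project(R, ["name"])
joined = join(R, S)
selected = select(R, near_oslo)
```

Choosing max and min keeps the sketch within the lattice-based variant of the algebra; swapping in other semiring operations would only change the two functions used to combine annotations.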
The relational algebra $RA_C$ still satisfies all the main positive relational algebra identities.
Proposition 5.5 (Identities of C-relations). The following identities hold for the relational algebra on C-relations:
• union is associative, commutative, idempotent, and has identity $\emptyset$;
• selection distributes over union and difference;
• join is associative and commutative, and distributes over union;
• projection distributes over union and join;
• selections and projections commute with each other;
• difference has identity $\emptyset$ and distributes over union and intersection;
• selection with boolean predicates gives all or nothing, $\sigma_{\text{false}}(A) = \emptyset$ and $\sigma_{\text{true}}(A) = A$, where $\text{false}(t)(a) = 0_a$ and $\text{true}(t)(a) = 1_a$ for $C(a) = (L_a, \vee_a, \wedge_a, 0_a, 1_a)$;
• join with an empty relation gives an empty relation, $A \bowtie \emptyset_U = \emptyset_U$ where A is a C-relation over a schema U;
• projection of an empty relation gives an empty relation, $\pi_V(\emptyset) = \emptyset$.

Proof. If the lattice join and meet are chosen to model the positive relational operations, the above identities are implied by Proposition 4.1. On the other hand, if some other semiring sum and product operations are chosen, the identities are implied by Proposition 2.1.
The properties of relational difference are implied by the identities involving ⊖ (see Proposition 5.1) and/or the identities involving ÷ (see Proposition 5.2), depending on which of the two operations is chosen to model difference.
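A few of these identities can be spot-checked on a small example. The Python sketch below is not a proof; it assumes the unit interval with max and min on every attribute, pointwise union and selection, and the convention that a tuple missing from a dictionary is annotated with 0 everywhere. All names and data are hypothetical.

```python
# Spot check of some identities under the assumptions stated above.
ATTRS = ("name", "city")
ZERO = {a: 0.0 for a in ATTRS}

def union(r, s):
    """Pointwise join (max) of the annotations; absent tuples count as 0."""
    return {t: {a: max(r.get(t, ZERO)[a], s.get(t, ZERO)[a]) for a in ATTRS}
            for t in r.keys() | s.keys()}

def select(r, pred):
    """Pointwise meet (min) of the annotation with the predicate."""
    return {t: {a: min(ann[a], pred(t)[a]) for a in ATTRS} for t, ann in r.items()}

def support(r):
    """Drop tuples annotated with 0 on every attribute; they are 'absent'."""
    return {t: ann for t, ann in r.items() if any(ann[a] != 0.0 for a in ATTRS)}

def P(t):
    return {"name": 0.6, "city": 1.0 if t[1] == "Oslo" else 0.3}

def true(t):
    return {a: 1.0 for a in ATTRS}

def false(t):
    return {a: 0.0 for a in ATTRS}

A = {("Ann", "Oslo"): {"name": 1.0, "city": 0.9},
     ("Bob", "Rome"): {"name": 0.5, "city": 0.4}}
B = {("Ann", "Oslo"): {"name": 0.7, "city": 1.0},
     ("Eve", "Bern"): {"name": 0.2, "city": 0.8}}

checks = {
    "union is idempotent": union(A, A) == A,
    "union is commutative": union(A, B) == union(B, A),
    "selection distributes over union":
        select(union(A, B), P) == union(select(A, P), select(B, P)),
    "selection with true returns A": select(A, true) == A,
    "selection with false is empty": support(select(A, false)) == {},
}
for name, ok in checks.items():
    print(name, ok)
```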

Conclusion
Although the attribute-annotated approach has many advantages, it also has some disadvantages. First, annotating every attribute clearly requires more storage than simple tuple-level annotation. Second, since the proposed general annotation structure, complete distributive lattices with finite meets distributing over arbitrary joins, need not be linearly ordered, ordering tuples by decreasing annotation value is not always possible. Even if each lattice used in an annotation schema is linearly ordered, there is not necessarily a linear order on the annotation tuples. Hence, it may not be possible to list query answers (tuples) in decreasing order of relevance. Nevertheless, a suitable ordering of tuples may be established as soon as the lattice of annotation values, L = (L, ∨, ∧, 0, 1), is graded Stanley (1997). Recall that a graded or ranked poset is a partially ordered set equipped with a rank function ρ : L → Z compatible with the ordering, that is, ρ(x) < ρ(y) whenever x < y, and such that ρ(y) = ρ(x) + 1 whenever y covers x. Graded posets can be visualized by means of a Hasse diagram. Examples of graded posets are the natural numbers with the usual order, the Cartesian product of two or more copies of the natural numbers with the product order, where the rank is the sum of the coordinates, and the boolean lattice of finite subsets of a set, where the rank is the number of elements in the subset. Notice, however, that the ranking problem simply reflects a fact about ordered structures and not a flaw in the model.
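The role of a rank function can be sketched in Python for the product of two copies of the natural numbers. The annotation values and tuple names below are hypothetical; the sketch only shows that incomparable annotation tuples still receive a rank, so answers can be listed in decreasing rank.

```python
from itertools import permutations

def leq(x, y):
    """Product order on pairs of natural numbers."""
    return all(a <= b for a, b in zip(x, y))

def rank(x):
    """Rank function of the product order: the sum of the coordinates."""
    return sum(x)

# Hypothetical query answers with two-attribute annotation tuples.
answers = {"t1": (2, 1), "t2": (0, 3), "t3": (3, 3), "t4": (1, 0)}

# The rank is compatible with the order: x < y implies rank(x) < rank(y).
compatible = all(rank(x) < rank(y)
                 for x, y in permutations(answers.values(), 2)
                 if leq(x, y) and x != y)

# (2, 1) and (0, 3) are incomparable in the product order, yet both have rank 3.
incomparable = not leq(answers["t1"], answers["t2"]) and not leq(answers["t2"], answers["t1"])

# List the answers in decreasing rank; ties keep their original relative order.
ranked = sorted(answers, key=lambda t: rank(answers[t]), reverse=True)
print(ranked)
```

Ties such as `t1` and `t2` remain, so the rank yields a listing by levels rather than a strict linear order of the answers.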
The work on attribute-annotated models is very new and has, as far as we know, not been implemented yet Hajdinjak & Bierman (2011). A prototype implementation on top of existing relational database management systems is thus expected in the short term. Another direction for future research is the study of standard issues from relational databases in the general setting, including data dependencies, redundancy, normalization, database design, and optimization.