|
|
| {{Distinguish|Relation algebra}}
| |
| {{Cleanup|date=May 2011}}
| |
|
| |
|
| In [[computer science]], '''relational algebra''' is an offshoot of [[first-order logic]] and of [[algebra of sets]] concerned with [[Operation (mathematics)|operation]]s over [[finitary relation]]s. These relations are usually made more convenient to work with by identifying the components of a [[tuple]] by a name (called an attribute) rather than by a numeric column index; such a structure is called a [[relation (database)|relation]] in database terminology.
| |
|
| |
|
| The main application of relational algebra is providing a theoretical foundation for [[relational database]]s, particularly [[query language]]s for such databases, chief among which is [[SQL]].
| |
| | |
| == Introduction ==
| |
| Relational algebra received little attention outside of pure mathematics until the publication of [[Edgar F. Codd|E.F. Codd]]'s [[relational model|relational model of data]] in 1970. Codd proposed such an algebra as a basis for database query languages. (See section [[#Implementations|Implementations]].)
| |
| | |
| Both a named and an unnamed perspective are possible for relational algebra, depending on whether the tuples are endowed with component names or not. In the unnamed perspective, a [[tuple]] is simply a member of a [[Cartesian product]]. In the [[Tuple#Relational_model|named perspective]], tuples are functions from a finite set ''U'' of attributes (of the relation) to a domain of values (assumed distinct from ''U'').<ref>[[Serge Abiteboul]], [[Richard Hull (computer scientist)|Richard Hull]], [[Victor Vianu]], ''Foundations of databases'', Addison-Wesley, 1995, ISBN 0-201-53771-0, p. 29–33</ref> The relational algebras obtained from the two perspectives are equivalent.<ref>[[Serge Abiteboul]], [[Richard Hull (computer scientist)|Richard Hull]], [[Victor Vianu]], ''Foundations of databases'', Addison-Wesley, 1995, ISBN 0-201-53771-0, p. 59–63 and p. 71</ref> The typical undergraduate textbooks present only the named perspective though,<ref name="SilberschatzKorth2010">{{cite book|author1=Abraham Silberschatz|author2=Henry Korth|author3=S. Sudarshan|title=Database System Concepts|year=2010|publisher=McGraw-Hill Companies,Incorporated|isbn=978-0-07-352332-3|edition=6th}}</ref><ref name="RamakrishnanGehrke2003">{{cite book|author1=Raghu Ramakrishnan|author2=Johannes Gehrke|title=Database management systems|year=2003|publisher=McGraw-Hill|isbn=978-0-07-246563-1|edition=3rd}}</ref> and this article follows suit.
| |
| | |
| Relational algebra is essentially equivalent in expressive power to [[relational calculus]] (and thus [[first-order logic]]); this result is known as [[Codd's theorem]]. One must be careful to avoid a mismatch that may arise between the two languages because negation, applied to a formula of the calculus, constructs a formula that may be true on an infinite set of possible [[tuple]]s, while the difference operator of relational algebra always returns a finite result. To overcome these difficulties, Codd restricted the operands of relational algebra to finite [[relation (database)|relation]]s only and also proposed restricted support for negation (NOT) and disjunction (OR). Analogous restrictions are found in many other logic-based computer languages. Codd defined the term '''''relational completeness''''' to refer to a language that is complete with respect to first-order predicate calculus apart from the restrictions he proposed. In practice the restrictions have no adverse effect on the applicability of his relational algebra for database purposes.
| |
| | |
| == Primitive operations == | |
| As in any algebra, some operators are primitive and the others are derived in terms of the primitive ones. It is useful if the choice of primitive operators parallels the usual choice of primitive logical operators.
| |
| | |
| The five primitive operators of Codd's algebra are the ''[[selection (relational algebra)|selection]]'', the ''[[projection (relational algebra)|projection]]'', the ''[[Cartesian product]]'' (also called the ''cross product'' or ''cross join''), the ''[[set union]]'', and the ''[[set difference]]''. Another operator, ''[[Rename (relational algebra)|rename]]'', was not noted by Codd, but the need for it was shown by the inventors of [[ISBL]]. These six operators are fundamental in the sense that omitting any one of them causes a loss of expressive power. Many other operators have been defined in terms of these six. Among the most important are [[set intersection]], division, and the natural join. In fact, ISBL made a compelling case for replacing the Cartesian product with the natural join, of which the Cartesian product is a degenerate case.
| |
| | |
| Altogether, the operators of relational algebra have an expressive power identical to that of [[domain relational calculus]] or [[tuple relational calculus]]. However, for the reasons given in section [[#Introduction|Introduction]], relational algebra is less expressive than [[first-order predicate calculus]] without function symbols. Relational algebra corresponds to a subset of [[first-order logic]], namely [[Horn clause]]s without recursion and negation.
| |
| | |
| === Set operators ===
| |
| The relational algebra uses [[set union]], [[set difference]], and [[Cartesian product]] from [[set theory]], but adds additional constraints to these operators.
| |
| | |
| For set union and set difference, the two [[relation (database)|relation]]s involved must be ''union-compatible''—that is, the two relations must have the same [[set of attributes]]. Because [[set intersection]] can be defined in terms of set difference, the two relations involved in set intersection must also be union-compatible.
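The union-compatibility check and the definition of intersection in terms of difference can be sketched in Python. This is only an illustration under stated assumptions: a tuple is modelled as a frozen set of (attribute, value) pairs, and the helper names <code>row</code>, <code>header</code>, <code>difference</code> and <code>intersection</code> are hypothetical, not part of any library.

```python
def row(**attrs):
    """A tuple in the named perspective: a frozen set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def header(relation):
    """Attribute names of a non-empty relation (all of its tuples share one header)."""
    return {name for name, _ in next(iter(relation))}


def difference(r, s):
    """R − S, defined only for union-compatible relations."""
    if r and s and header(r) != header(s):
        raise ValueError("relations are not union-compatible")
    return r - s


def intersection(r, s):
    """R ∩ S expressed through set difference: R − (R − S)."""
    return difference(r, difference(r, s))


# Illustrative data (assumed, not taken from the article's tables).
r = {row(Name="Harry", DeptName="Finance"), row(Name="Sally", DeptName="Sales")}
s = {row(Name="Sally", DeptName="Sales"), row(Name="Tim", DeptName="Executive")}
common = intersection(r, s)
```

Because intersection is built from difference alone, it inherits the same union-compatibility requirement.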
| |
| | |
| For the Cartesian product to be defined, the two relations involved must have disjoint headers—that is, they must not have a common [[attribute name]].
| |
| | |
| In addition, the Cartesian product is defined differently from the one in [[Set (mathematics)|set]] theory in the sense that tuples are considered to be "shallow" for the purposes of the operation. That is, the Cartesian product of a set of ''n''-tuples with a set of ''m''-tuples yields a set of "flattened" (''n'' + ''m'')-tuples (whereas basic set theory would have prescribed a set of 2-tuples, each containing an ''n''-tuple and an ''m''-tuple). More formally, ''R'' × ''S'' is defined as follows:
| |
| :''R'' × ''S'' = {(''r''<sub>1</sub>, ''r''<sub>2</sub>, ..., ''r''<sub>''n''</sub>, ''s''<sub>1</sub>, ''s''<sub>2</sub>, ..., ''s''<sub>''m''</sub>) | (''r''<sub>1</sub>, ''r''<sub>2</sub>, ..., ''r''<sub>''n''</sub>) ∈ ''R'', (''s''<sub>1</sub>, ''s''<sub>2</sub>, ..., ''s''<sub>''m''</sub>) ∈ ''S''}
| |
| The cardinality of the Cartesian product is the product of the cardinalities of its factors, i.e., |''R'' × ''S''| = |''R''| × |''S''|.
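The "flattening" behaviour and the cardinality rule can be illustrated with a minimal Python sketch in the unnamed perspective, where tuples are plain Python tuples. The function name <code>cartesian_product</code> and the sample data are assumptions made for this example.

```python
from itertools import product


def cartesian_product(r, s):
    """Flattened product: concatenate each n-tuple of r with each m-tuple of s."""
    return {t + u for t, u in product(r, s)}


r = {(1, "a"), (2, "b")}        # a set of 2-tuples
s = {("x",), ("y",), ("z",)}    # a set of 1-tuples
rs = cartesian_product(r, s)    # a set of flat 3-tuples, not nested pairs
```

Here <code>rs</code> contains six flat 3-tuples such as <code>(1, "a", "x")</code>, whereas the set-theoretic product would contain nested pairs such as <code>((1, "a"), ("x",))</code>.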
| |
| | |
| === Projection ({{pi}}) ===
| |
| {{Main|Projection (relational algebra)}}
| |
| A '''projection''' is a [[unary operation]] written as <math>\pi_{a_1, \ldots,a_n}( R )</math> where <math>a_1,\ldots,a_n</math> is a set of attribute names. The result of such projection is defined as the [[Set (mathematics)|set]] that is obtained when all [[tuple]]s in ''R'' are restricted to the set <math>\{a_1,\ldots,a_n\}</math>.
| |
| | |
| This specifies the subset of columns (attributes of each tuple) to be retrieved. To obtain the names and phone numbers from an address book, the projection might be written <math>\pi_{\text{contactName, contactPhoneNumber}}( \text{addressBook} )</math>. The result of that projection would be a relation which contains only the contactName and contactPhoneNumber attributes for each unique entry in addressBook.
| |
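A projection can be sketched in Python as follows. The representation (a tuple as a frozen set of (attribute, value) pairs), the helper names, the extra <code>city</code> attribute and the sample entries are all assumptions made for illustration.

```python
def row(**attrs):
    """A tuple in the named perspective: a frozen set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def project(relation, names):
    """Restrict every tuple to the given attribute names.

    The result is a set, so tuples that become equal after the
    restriction collapse into one.
    """
    names = set(names)
    return {frozenset((a, v) for a, v in t if a in names) for t in relation}


# Hypothetical address book; "city" exists only to show duplicate elimination.
address_book = {
    row(contactName="Ann", contactPhoneNumber="555-0100", city="Boston"),
    row(contactName="Ann", contactPhoneNumber="555-0100", city="Salem"),
    row(contactName="Bob", contactPhoneNumber="555-0199", city="Boston"),
}
contacts = project(address_book, ["contactName", "contactPhoneNumber"])
```

The two entries for Ann differ only in <code>city</code>, so they yield a single tuple in the projection.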
| | |
| === Selection (''σ'') ===
| |
| {{Main|Selection (relational algebra)}}
| |
| A '''generalized selection''' is a [[unary operation]] written as <math>\sigma_\varphi(R)</math> where <math>\varphi</math> is a [[propositional formula]] that consists of [[atomic formula|atom]]s as allowed in the [[selection (relational algebra)|normal selection]] and the logical operators <math>\and</math> ([[logical conjunction|and]]), <math>\or</math> ([[logical disjunction|or]]) and <math>\lnot</math> ([[negation]]). This selection selects all those [[tuple]]s in ''R'' for which <math>\varphi</math> holds.
| |
| | |
| To obtain a listing of all friends or business associates in an address book, the selection might be written as <math>\sigma_{\text{isFriend = true} \or \text{isBusinessContact = true}}( \text{addressBook} )</math>. The result would be a relation containing every attribute of every unique record where isFriend is true or where isBusinessContact is true.
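The address-book selection can be sketched in Python by passing the formula as a predicate over one tuple. The tuple representation, the helper names and the sample entries are assumptions for this example.

```python
def row(**attrs):
    """A tuple in the named perspective: a frozen set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def select(relation, predicate):
    """Keep the tuples for which the predicate holds; headers are unchanged."""
    return {t for t in relation if predicate(dict(t))}


# Hypothetical address book.
address_book = {
    row(name="Ann", isFriend=True, isBusinessContact=False),
    row(name="Bob", isFriend=False, isBusinessContact=True),
    row(name="Cy", isFriend=False, isBusinessContact=False),
}
result = select(address_book, lambda t: t["isFriend"] or t["isBusinessContact"])
```

Only the entries for Ann and Bob satisfy the disjunction, and each keeps all of its attributes.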
| |
| | |
| In Codd's 1970 paper, selection is called restriction.<ref name="Codd1970">{{cite journal|first=E.F.|last=Codd|authorlink=E.F. Codd|title=A Relational Model of Data for Large Shared Data Banks|journal=[[Communications of the ACM]]|volume=13|issue=6|date=June 1970|pages=377–387|url=http://www.acm.org/classics/nov95/toc.html | doi = 10.1145/362384.362685}}</ref>
| |
| | |
| === Rename (''ρ'') ===
| |
| {{Main|Rename (relational algebra)}}
| |
| A '''rename''' is a [[unary operation]] written as <math>\rho_{a / b}(R)</math> where the result is identical to ''R'' except that the ''b'' attribute in all tuples is renamed to an ''a'' attribute. This is simply used to rename the attribute of a [[relation (database)|relation]] or the relation itself.
| |
| | |
| To rename the 'isFriend' attribute to 'isBusinessContact' in a relation, <math>\rho_{\text{isBusinessContact / isFriend} } ( \text{addressBook} )</math> might be used.
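A rename can be sketched in Python as a relabelling of one attribute in every tuple. The representation, the helper names and the sample data are assumptions; the sketch also assumes that the new name is not already in the header.

```python
def row(**attrs):
    """A tuple in the named perspective: a frozen set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def rename(relation, new, old):
    """ρ_{new/old}: replace the attribute name `old` by `new` in every tuple."""
    return {frozenset((new if a == old else a, v) for a, v in t) for t in relation}


# Hypothetical address book without an isBusinessContact attribute.
address_book = {row(name="Ann", isFriend=True), row(name="Bob", isFriend=False)}
renamed = rename(address_book, new="isBusinessContact", old="isFriend")
```

The values are untouched; only the attribute name changes.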
| |
| | |
| == Joins and join-like operators ==
| |
| <!-- insert overview text here... -->
| |
| | |
| === {{visible anchor|Natural join}} ({{visible anchor|⋈}}) ===
| |
| {{redirect|Natural join|the SQL implementation|Natural join (SQL)}}
| |
| Natural join (<math>\bowtie</math>) is a [[Binary relation|binary operator]] that is written as (''R'' <math>\bowtie</math> ''S'') where ''R'' and ''S'' are [[relation (database)|relation]]s.<ref>In [[Unicode]], the bowtie symbol is {{unicode|⋈}} (U+22C8).</ref> The result of the natural join is the set of all combinations of tuples in ''R'' and ''S'' that are equal on their common attribute names. For an example consider the tables ''Employee'' and ''Dept'' and their natural join:
| |
| | |
| {| style="margin: 0 auto;" cellpadding="20"
| |
| |- valign="top"
| |
| |
| |
| {| class="wikitable"
| |
| |+ ''Employee''
| |
| |-
| |
| ! Name !! EmpId !! DeptName
| |
| |-
| |
| | Harry || 3415 || Finance
| |
| |-
| |
| | Sally || 2241 || Sales
| |
| |-
| |
| | George || 3401 || Finance
| |
| |-
| |
| | Harriet || 2202 || Sales
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ ''Dept''
| |
| |-
| |
| ! DeptName !! Manager
| |
| |-
| |
| | Finance || George
| |
| |-
| |
| | Sales || Harriet
| |
| |-
| |
| | Production || Charles
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ ''Employee'' <math> \bowtie </math> ''Dept''
| |
| |-
| |
| ! Name !! EmpId !! DeptName !! Manager
| |
| |-
| |
| | Harry || 3415 || Finance || George
| |
| |-
| |
| | Sally || 2241 || Sales || Harriet
| |
| |-
| |
| | George || 3401 || Finance || George
| |
| |-
| |
| | Harriet || 2202 || Sales || Harriet
| |
| |}
| |
| |}
| |
| | |
| This can also be used to define [[composition of relations]]. For example, the composition of ''Employee'' and ''Dept'' is their join as shown above, projected on all but the common attribute ''DeptName''. In [[category theory]], the join is precisely the [[fiber product]].
| |
| | |
| The natural join is arguably one of the most important operators since it is the relational counterpart of logical AND. Note carefully that if the same variable appears in each of two predicates that are connected by AND, then that variable stands for the same thing and both appearances must always be substituted by the same value. In particular, natural join allows the combination of relations that are associated by a [[foreign key]]. In the example above, a foreign key presumably holds from ''Employee''.''DeptName'' to ''Dept''.''DeptName'', and the natural join of ''Employee'' and ''Dept'' then combines all employees with their departments. This works because the foreign key holds between attributes with the same name. If this is not the case, as with the foreign key from ''Dept''.''Manager'' to ''Employee''.''Name'', then these columns must be renamed before the natural join is taken. Such a join is sometimes also referred to as an '''equijoin''' (see ''θ''-join).
| |
| | |
| More formally the semantics of the natural join are defined as follows:
| |
| | |
| :<math>R \bowtie S = \left\{ t \cup s \ \vert \ t \in R \ \land \ s \in S \ \land \ \mathit{Fun}(t \cup s) \right\}</math>
| |
| | |
| where ''Fun'' is a [[Predicate (mathematics)|predicate]] that is true for a [[Relation (mathematics)|relation]] ''r'' [[if and only if]] ''r'' is a function. It is usually required that ''R'' and ''S'' must have at least one common attribute, but if this constraint is omitted, and ''R'' and ''S'' have no common attributes, then the natural join becomes exactly the Cartesian product.
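The set-builder definition translates almost literally into Python if a tuple is modelled as a frozen set of (attribute, value) pairs. That representation and the names <code>row</code>, <code>is_function</code> and <code>natural_join</code> are assumptions of this sketch; the data are the ''Employee'' and ''Dept'' tables from the example above.

```python
def row(**attrs):
    """A tuple in the named perspective: a frozen set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def is_function(pairs):
    """Fun: true iff no attribute name is paired with two different values."""
    return len({a for a, _ in pairs}) == len(pairs)


def natural_join(r, s):
    """R ⋈ S = { t ∪ s | t ∈ R, s ∈ S, Fun(t ∪ s) }."""
    return {t | u for t in r for u in s if is_function(t | u)}


employee = {
    row(Name="Harry", EmpId=3415, DeptName="Finance"),
    row(Name="Sally", EmpId=2241, DeptName="Sales"),
    row(Name="George", EmpId=3401, DeptName="Finance"),
    row(Name="Harriet", EmpId=2202, DeptName="Sales"),
}
dept = {
    row(DeptName="Finance", Manager="George"),
    row(DeptName="Sales", Manager="Harriet"),
    row(DeptName="Production", Manager="Charles"),
}
joined = natural_join(employee, dept)
```

When two tuples disagree on a shared attribute, their union contains two pairs with the same attribute name and fails the <code>is_function</code> test. With no shared attributes the test always succeeds, which gives the Cartesian product.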
| |
| | |
| The natural join can be simulated with Codd's primitives as follows. Assume that ''c''<sub>1</sub>,...,''c''<sub>''m''</sub> are the attribute names common to ''R'' and ''S'', ''r''<sub>1</sub>,...,''r''<sub>''n''</sub> are the attribute names unique to ''R'', and ''s''<sub>1</sub>,...,''s''<sub>''k''</sub> are the attribute names unique to ''S''. Furthermore, assume that the attribute names ''x''<sub>1</sub>,...,''x''<sub>''m''</sub> are neither in ''R'' nor in ''S''. In a first step we can now rename the common attribute names in ''S'':
| |
| | |
| :<math>T = \rho_{x_1/c_1,\ldots,x_m/c_m}(S) = \rho_{x_1/c_1}(\rho_{x_2/c_2}(\ldots\rho_{x_m/c_m}(S)\ldots))</math>
| |
| | |
| Then we take the Cartesian product and select the tuples that are to be joined:
| |
| | |
| :<math>P = \sigma_{c_1=x_1,\ldots,c_m=x_m}(R \times T) = \sigma_{c_1=x_1}(\sigma_{c_2=x_2}(\ldots\sigma_{c_m=x_m}(R \times T)\ldots))</math>
| |
| | |
| Finally we take a projection to get rid of the renamed attributes:
| |
| | |
| :<math>U = \pi_{r_1,\ldots,r_n,c_1,\ldots,c_m,s_1,\ldots,s_k}(P)</math>
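The three steps (rename, product followed by selection, projection) can be sketched in Python. The tuple representation, all helper names and the <code>"x_"</code> prefix used to build the fresh attribute names are assumptions of this illustration; the prefix is assumed not to clash with any existing attribute.

```python
def row(**attrs):
    """A tuple in the named perspective: a frozen set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def header(rel):
    return {a for t in rel for a, _ in t}


def rename(rel, mapping):
    return {frozenset((mapping.get(a, a), v) for a, v in t) for t in rel}


def product(r, s):
    return {t | u for t in r for u in s}  # headers assumed disjoint


def select(rel, pred):
    return {t for t in rel if pred(dict(t))}


def project(rel, names):
    return {frozenset((a, v) for a, v in t if a in names) for t in rel}


def join_via_primitives(r, s):
    common = header(r) & header(s)                 # c_1, ..., c_m
    fresh = {c: "x_" + c for c in common}          # x_i, assumed unused names
    t = rename(s, fresh)                           # T = ρ(S)
    p = select(product(r, t),                      # P = σ(R × T)
               lambda d: all(d[c] == d[x] for c, x in fresh.items()))
    return project(p, header(r) | header(s))       # U = π(P)


employee = {row(Name="Harry", DeptName="Finance"), row(Name="Sally", DeptName="Sales")}
dept = {row(DeptName="Finance", Manager="George"),
        row(DeptName="Sales", Manager="Harriet"),
        row(DeptName="Production", Manager="Charles")}
result = join_via_primitives(employee, dept)
```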
| |
| | |
| === ''θ''-join and equijoin ===
| |
| Consider tables ''Car'' and ''Boat'' which list models of cars and boats and their respective prices. Suppose a customer wants to buy a car and a boat, but she does not want to spend more money for the boat than for the car. The θ-join (<math>\bowtie</math><sub>θ</sub>) on the predicate ''CarPrice'' ≥ ''BoatPrice'' produces a table with all the possible options. When the condition equates two attributes of the same name, for example ''Price'', it may be specified as ''Price=Price'' or alternatively as ''(Price)'' itself.
| |
| | |
| {| style="margin: 0 auto;" cellpadding="20"
| |
| |- valign="top"
| |
| |
| |
| {| class="wikitable"
| |
| |+ ''Car''
| |
| |-
| |
| ! CarModel !! CarPrice
| |
| |-
| |
| | CarA || 20,000
| |
| |-
| |
| | CarB || 30,000
| |
| |-
| |
| | CarC || 50,000
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ ''Boat''
| |
| |-
| |
| ! BoatModel !! BoatPrice
| |
| |-
| |
| | Boat1 || 10,000
| |
| |-
| |
| | Boat2 || 40,000
| |
| |-
| |
| | Boat3 || 60,000
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ <math>\begin{matrix} Car \bowtie Boat \\ \scriptstyle CarPrice \geq BoatPrice \end{matrix}</math>
| |
| |-
| |
| ! CarModel !! CarPrice !! BoatModel !! BoatPrice
| |
| |-
| |
| | CarA || 20,000 || Boat1 || 10,000
| |
| |-
| |
| | CarB || 30,000 || Boat1 || 10,000
| |
| |-
| |
| | CarC || 50,000 || Boat1 || 10,000
| |
| |-
| |
| | CarC || 50,000 || Boat2 || 40,000
| |
| |}
| |
| |}
| |
| | |
| If we want to combine tuples from two relations where the combination condition is not simply the equality of shared attributes then it is convenient to have a more general form of join operator, which is the θ-join (or theta-join). The θ-join is a binary operator that is written as <math>\begin{matrix} R\ \bowtie\ S \\ a\ \theta\ b\end{matrix}</math> or <math>\begin{matrix} R\ \bowtie\ S \\ a\ \theta\ v\end{matrix}</math> where ''a'' and ''b'' are attribute names, θ is a [[binary relation]] in the set {<, ≤, =, >, ≥}, ''v'' is a value constant, and ''R'' and ''S'' are relations. The result of this operation consists of all combinations of tuples in ''R'' and ''S'' that satisfy the relation θ. The result of the θ-join is defined only if the headers of ''S'' and ''R'' are disjoint, that is, do not contain a common attribute.
| |
| | |
| The simulation of this operation in the fundamental operations is therefore as follows:
| |
| : ''R'' <math>\bowtie</math><sub>θ</sub> ''S'' = σ<sub>θ</sub>(''R'' × ''S'')
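The identity above can be sketched in Python using the ''Car'' and ''Boat'' tables. The tuple representation and the name <code>theta_join</code> are assumptions of this example; the comparison is passed in as a function from the standard <code>operator</code> module.

```python
import operator


def row(**attrs):
    """A tuple in the named perspective: a frozen set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def theta_join(r, s, a, theta, b):
    """σ_{a θ b}(R × S), for relations with disjoint headers."""
    product = {t | u for t in r for u in s}                        # R × S
    return {t for t in product if theta(dict(t)[a], dict(t)[b])}   # σ_θ


car = {row(CarModel="CarA", CarPrice=20000),
       row(CarModel="CarB", CarPrice=30000),
       row(CarModel="CarC", CarPrice=50000)}
boat = {row(BoatModel="Boat1", BoatPrice=10000),
        row(BoatModel="Boat2", BoatPrice=40000),
        row(BoatModel="Boat3", BoatPrice=60000)}

options = theta_join(car, boat, "CarPrice", operator.ge, "BoatPrice")
pairs = {(dict(t)["CarModel"], dict(t)["BoatModel"]) for t in options}
```

Passing <code>operator.eq</code> instead of <code>operator.ge</code> would give the corresponding equijoin.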
| |
| | |
| In case the operator θ is the equality operator (=) then this join is also called an '''equijoin'''.
| |
| | |
| Note, however, that a computer language that supports the natural join and rename operators does not need θ-join as well, as this can be achieved by selection from the result of a natural join (which degenerates to Cartesian product when there are no shared attributes).
| |
| | |
| === {{visible anchor|Semijoin}} (⋉)(⋊) ===
| |
| | |
| The left semijoin is a join similar to the natural join and written as ''R'' <math>\ltimes</math> ''S'' where ''R'' and ''S'' are [[relation (database)|relation]]s.<ref>In [[Unicode]], the ltimes symbol is {{unicode|⋉}} (U+22C9). The rtimes symbol is {{unicode|⋊}} (U+22CA)</ref> The result of this semijoin is the set of all tuples in ''R'' for which there is a tuple in ''S'' that is equal on their common attribute names. For an example consider the tables ''Employee'' and ''Dept'' and their semijoin:
| |
| | |
| {| style="margin: 0 auto;" cellpadding="20"
| |
| |- valign="top"
| |
| |
| |
| {| class="wikitable"
| |
| |+ ''Employee''
| |
| |-
| |
| ! Name !! EmpId !! DeptName
| |
| |-
| |
| | Harry || 3415 || Finance
| |
| |-
| |
| | Sally || 2241 || Sales
| |
| |-
| |
| | George || 3401 || Finance
| |
| |-
| |
| | Harriet || 2202 || Production
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ ''Dept''
| |
| |-
| |
| ! DeptName !! Manager
| |
| |-
| |
| | Sales || Bob
| |
| |-
| |
| | Sales || Thomas
| |
| |-
| |
| | Production || Katie
| |
| |-
| |
| | Production || Mark
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ ''Employee'' ⋉ ''Dept''
| |
| |-
| |
| ! Name !! EmpId !! DeptName
| |
| |-
| |
| | Sally || 2241 || Sales
| |
| |-
| |
| | Harriet || 2202 || Production
| |
| |}
| |
| |}
| |
| | |
| More formally the semantics of the semijoin can be defined as
| |
| follows:
| |
| | |
| : ''R'' <math>\ltimes</math> ''S'' = { ''t'' : ''t'' <math>\in</math> ''R'' <math>\and</math> <math>\exists</math>''s'' <math>\in</math> ''S''(''Fun'' (''t'' <math>\cup</math> ''s'')) }
| |
| | |
| where ''Fun''(''r'') is as in the definition of natural join.
| |
| | |
| The semijoin can be simulated using the natural join as follows. If ''a''<sub>1</sub>, ..., ''a''<sub>''n''</sub> are the attribute names of ''R'', then
| |
| : ''R'' <math>\ltimes</math> ''S'' = <math>\pi</math><sub>''a''<sub>1</sub>,..,''a''<sub>''n''</sub></sub>(''R'' <math>\bowtie</math> ''S'').
| |
| Since we can simulate the natural join with the basic operators it follows that this also holds for the semijoin.
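The simulation of the semijoin as a projection of the natural join can be sketched in Python with the ''Employee'' and ''Dept'' tables of this section. The tuple representation and the helper names are assumptions of this illustration.

```python
def row(**attrs):
    """A tuple in the named perspective: a frozen set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def natural_join(r, s):
    # t | u is a function iff no attribute name carries two different values.
    return {t | u for t in r for u in s
            if len({a for a, _ in t | u}) == len(t | u)}


def project(rel, names):
    return {frozenset((a, v) for a, v in t if a in names) for t in rel}


def semijoin(r, s):
    """R ⋉ S = π_{a_1,...,a_n}(R ⋈ S), with a_1..a_n the attributes of R."""
    names = {a for t in r for a, _ in t}
    return project(natural_join(r, s), names)


employee = {
    row(Name="Harry", EmpId=3415, DeptName="Finance"),
    row(Name="Sally", EmpId=2241, DeptName="Sales"),
    row(Name="George", EmpId=3401, DeptName="Finance"),
    row(Name="Harriet", EmpId=2202, DeptName="Production"),
}
dept = {
    row(DeptName="Sales", Manager="Bob"),
    row(DeptName="Sales", Manager="Thomas"),
    row(DeptName="Production", Manager="Katie"),
    row(DeptName="Production", Manager="Mark"),
}
result = semijoin(employee, dept)
```

Sally matches two ''Dept'' tuples, yet she appears only once in the result because the projection yields a set.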
| |
| | |
| === {{visible anchor|Antijoin}} (▷) ===
| |
| | |
| The antijoin, written as ''R'' <math>\triangleright</math> ''S'' where ''R'' and ''S'' are [[relation (database)|relation]]s, is similar to the semijoin, but the result of an antijoin is only those tuples in ''R'' for which there is ''no'' tuple in ''S'' that is equal on their common attribute names.<ref>In [[Unicode]], the Antijoin symbol is {{unicode|▷}} (U+25B7).</ref>
| |
| | |
| For an example consider the tables ''Employee'' and ''Dept'' and their
| |
| antijoin:
| |
| | |
| {| style="margin: 0 auto;" cellpadding="20"
| |
| |- valign="top"
| |
| |
| |
| {| class="wikitable"
| |
| |+ ''Employee''
| |
| |-
| |
| ! Name !! EmpId !! DeptName
| |
| |-
| |
| | Harry || 3415 || Finance
| |
| |-
| |
| | Sally || 2241 || Sales
| |
| |-
| |
| | George || 3401 || Finance
| |
| |-
| |
| | Harriet || 2202 || Production
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ ''Dept''
| |
| |-
| |
| ! DeptName !! Manager
| |
| |-
| |
| | Sales || Sally
| |
| |-
| |
| | Production || Harriet
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ ''Employee'' <math>\triangleright</math> ''Dept''
| |
| |-
| |
| ! Name !! EmpId !! DeptName
| |
| |-
| |
| | Harry || 3415 || Finance
| |
| |-
| |
| | George || 3401 || Finance
| |
| |}
| |
| |}
| |
| | |
| The antijoin is formally defined as follows:
| |
| | |
| : ''R'' <math>\triangleright</math> ''S'' = { ''t'' : ''t'' <math>\in</math> ''R'' <math>\and</math> <math>\neg\exists</math>''s'' <math>\in</math> ''S''(''Fun'' (''t'' <math>\cup</math> ''s'')) }
| |
| | |
| or
| |
| | |
| : ''R'' <math>\triangleright</math> ''S'' = { ''t'' : ''t'' <math>\in</math> ''R'', there is no tuple ''s'' of ''S'' that satisfies ''Fun'' (''t'' <math>\cup</math> ''s'') }
| |
| | |
| where ''Fun''(''r'') is as in the definition of natural join.
| |
| | |
| The antijoin can also be defined as the [[Complement (set theory)|complement]] of the semijoin, as follows:
| |
| : ''R'' <math>\triangleright</math> ''S'' = ''R'' − ''R'' <math>\ltimes</math> ''S''
| |
| Given this, the antijoin is sometimes called the anti-semijoin, and the antijoin operator is sometimes written as semijoin symbol with a bar above it, instead of <math>\triangleright</math>.
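The definition of the antijoin as ''R'' minus the semijoin can be sketched in Python with the tables of this section. The tuple representation and the helper names are assumptions of this example.

```python
def row(**attrs):
    """A tuple in the named perspective: a frozen set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def is_function(pairs):
    """Fun: true iff no attribute name is paired with two different values."""
    return len({a for a, _ in pairs}) == len(pairs)


def semijoin(r, s):
    """Tuples of R that have at least one matching tuple in S."""
    return {t for t in r if any(is_function(t | u) for u in s)}


def antijoin(r, s):
    """R ▷ S = R − (R ⋉ S): tuples of R with no matching tuple in S."""
    return r - semijoin(r, s)


employee = {
    row(Name="Harry", EmpId=3415, DeptName="Finance"),
    row(Name="Sally", EmpId=2241, DeptName="Sales"),
    row(Name="George", EmpId=3401, DeptName="Finance"),
    row(Name="Harriet", EmpId=2202, DeptName="Production"),
}
dept = {
    row(DeptName="Sales", Manager="Sally"),
    row(DeptName="Production", Manager="Harriet"),
}
result = antijoin(employee, dept)
```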
| |
| | |
| === {{visible anchor|Division}} (÷) ===
| |
| The division is a binary operation that is written as ''R'' ÷ ''S''. The result consists of the restrictions of tuples in ''R'' to the attribute names unique to ''R'', i.e., in the header of ''R'' but not in the header of ''S'', for which it holds that all their combinations with tuples in ''S'' are present in ''R''. For an example see the tables ''Completed'', ''DBProject'' and their division:
| |
| | |
| {| style="margin: 0 auto;" cellpadding="20"
| |
| |
| |
| {| class="wikitable"
| |
| |+ ''Completed''
| |
| |-
| |
| ! Student !! Task
| |
| |-
| |
| | Fred || Database1
| |
| |-
| |
| | Fred || Database2
| |
| |-
| |
| | Fred || Compiler1
| |
| |-
| |
| | Eugene || Database1
| |
| |-
| |
| | Eugene || Compiler1
| |
| |-
| |
| | Sarah || Database1
| |
| |-
| |
| | Sarah || Database2
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ ''DBProject''
| |
| |-
| |
| ! Task
| |
| |-
| |
| | Database1
| |
| |-
| |
| | Database2
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ ''Completed'' ÷ ''DBProject''
| |
| |-
| |
| ! Student
| |
| |-
| |
| | Fred
| |
| |-
| |
| | Sarah
| |
| |}
| |
| |}
| |
| | |
| If ''DBProject'' contains all the tasks of the Database project, then the result of the division above contains exactly the students who have completed both of the tasks in the Database project.
| |
| | |
| More formally the semantics of the division is defined as follows:
| |
| | |
| : ''R'' ÷ ''S'' = { ''t''[''a''<sub>1</sub>,...,''a''<sub>''n''</sub>] : ''t'' <math>\in</math> ''R'' <math>\wedge</math> <math>\forall</math>''s'' <math>\in</math> ''S'' ( (''t''[''a''<sub>1</sub>,...,''a''<sub>''n''</sub>] <math>\cup</math> ''s'') <math>\in</math> ''R'') }
| |
| | |
| where {''a''<sub>1</sub>,...,''a''<sub>''n''</sub>} is the set of attribute names unique to ''R'' and ''t''[''a''<sub>1</sub>,...,''a''<sub>''n''</sub>] is the restriction of ''t'' to this set. It is usually required that the attribute names in the header of ''S'' are a subset of those of ''R'' because otherwise the result of the operation will always be empty.
| |
| | |
| The simulation of the division with the basic operations is as follows. We assume that ''a''<sub>1</sub>,...,''a''<sub>''n''</sub> are the attribute names unique to ''R'' and ''b''<sub>1</sub>,...,''b''<sub>''m''</sub> are the attribute names of ''S''. In the first step we project ''R'' on its unique attribute names and construct all combinations with tuples in ''S'':
| |
| : ''T'' := π<sub>''a''<sub>1</sub>,...,''a''<sub>''n''</sub></sub>(''R'') × ''S''
| |
| | |
| In the example above, ''T'' would be a table in which every Student (Student being the only attribute unique to ''Completed'') is combined with every Task in ''DBProject''. Eugene, for instance, would have two rows in ''T'': (Eugene, Database1) and (Eugene, Database2).
| |
| | |
| In the next step we subtract ''R'' from this
| |
| [[relation (database)|relation]]:
| |
| : ''U'' := ''T'' − ''R''
| |
| Note that in ''U'' we have the possible combinations that "could have" been in ''R'', but weren't. So if we now take the projection on the attribute names unique to ''R'' then we have the restrictions of the tuples in ''R'' for which not all combinations with tuples in ''S'' were present in ''R'':
| |
| : ''V'' := π<sub>''a''<sub>1</sub>,...,''a''<sub>''n''</sub></sub>(''U'')
| |
| So what remains to be done is take the projection of ''R'' on its
| |
| unique attribute names and subtract those in ''V'':
| |
| : ''W'' := π<sub>''a''<sub>1</sub>,...,''a''<sub>''n''</sub></sub>(''R'') − ''V''
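The four steps ''T'', ''U'', ''V'' and ''W'' can be traced in a short Python sketch using the ''Completed'' and ''DBProject'' tables. The tuple representation and the helper names are assumptions of this illustration.

```python
def row(**attrs):
    """A tuple in the named perspective: a frozen set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def header(rel):
    return {a for t in rel for a, _ in t}


def project(rel, names):
    return {frozenset((a, v) for a, v in t if a in names) for t in rel}


def divide(r, s):
    unique = header(r) - header(s)       # a_1, ..., a_n
    pr = project(r, unique)
    t = {a | b for a in pr for b in s}   # T := π(R) × S
    u = t - r                            # U := T − R, the missing combinations
    v = project(u, unique)               # V := π(U), tuples that miss something
    return pr - v                        # W := π(R) − V


completed = {
    row(Student="Fred", Task="Database1"),
    row(Student="Fred", Task="Database2"),
    row(Student="Fred", Task="Compiler1"),
    row(Student="Eugene", Task="Database1"),
    row(Student="Eugene", Task="Compiler1"),
    row(Student="Sarah", Task="Database1"),
    row(Student="Sarah", Task="Database2"),
}
db_project = {row(Task="Database1"), row(Task="Database2")}
result = divide(completed, db_project)
```

In this run ''U'' contains only the combination (Eugene, Database2), so Eugene is the one student removed in the last step.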
| |
| | |
| == Common extensions ==
| |
| In practice the classical relational algebra described above is extended with various operations such as outer joins, aggregate functions and even transitive closure.<ref name="ÖzsuValduriez2011">{{cite book|author1=M. Tamer Özsu|author2=Patrick Valduriez|title=Principles of Distributed Database Systems|url=http://books.google.com/books?id=TOBaLQMuNV4C&pg=PA46|year=2011|publisher=Springer|isbn=978-1-4419-8833-1|page=46|edition=3rd}}</ref>
| |
| | |
| === Outer joins ===
| |
| {{Main|Outer join}}
| |
| Whereas the result of a join (or inner join) consists of tuples formed by combining matching tuples in the two operands, an outer join contains those tuples and additionally some tuples formed by extending an unmatched tuple in one of the operands by "fill" values for each of the attributes of the other operand. Note that outer joins are not considered part of the classical relational algebra discussed so far.<ref name="O'NeilO'Neil2001">{{cite book|author1=Patrick O'Neil|author2=Elizabeth O'Neil|title=Database: Principles, Programming, and Performance, Second Edition|url=http://books.google.com/books?id=UXh4qTpmO8QC&pg=PA120|year=2001|publisher=Morgan Kaufmann|isbn=978-1-55860-438-4|page=120}}</ref>
| |
| | |
| The operators defined in this section assume the existence of a ''null'' value, ''ω'', which we do not define, to be used for the fill values; in practice this corresponds to the [[Null (SQL)|NULL]] in SQL. In order to make subsequent selection operations on the resulting table meaningful, a semantic meaning needs to be assigned to nulls; in Codd's approach the propositional logic used by the selection is [[Null_(SQL)#Comparisons_with_NULL_and_the_three-valued_logic_.283VL.29|extended to a three-valued logic]], although we elide those details in this article.
| |
| | |
| Three outer join operators are defined: left outer join, right outer join, and full outer join. (The word "outer" is sometimes omitted.)
| |
| | |
| ==== {{visible anchor|Left outer join}} (⟕) ====
| |
| | |
| The left outer join is written as ''R'' ⟕ ''S'' where ''R'' and ''S'' are [[relation (database)|relation]]s.<ref>In [[Unicode]], the Left outer join symbol is {{unicode|⟕}} (U+27D5).</ref> The result of the left outer join is the set of all combinations of tuples in ''R'' and ''S'' that are equal on their common attribute names, in addition (loosely speaking) to tuples in ''R'' that have no matching tuples in ''S''.
| |
| | |
| For an example consider the tables ''Employee'' and ''Dept'' and their left outer join:
| |
| | |
| {| style="margin: 0 auto;" cellpadding="20"
| |
| |- valign="top"
| |
| |
| |
| {| class="wikitable"
| |
| |+ ''Employee''
| |
| |-
| |
| ! Name !! EmpId !! DeptName
| |
| |-
| |
| | Harry || 3415 || Finance
| |
| |-
| |
| | Sally || 2241 || Sales
| |
| |-
| |
| | George || 3401 || Finance
| |
| |-
| |
| | Harriet || 2202 || Sales
| |
| |-
| |
| | Tim || 1123 || Executive
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ ''Dept''
| |
| |-
| |
| ! DeptName !! Manager
| |
| |-
| |
| | Sales || Harriet
| |
| |-
| |
| | Production || Charles
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ ''Employee'' ⟕ ''Dept''
| |
| |-
| |
| ! Name !! EmpId !! DeptName !! Manager
| |
| |-
| |
| | Harry || 3415 || Finance || ω
| |
| |-
| |
| | Sally || 2241 || Sales || Harriet
| |
| |-
| |
| | George || 3401 || Finance || ω
| |
| |-
| |
| | Harriet || 2202 || Sales || Harriet
| |
| |-
| |
| | Tim || 1123 || Executive || ω
| |
| |}
| |
| |}
| |
| | |
| In the resulting relation, tuples of ''R'' that have no matching tuple in ''S'' take a ''null'' value, ''ω'', for each attribute that is unique to ''S''.
| |
| | |
| Since there are no tuples in ''Dept'' with a ''DeptName'' of ''Finance'' or ''Executive'', ''ω''s occur in the resulting relation where tuples in ''Employee'' have a ''DeptName'' of ''Finance'' or ''Executive''.
| |
| | |
| Let ''r''<sub>1</sub>, ''r''<sub>2</sub>, ..., ''r''<sub>''n''</sub> be the attributes of the relation ''R'' and let {(''ω'', ..., ''ω'')} be the singleton relation on the attributes that are ''unique'' to the relation ''S'' (those that are not attributes of ''R''). Then the left outer join can be described in terms of the natural join (and hence using basic operators) as follows:
|
| |
| :<math>(R \bowtie S) \cup ((R - \pi_{r_1, r_2, \dots, r_n}(R \bowtie S)) \times \{(\omega, \dots, \omega)\})</math>
| |
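The expression for the left outer join in terms of the natural join can be sketched in Python with the tables of this section. In this illustration the null value ''ω'' is represented by Python's <code>None</code>; that choice, the tuple representation and the helper names are all assumptions.

```python
def row(**attrs):
    """A tuple in the named perspective: a frozen set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def header(rel):
    return {a for t in rel for a, _ in t}


def natural_join(r, s):
    return {t | u for t in r for u in s
            if len({a for a, _ in t | u}) == len(t | u)}


def project(rel, names):
    return {frozenset((a, v) for a, v in t if a in names) for t in rel}


def left_outer_join(r, s):
    joined = natural_join(r, s)                              # R ⋈ S
    unmatched = r - project(joined, header(r))               # R − π(R ⋈ S)
    pad = frozenset((a, None) for a in header(s) - header(r))  # (ω, ..., ω)
    return joined | {t | pad for t in unmatched}


employee = {
    row(Name="Harry", EmpId=3415, DeptName="Finance"),
    row(Name="Sally", EmpId=2241, DeptName="Sales"),
    row(Name="George", EmpId=3401, DeptName="Finance"),
    row(Name="Harriet", EmpId=2202, DeptName="Sales"),
    row(Name="Tim", EmpId=1123, DeptName="Executive"),
}
dept = {row(DeptName="Sales", Manager="Harriet"),
        row(DeptName="Production", Manager="Charles")}
result = left_outer_join(employee, dept)
```

The right outer join follows by swapping the roles of the two operands in the same sketch.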
| | |
| ==== {{visible anchor|Right outer join}} (⟖) ====
| |
| The right outer join behaves almost identically to the left outer join, but the roles of the tables are switched.
| |
| | |
| The right outer join of [[relation (database)|relation]]s ''R'' and ''S'' is written as ''R'' ⟖ ''S''.<ref>In [[Unicode]], the Right outer join symbol is {{unicode|⟖}} (U+27D6).</ref> The result of the right outer join is the set of all combinations of tuples in ''R'' and ''S'' that are equal on their common attribute names, in addition to tuples in ''S'' that have no matching tuples in ''R''.
| |
| | |
| For example consider the tables ''Employee'' and ''Dept'' and their
| |
| right outer join:
| |
| | |
| {| style="margin: 0 auto;" cellpadding="20"
| |
| |- valign="top"
| |
| |
| |
| {| class="wikitable"
| |
| |+ ''Employee''
| |
| |-
| |
| ! Name !! EmpId !! DeptName
| |
| |-
| |
| | Harry || 3415 || Finance
| |
| |-
| |
| | Sally || 2241 || Sales
| |
| |-
| |
| | George || 3401 || Finance
| |
| |-
| |
| | Harriet || 2202 || Sales
| |
| |-
| |
| | Tim || 1123 || Executive
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ ''Dept''
| |
| |-
| |
| ! DeptName !! Manager
| |
| |-
| |
| | Sales || Harriet
| |
| |-
| |
| | Production || Charles
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ ''Employee'' ⟖ ''Dept''
| |
| |-
| |
| ! Name !! EmpId !! DeptName !! Manager
| |
| |-
| |
| | Sally || 2241 || Sales || Harriet
| |
| |-
| |
| | Harriet || 2202 || Sales || Harriet
| |
| |-
| |
| | ω || ω || Production || Charles
| |
| |}
| |
| |}
| |
| | |
| In the resulting relation, tuples in ''S'' that have no matching tuples in ''R'' on the common attribute names take a ''null'' value, ''ω'', for the attributes unique to ''R''.
| |
| | |
| Since there are no tuples in ''Employee'' with a ''DeptName'' of ''Production'', ''ω''s occur in the ''Name'' and ''EmpId'' attributes of the resulting tuple whose ''DeptName'' is ''Production''.
| |
| | |
| Let ''s''<sub>1</sub>, ''s''<sub>2</sub>, ..., ''s''<sub>''n''</sub> be the attributes of the relation ''S'' and let {(''ω'', ..., ''ω'')} be the singleton relation on the attributes that are ''unique'' to the relation ''R'' (those that are not attributes of ''S''). Then, as with the left outer join, the right outer join can be simulated using the natural join as follows:
| |
| | |
| :<math>(R \bowtie S) \cup (\{(\omega, \dots, \omega)\} \times (S - \pi_{s_1, s_2, \dots, s_n}(R \bowtie S)))</math>
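The simulation above can be sketched in Python. This is only an illustration under stated assumptions: relations are modelled as lists of dicts without duplicates, <code>None</code> stands in for ''ω'', and the helper names <code>natural_join</code> and <code>right_outer_join</code> are hypothetical, not part of any library.

```python
OMEGA = None  # stand-in for the null value ω (an assumption of this sketch)

def natural_join(r, s):
    """Natural join of two relations given as lists of dicts."""
    return [
        {**t, **u}
        for t in r
        for u in s
        if all(t[a] == u[a] for a in t.keys() & u.keys())
    ]

def right_outer_join(r, s, r_attrs, s_attrs):
    """(R ⋈ S) ∪ ({(ω, ..., ω)} × (S − π_{s1..sn}(R ⋈ S)))."""
    joined = natural_join(r, s)
    # π_{s1..sn}(R ⋈ S): the S-tuples that found a partner in R
    matched = [{a: t[a] for a in s_attrs} for t in joined]
    unmatched = [u for u in s if u not in matched]
    # ω for every attribute that is unique to R
    pad = {a: OMEGA for a in r_attrs if a not in s_attrs}
    return joined + [{**pad, **u} for u in unmatched]

employee = [
    {"Name": "Harry", "EmpId": 3415, "DeptName": "Finance"},
    {"Name": "Sally", "EmpId": 2241, "DeptName": "Sales"},
    {"Name": "George", "EmpId": 3401, "DeptName": "Finance"},
    {"Name": "Harriet", "EmpId": 2202, "DeptName": "Sales"},
    {"Name": "Tim", "EmpId": 1123, "DeptName": "Executive"},
]
dept = [
    {"DeptName": "Sales", "Manager": "Harriet"},
    {"DeptName": "Production", "Manager": "Charles"},
]

result = right_outer_join(
    employee, dept, ["Name", "EmpId", "DeptName"], ["DeptName", "Manager"]
)
```

With the ''Employee'' and ''Dept'' tables from the example, <code>result</code> holds the two ''Sales'' rows plus one padded ''Production'' row.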
| |
| | |
| ==== {{visible anchor|Full outer join}} (⟗) ====
| |
| | |
| The '''outer join''' or '''full outer join''' in effect combines the results of the left and right outer joins.
| |
| | |
| The full outer join is written as ''R'' ⟗ ''S'' where ''R'' and ''S'' are [[relation (database)|relation]]s.<ref>In [[Unicode]], the Full Outer join symbol is {{unicode|⟗}} (U+27D7).</ref> The result of the full outer join is the set of all combinations of tuples in ''R'' and ''S'' that are equal on their common attribute names, in addition to tuples in ''S'' that have no matching tuples in ''R'' and tuples in ''R'' that have no matching tuples in ''S'' in their common attribute names.
| |
| | |
| For example, consider the tables ''Employee'' and ''Dept'' and their full outer join:
| |
| | |
| {| style="margin: 0 auto;" cellpadding="20"
| |
| |- valign="top"
| |
| |
| |
| {| class="wikitable"
| |
| |+ ''Employee''
| |
| |-
| |
| ! Name !! EmpId !! DeptName
| |
| |-
| |
| | Harry || 3415 || Finance
| |
| |-
| |
| | Sally || 2241 || Sales
| |
| |-
| |
| | George || 3401 || Finance
| |
| |-
| |
| | Harriet || 2202 || Sales
| |
| |-
| |
| | Tim || 1123 || Executive
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ ''Dept''
| |
| |-
| |
| ! DeptName !! Manager
| |
| |-
| |
| | Sales || Harriet
| |
| |-
| |
| | Production || Charles
| |
| |}
| |
| ||
| |
| {| class="wikitable"
| |
| |+ ''Employee'' ⟗ ''Dept''
| |
| |-
| |
| ! Name !! EmpId !! DeptName !! Manager
| |
| |-
| |
| | Harry || 3415 || Finance || ω
| |
| |-
| |
| | Sally || 2241 || Sales || Harriet
| |
| |-
| |
| | George || 3401 || Finance || ω
| |
| |-
| |
| | Harriet || 2202 || Sales || Harriet
| |
| |-
| |
| | Tim || 1123 || Executive || ω
| |
| |-
| |
| | ω || ω || Production || Charles
| |
| |}
| |
| |}
| |
| | |
| In the resulting relation, tuples in ''R'' that have no matching tuples in ''S'' on the common attribute names take a ''null'' value, ''ω'', for the attributes unique to ''S''. Likewise, tuples in ''S'' that have no matching tuples in ''R'' take a ''null'' value, ''ω'', for the attributes unique to ''R''.
| |
| | |
| The full outer join can be simulated using the left and right outer joins (and hence the natural join and set union) as follows:
| |
| | |
| :''R'' ⟗ ''S'' = (''R'' ⟕ ''S'') <math>\cup</math> (''R'' ⟖ ''S'')
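The identity above can be checked on the example tables with a short Python sketch. It assumes set semantics (each tuple is a <code>frozenset</code> of attribute/value pairs), uses <code>None</code> for ''ω'', and obtains the right outer join by swapping the operands of the left outer join, which relies on the natural join being commutative in the named perspective. All function names are hypothetical.

```python
OMEGA = None  # stand-in for ω (an assumption of this sketch)

def row(**kv):
    """A tuple in the named perspective: a frozenset of (attribute, value)."""
    return frozenset(kv.items())

def natural_join(r, s):
    out = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            if all(dt[a] == du[a] for a in dt.keys() & du.keys()):
                out.add(frozenset({**dt, **du}.items()))
    return out

def project(r, attrs):
    return {frozenset((a, v) for a, v in t if a in attrs) for t in r}

def left_outer_join(r, s, r_attrs, s_attrs):
    joined = natural_join(r, s)
    unmatched = r - project(joined, r_attrs)      # R − π_r(R ⋈ S)
    pad = {a: OMEGA for a in s_attrs - r_attrs}   # ω on attributes unique to S
    return joined | {frozenset({**pad, **dict(t)}.items()) for t in unmatched}

def right_outer_join(r, s, r_attrs, s_attrs):
    # Same construction with the roles of R and S switched.
    return left_outer_join(s, r, s_attrs, r_attrs)

employee = {
    row(Name="Harry", EmpId=3415, DeptName="Finance"),
    row(Name="Sally", EmpId=2241, DeptName="Sales"),
    row(Name="George", EmpId=3401, DeptName="Finance"),
    row(Name="Harriet", EmpId=2202, DeptName="Sales"),
    row(Name="Tim", EmpId=1123, DeptName="Executive"),
}
dept = {
    row(DeptName="Sales", Manager="Harriet"),
    row(DeptName="Production", Manager="Charles"),
}
e_attrs, d_attrs = {"Name", "EmpId", "DeptName"}, {"DeptName", "Manager"}

# R ⟗ S = (R ⟕ S) ∪ (R ⟖ S)
full = (left_outer_join(employee, dept, e_attrs, d_attrs)
        | right_outer_join(employee, dept, e_attrs, d_attrs))
```

Because the relations are sets, the two matching ''Sales'' tuples produced by both outer joins appear only once in the union.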
| |
| | |
| === Operations for domain computations ===
| |
| There is nothing in relational algebra introduced so far that would allow computations on the data domains (other than evaluation of propositional expressions involving equality). For example, it is not possible using only the algebra introduced so far to write an expression that would multiply the numbers from two columns, e.g. a unit price with a quantity to obtain a total price. Practical query languages have such facilities: the SQL SELECT allows arithmetic operations to define new columns in the result, as in <code>SELECT unit_price * quantity AS total_price FROM t</code>, and a similar facility is provided more explicitly by [[Tutorial D]]'s <code>EXTEND</code> keyword.<ref name="Date2011">{{cite book|author=C. J. Date|title=SQL and Relational Theory: How to Write Accurate SQL Code|url=http://books.google.com/books?id=WuZGD5tBfMwC&pg=PA133|year=2011|publisher=O'Reilly Media, Inc.|isbn=978-1-4493-1974-8|pages=133–135}}</ref> In database theory, this is called '''extended projection'''.<ref name="Garcia-MolinaUllman2009"/>{{rp|213}}
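As a rough illustration of extended projection, the following Python sketch adds a computed attribute to each tuple and then projects onto it. The function <code>extend</code> and the sample rows are assumptions made for this example; they mirror the SQL statement above but are not taken from any particular system.

```python
def extend(relation, new_attr, fn):
    """Add a computed attribute to every tuple (each tuple is a dict)."""
    return [{**t, new_attr: fn(t)} for t in relation]

# Hypothetical contents of the table t used in the SQL example.
t = [
    {"unit_price": 5, "quantity": 4},
    {"unit_price": 10, "quantity": 3},
]

extended = extend(t, "total_price", lambda row: row["unit_price"] * row["quantity"])
# Keep only the computed column, as the SELECT list in the SQL example does.
total_price = [{"total_price": row["total_price"]} for row in extended]
```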
| |
| | |
| ==== Aggregation ====
| |
| Furthermore, computing various functions on a column, such as summing up its elements, is also not possible using the relational algebra introduced so far. There are five [[aggregate function]]s that are included with most relational database systems: Sum, Count, Average, Maximum and Minimum. In relational algebra the aggregation operation over a schema (A<sub>1</sub>, A<sub>2</sub>, ... A<sub>n</sub>) is written as follows:
| |
| | |
| G<sub>1</sub>, G<sub>2</sub>, ..., G<sub>m</sub> ''g''<sub> f<sub>1</sub>(A<sub>1</sub>'), f<sub>2</sub>(A<sub>2</sub>'), ..., f<sub>k</sub>(A<sub>k</sub>')</sub> ('''r''')
| |
| | |
| where each A<sub>j</sub>', 1 ≤ j ≤ k, is one of the original attributes A<sub>i</sub>, 1 ≤ i ≤ n.
| |
| | |
| The attributes preceding the ''g'' are grouping attributes, which function like a "group by" clause in SQL. Then there are an arbitrary number of aggregation functions applied to individual attributes. The operation is applied to an arbitrary relation '''r'''. The grouping attributes are optional, and if they are not supplied, the aggregation functions are applied across the entire relation to which the operation is applied.
| |
| | |
| Let's assume that we have a table named Account with three columns, namely Account_Number, Branch_Name and Balance. We wish to find the maximum balance of each branch. This is accomplished by <sub>Branch_Name</sub> ''g''<sub>Max(Balance)</sub>(Account). To find the highest balance of all accounts regardless of branch, we could simply write ''g''<sub>Max(Balance)</sub>(Account).
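The two Account queries can be sketched in Python as follows. The account rows are invented for illustration, and <code>aggregate</code> is a hypothetical helper that handles a single aggregation function; an empty list of grouping attributes puts the whole relation into one group.

```python
def aggregate(relation, group_by, agg_attr, fn):
    """Group tuples by the grouping attributes, then apply fn to agg_attr."""
    groups = {}
    for t in relation:
        key = tuple(t[a] for a in group_by)  # () when there is no grouping
        groups.setdefault(key, []).append(t[agg_attr])
    return {key: fn(values) for key, values in groups.items()}

# Hypothetical contents of the Account table.
account = [
    {"Account_Number": "A-1", "Branch_Name": "Downtown", "Balance": 500},
    {"Account_Number": "A-2", "Branch_Name": "Downtown", "Balance": 900},
    {"Account_Number": "A-3", "Branch_Name": "Uptown", "Balance": 700},
]

max_per_branch = aggregate(account, ["Branch_Name"], "Balance", max)
max_overall = aggregate(account, [], "Balance", max)
```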
| |
| | |
| === Transitive closure ===
| |
| Although relational algebra seems powerful enough for most practical purposes, there are some simple and natural operators on [[relation (database)|relation]]s which cannot be expressed by relational algebra. One of them is the [[transitive closure]] of a binary relation. Given a domain ''D'', let binary relation ''R'' be a subset of ''D''×''D''. The transitive closure ''R<sup>+</sup>'' of ''R'' is the smallest subset of ''D''×''D'' containing ''R'' which satisfies the following condition:
| |
| | |
| :<math>\forall x \forall y \forall z \left( (x,y) \in R^+ \wedge (y,z) \in R^+ \Rightarrow (x,z) \in R^+ \right)</math>
| |
| | |
| There is no relational algebra expression ''E''(''R'') taking ''R'' as a variable argument which produces ''R''<sup>+</sup>. This can be proved using the fact that, given a relational expression ''E'' for which it is claimed that ''E''(''R'') = ''R''<sup>+</sup>, where ''R'' is a variable, we can always find an instance '''''r''''' of ''R'' (and a corresponding domain '''d''') such that ''E''('''''r''''') ≠ '''''r'''''<sup>+</sup>.<ref>{{cite journal|title=Universality of data retrieval languages|journal=Proceedings of the 6th ACM SIGACT-SIGPLAN symposium on Principles of programming languages|year=1979|first=Alfred V.|last=Aho|coauthors=Jeffrey D. Ullman|volume=|issue=|pages=110–119|id=|format=|accessdate=|doi=10.1145/567752.567763}}</ref>
| |
| | |
| SQL, however, has officially supported such [[Hierarchical and recursive queries in SQL|fixpoint queries]] since 1999, and it had vendor-specific extensions in this direction well before that.
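For comparison, a fixpoint computation of the transitive closure is easy to write in a general-purpose language. The Python sketch below assumes a binary relation stored as a set of pairs; the names <code>compose</code> and <code>transitive_closure</code> are hypothetical. The number of loop iterations depends on the data, which gives some intuition for why a single fixed algebra expression is not enough.

```python
def compose(r, s):
    """Relational composition: (x, z) whenever (x, y) in r and (y, z) in s."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}

def transitive_closure(r):
    """Smallest transitive relation containing r, by fixpoint iteration."""
    closure = set(r)
    while True:
        bigger = closure | compose(closure, r)
        if bigger == closure:   # nothing new was added: fixpoint reached
            return closure
        closure = bigger

chain = {(1, 2), (2, 3), (3, 4)}
closed = transitive_closure(chain)
```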
| |
| | |
| == Use of algebraic properties for query optimization ==
| |
| {{unreferenced section|date=June 2013}}
| |
| [[relational query|Queries]] can be represented as a [[tree (data structure)|tree]], where
| |
| * the internal nodes are operators,
| |
| * leaves are [[relation (database)|relation]]s,
| |
| * subtrees are subexpressions.
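One possible way to model such a tree is sketched below in Python. The class names <code>Rel</code>, <code>Select</code> and <code>Product</code> are assumptions chosen for this example, and only two operators are modelled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rel:            # leaf: a named relation
    name: str

@dataclass(frozen=True)
class Select:         # internal node: selection with a condition label
    cond: str
    child: object

@dataclass(frozen=True)
class Product:        # internal node: cross product
    left: object
    right: object

def subexpressions(node):
    """Yield every subtree, i.e. every subexpression, in pre-order."""
    yield node
    if isinstance(node, Select):
        yield from subexpressions(node.child)
    elif isinstance(node, Product):
        yield from subexpressions(node.left)
        yield from subexpressions(node.right)

# σ_A(R × P) as a tree
tree = Select("A", Product(Rel("R"), Rel("P")))
subtrees = list(subexpressions(tree))
```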
| |
| | |
| Our primary goal is to transform expression trees into equivalent expression trees, where the average size of the relations yielded by subexpressions in the tree is smaller than it was before the [[query optimization|optimization]]. Our secondary goal is to try to form common subexpressions within a single query, or if there is more than one query being evaluated at the same time, in all of those queries. The rationale behind the second goal is that it is enough to compute common subexpressions once, and the results can be used in all queries that contain that subexpression.
| |
| | |
| Here we present a set of rules that can be used in such transformations.
| |
| | |
| === Selection ===
| |
| | |
| Rules about selection operators play the most important role in query optimization. Selection is an operator that very effectively decreases the number of rows in its operand, so if we manage to move the selections in an expression tree towards the leaves, the internal [[relation (database)|relation]]s (yielded by subexpressions) will likely shrink.
| |
| | |
| ==== Basic selection properties ====
| |
| | |
| Selection is [[idempotent]] (multiple applications of the same selection have no additional effect beyond the first one), and [[commutative]] (the order selections are applied in has no effect on the eventual result).
| |
| | |
| #<math>\sigma_{A}(R)=\sigma_{A}\sigma_{A}(R)\,\!</math>
| |
| #<math>\sigma_{A}\sigma_{B}(R)=\sigma_{B}\sigma_{A}(R)\,\!</math>
| |
| | |
| ==== Breaking up selections with complex conditions ====
| |
| | |
| A selection whose condition is a [[Logical conjunction|conjunction]] of simpler conditions is equivalent to a sequence of selections with those same individual conditions, and a selection whose condition is a [[Logical disjunction|disjunction]] is equivalent to a union of selections. These identities can be used to merge selections so that fewer selections need to be evaluated, or to split them so that the component selections may be moved or optimized separately.
| |
| | |
| #<math>\sigma_{A \land B}(R)=\sigma_{A}(\sigma_{B}(R))=\sigma_{B}(\sigma_{A}(R))</math>
| |
| #<math>\sigma_{A \lor B}(R)=\sigma_{A}(R)\cup\sigma_{B}(R)</math>
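Both identities can be checked on a small relation. In this Python sketch the relation is a set of pairs and the conditions ''A'' and ''B'' are arbitrary predicates picked for illustration; <code>select</code> is a hypothetical helper.

```python
def select(pred, r):
    """σ_pred(r) on a relation stored as a set of tuples."""
    return {t for t in r if pred(t)}

R = {(x, y) for x in range(4) for y in range(4)}
A = lambda t: t[0] > 1        # condition on the first attribute
B = lambda t: t[1] % 2 == 0   # condition on the second attribute

conj = select(lambda t: A(t) and B(t), R)
conj_split = select(A, select(B, R))
conj_split_swapped = select(B, select(A, R))

disj = select(lambda t: A(t) or B(t), R)
disj_split = select(A, R) | select(B, R)
```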
| |
| | |
| ==== Selection and cross product ====
| |
| Cross product is the costliest operator to evaluate. If the input [[relation (database)|relation]]s have ''N'' and ''M'' rows, the result will contain <math>NM</math> rows. Therefore it is very important to do our best to decrease the size of both operands before applying the cross product operator.
| |
| | |
| This can be done effectively if the cross product is followed by a selection operator, e.g. <math>\sigma_{A}</math>(''R'' × ''P''). Considering the definition of join, this is the most likely case. If the cross product is not followed by a selection operator, we can try to push down a selection from higher levels of the expression tree using the other selection rules.
| |
| | |
| In the above case we break up condition ''A'' into conditions ''B'', ''C'' and ''D'' using the split rules about complex selection conditions, so that ''A'' = ''B'' <math>\wedge</math> ''C'' <math>\wedge</math> ''D'', where ''B'' contains attributes only from ''R'', ''C'' contains attributes only from ''P'', and ''D'' contains the part of ''A'' that refers to attributes from both ''R'' and ''P''. Note that any of ''B'', ''C'' or ''D'' may be empty. Then the following holds:
| |
| :<math>\sigma_{A}(R \times P) = \sigma_{B \wedge C \wedge D}(R \times P) = \sigma_{D}(\sigma_{B}(R) \times \sigma_{C}(P))</math>
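The effect of this rule on the size of the intermediate result can be illustrated with a small Python sketch. The relations and the conditions ''B'', ''C'' and ''D'' below are made up for the example; each tuple is a pair of an identifier and a key.

```python
from itertools import product

R = {(i, i % 3) for i in range(10)}   # tuples (id, key)
P = {(j, j % 3) for j in range(10)}   # tuples (id, key)

B = lambda r: r[0] < 5                # refers only to attributes of R
C = lambda p: p[0] >= 5               # refers only to attributes of P
D = lambda r, p: r[1] == p[1]         # refers to attributes of both

# σ_{B ∧ C ∧ D}(R × P): the full cross product is built first
naive_product = set(product(R, P))
naive = {(r, p) for r, p in naive_product if B(r) and C(p) and D(r, p)}

# σ_D(σ_B(R) × σ_C(P)): the operands are shrunk before the cross product
pushed_product = set(product({r for r in R if B(r)}, {p for p in P if C(p)}))
pushed = {(r, p) for r, p in pushed_product if D(r, p)}
```

Both forms yield the same tuples, but the intermediate cross product has 100 tuples in the first form and 25 in the second.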
| | |
| ==== Selection and set operators ====
| |
| Selection is [[distributive]] over the set difference, intersection, and union operators. The following three rules are used to push selection below set operations in the expression tree. Note that for the set difference and intersection operators it is possible to apply the selection operator to only one of the operands after the transformation. This can make sense when one of the operands is small and the overhead of evaluating the selection operator outweighs the benefits of using a smaller [[relation (database)|relation]] as an operand.
| |
| | |
| #<math>\sigma_{A}(R\setminus P)=\sigma_{A}(R)\setminus \sigma_{A}(P) =\sigma_{A}(R)\setminus P</math>
| |
| #<math>\sigma_{A}(R\cup P)=\sigma_{A}(R)\cup\sigma_{A}(P)</math>
| |
| #<math>\sigma_{A}(R\cap P)=\sigma_{A}(R)\cap\sigma_{A}(P)=\sigma_{A}(R)\cap P=R\cap \sigma_{A}(P)</math>
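The three rules can be verified on a small example. In this Python sketch the relations are sets of integers standing in for single-attribute tuples, and the condition ''A'' ("is even") is chosen arbitrarily.

```python
def select(pred, r):
    """σ_pred(r) on a relation stored as a set."""
    return {t for t in r if pred(t)}

R = set(range(0, 12))
P = set(range(6, 18))
A = lambda t: t % 2 == 0

# Set difference: the selection may be applied to both operands or only to R.
diff = select(A, R - P)
diff_both = select(A, R) - select(A, P)
diff_left_only = select(A, R) - P

# Union: the selection must be applied to both operands.
union = select(A, R | P)
union_both = select(A, R) | select(A, P)

# Intersection: applying the selection to either operand is enough.
inter = select(A, R & P)
inter_both = select(A, R) & select(A, P)
inter_left_only = select(A, R) & P
inter_right_only = R & select(A, P)
```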
| |
| | |
| ==== Selection and projection ====
| |
| | |
| Selection commutes with projection if and only if the fields referenced in the selection condition are a subset of the fields in the projection. Performing selection before projection may be useful if the operand is a cross product or join. In other cases, if the selection condition is relatively expensive to compute, moving selection outside the projection may reduce the number of tuples which must be tested (since projection may produce fewer tuples due to the elimination of duplicates resulting from omitted fields).
| |
| | |
| :<math>\pi_{a_1, \ldots ,a_n}(\sigma_A( R )) = \sigma_A(\pi_{a_1, \ldots,a_n}( R ))\text{ where fields in }A \subseteq \{a_1,\ldots,a_n\}</math>
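A small Python check of this rule, under the assumption that tuples are positional and that the selection condition refers only to an attribute kept by the projection (position 0 here). The helper <code>project</code> and the sample data are hypothetical.

```python
def project(r, positions):
    """π onto the given positions; duplicates vanish because r is a set."""
    return {tuple(t[i] for i in positions) for t in r}

R = {(1, "x", 10), (1, "y", 10), (2, "x", 20)}
cond = lambda t: t[0] == 1            # uses only position 0, which is kept

select_then_project = project({t for t in R if cond(t)}, (0, 2))
projected = project(R, (0, 2))        # duplicate elimination: 3 tuples become 2
project_then_select = {t for t in projected if cond(t)}
```

In the second ordering the condition is tested on two tuples instead of three, which is the saving described above.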
| |
| | |
| === Projection ===
| |
| | |
| ==== Basic projection properties ====
| |
| | |
| Projection is idempotent, so that a series of (valid) projections is equivalent to the outermost projection.
| |
| | |
| :<math>\pi_{a_1, \ldots , a_n}(\pi_{b_1,\ldots , b_m}(R)) = \pi_{a_1, \ldots , a_n}(R)\text{ where }\{a_1, \ldots , a_n\} \subseteq \{b_1, \ldots , b_m\}</math>
| |
| | |
| ==== Projection and set operators ====
| |
| | |
| Projection is [[distributive]] over set union.
| |
| | |
| :<math>\pi_{a_1, \ldots, a_n}(R \cup P) = \pi_{a_1, \ldots, a_n}(R) \cup \pi_{a_1, \ldots, a_n}(P). \, </math>
| |
| | |
| Projection does not distribute over intersection and set difference. Counterexamples are given by:
| |
| | |
| :<math>\pi_A(\{ \langle A=a, B=b \rangle \} \cap \{ \langle A=a, B=b' \rangle \}) = \emptyset</math>
| |
| | |
| :<math>\pi_A(\{ \langle A=a, B=b \rangle \}) \cap \pi_A(\{ \langle A=a, B=b' \rangle \}) = \{ \langle A=a \rangle \}</math>
| |
| | |
| and
| |
| | |
| :<math>\pi_A(\{ \langle A=a, B=b \rangle \} \setminus \{ \langle A=a, B=b' \rangle \}) = \{ \langle A=a \rangle \}</math>
| |
| | |
| :<math>\pi_A(\{ \langle A=a, B=b \rangle \}) \setminus \pi_A(\{ \langle A=a, B=b' \rangle \}) = \emptyset\,,</math>
| |
| where ''b'' is assumed to be distinct from ''b{{'}}''.
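The counterexamples can be reproduced directly. The Python sketch below represents each tuple as a pair (''A'', ''B'') and uses the strings <code>"b"</code> and <code>"b'"</code> for the two distinct values; <code>project_A</code> is a hypothetical helper.

```python
def project_A(r):
    """π_A on a relation whose tuples are pairs (A, B)."""
    return {(a,) for (a, b) in r}

R = {("a", "b")}
P = {("a", "b'")}

inter_inside = project_A(R & P)                # projection of the intersection
inter_outside = project_A(R) & project_A(P)    # intersection of the projections
diff_inside = project_A(R - P)                 # projection of the difference
diff_outside = project_A(R) - project_A(P)     # difference of the projections
union_inside = project_A(R | P)
union_outside = project_A(R) | project_A(P)
```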
| |
| | |
| === Rename ===
| |
| | |
| ==== Basic rename properties ====
| |
| | |
| Successive renames of a variable can be collapsed into a single rename. Rename operations which have no variables in common can be arbitrarily reordered with respect to one another, which can be exploited to make successive renames adjacent so that they can be collapsed.
| |
| | |
| #<math>\rho_{a / b}(\rho_{b / c}(R)) = \rho_{a / c}(R)\,\!</math>
| |
| #<math>\rho_{a / b}(\rho_{c / d}(R)) = \rho_{c / d}(\rho_{a / b}(R))\,\!</math>
| |
| | |
| ==== Rename and set operators ====
| |
| | |
| Rename is distributive over set difference, union, and intersection.
| |
| | |
| #<math>\rho_{a / b}(R \setminus P) = \rho_{a / b}(R) \setminus \rho_{a / b}(P)</math>
| |
| #<math>\rho_{a / b}(R \cup P) = \rho_{a / b}(R) \cup \rho_{a / b}(P)</math>
| |
| #<math>\rho_{a / b}(R \cap P) = \rho_{a / b}(R) \cap \rho_{a / b}(P)</math>
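The rename rules of this section can be checked with a short Python sketch. It assumes that <math>\rho_{a / b}</math> renames attribute ''b'' to ''a'', stores each tuple as a <code>frozenset</code> of attribute/value pairs, and uses the hypothetical helpers <code>row</code> and <code>rename</code>.

```python
def row(**kv):
    """A tuple in the named perspective: a frozenset of (attribute, value)."""
    return frozenset(kv.items())

def rename(new, old, r):
    """ρ_{new/old}(r): rename attribute old to new in every tuple."""
    return {frozenset((new if k == old else k, v) for k, v in t) for t in r}

R = {row(c=1, d=2), row(c=3, d=4)}
P = {row(c=3, d=4), row(c=5, d=6)}

# Successive renames collapse: ρ_{a/b}(ρ_{b/c}(R)) = ρ_{a/c}(R)
collapsed = rename("a", "b", rename("b", "c", R))
direct = rename("a", "c", R)

# Renames with no attributes in common can be reordered.
order_1 = rename("a", "c", rename("e", "d", R))
order_2 = rename("e", "d", rename("a", "c", R))

# Rename distributes over set difference, union and intersection.
diff_ok = rename("a", "c", R - P) == rename("a", "c", R) - rename("a", "c", P)
union_ok = rename("a", "c", R | P) == rename("a", "c", R) | rename("a", "c", P)
inter_ok = rename("a", "c", R & P) == rename("a", "c", R) & rename("a", "c", P)
```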
| |
| | |
| == Implementations ==
| |
| | |
| The first query language to be based on Codd's algebra was [[ISBL]], and this pioneering work has been acclaimed by many authorities as having shown the way to make Codd's idea into a useful language. [[Business System 12]] was a short-lived industry-strength relational DBMS that followed the ISBL example.
| |
| | |
| In 1998 [[Christopher J. Date|Chris Date]] and [[Hugh Darwen]] proposed a language called '''Tutorial D''' intended for use in teaching relational database theory, and its query language also draws on ISBL's ideas. [[Rel (DBMS)|Rel]] is an implementation of '''Tutorial D'''.
| |
| | |
| Even the query language of [[SQL]] is loosely based on a relational algebra, though the operands in SQL ([[table (database)|table]]s) are not exactly [[relation (database)|relation]]s and several useful theorems about the relational algebra do not hold in the SQL counterpart (arguably to the detriment of optimisers and/or users). The SQL table model is a bag ([[multiset]]), rather than a set. For example, the expression (R ∪ S) − T = (R − T) ∪ (S − T) is a theorem for relational algebra on sets, but not for relational algebra on bags; for a treatment of relational algebra on bags see chapter 5 of the "Complete" textbook by [[Garcia-Molina]], [[Jeffrey Ullman|Ullman]] and [[Jennifer Widom|Widom]].<ref name="Garcia-MolinaUllman2009">{{cite book|author1=Hector Garcia-Molina|author2=Jeffrey D. Ullman|author3=Jennifer Widom|title=Database systems: the complete book|year=2009|publisher=Pearson Prentice Hall|isbn=978-0-13-187325-4|edition=2nd}}</ref>
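The failure of this identity on bags can be seen with a one-tuple example. The Python sketch below models bags with <code>collections.Counter</code>, assuming that bag union adds multiplicities and bag difference subtracts them (never going below zero); the single tuple <code>"a"</code> is made up for illustration.

```python
from collections import Counter

# Each of R, S and T contains the tuple "a" exactly once.
R, S, T = Counter(["a"]), Counter(["a"]), Counter(["a"])

# Bag semantics: (R ∪ S) − T keeps one copy of "a" ...
bag_lhs = (R + S) - T
# ... while (R − T) ∪ (S − T) is empty.
bag_rhs = (R - T) + (S - T)

# Set semantics: both sides are empty, so the identity holds.
set_lhs = (set(R) | set(S)) - set(T)
set_rhs = (set(R) - set(T)) | (set(S) - set(T))
```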
| |
| | |
| == See also ==
| |
| <div style="-moz-column-count:3; column-count:3;">
| |
| * [[Cartesian product]]
| |
| * [[D (data language specification)]]
| |
| * [[D4 (programming language)]] (an implementation of D)
| |
| * [[Database]]
| |
| * [[Logic of relatives]]
| |
| * [[Object role modeling]]
| |
| * [[Projection (mathematics)]]
| |
| * [[Projection (relational algebra)]]
| |
| * [[Projection (set theory)]]
| |
| * [[Relation (mathematics)|Relation]]
| |
| * [[Relation (database)]]
| |
| * [[Relation algebra]]
| |
| * [[Relation composition]]
| |
| * [[Relation construction]]
| |
| * [[Relational calculus]]
| |
| * [[Relational database]]
| |
| * [[Relational model]]
| |
| * [[Theory of relations]]
| |
| * [[Triadic relation]]
| |
| * [[Tutorial D]]
| |
| </div>
| |
| | |
| == References ==
| |
| {{reflist}}
| |
| | |
| == Further reading ==
| |
| Practically any academic textbook on databases has a detailed treatment of the classic relational algebra.
| |
| * {{cite doi|10.1016/0022-0000(84)90077-1}} (For relationship with [[cylindric algebra]]s).
| |
| | |
| == External links ==
| |
| *[http://www.slinfo.una.ac.cr/rat/rat.html RAT. Software Relational Algebra Translator to SQL]
| |
| *[http://www.databasteknik.se/webbkursen/relalg-lecture/index.html Lecture Notes: Relational Algebra] – A quick tutorial to adapt SQL queries into relational algebra
| |
| * [http://leap.sourceforge.net LEAP – An implementation of the relational algebra]
| |
| * [http://galileo.dmi.unict.it/wiki/relational/ Relational – A graphic implementation of the relational algebra]
| |
| * [http://www-db.stanford.edu/~widom/cs346/ioannidis.pdf Query Optimization] This paper is an introduction into the use of the relational algebra in optimizing queries, and includes numerous citations for more in-depth study.
| |
| * [http://bandilab.org/bandicoot-algebra.pdf bandilab.org – neat graphical illustrations of the relational operators]
| |
| * [http://www.cse.fau.edu/~marty#RADownload Relational Algebra System for Oracle and Microsoft SQL Server]
| |
| | |
| {{Databases}}
| |
| | |
| {{DEFAULTSORT:Relational Algebra}}
| |
| [[Category:Relational model]]
| |