MyPage is a personalized page based on your interests.The page is customized to help you to find content that matters you the most.


I'm not curious

Simplifies SQL-style Computations -Relational Computation on Groups

Published on 05 November 14
486
0
0

During developing database applications, we are often faced with complicated SQL-style computations like relational computation on multilayered groups. As SQL requires summarizing during data grouping and it doesnât support object-style relational access, it is difficult to deal with these computations. We have to resort to an advanced technique of window functions nested with multilayered subqueries. esProc, however, can handle them more easily by supporting real grouping and visual object-style relational access.

In practice, there are a lot of cases in which relational computation on groups is needed, such as the one presented on the website http://forums.bit-tech.net/showthread.php?t=207052. Based on this practical example, we have designed a more common one for illustrating in detail how esProc works to realize the relational computation on groups.

Computing target: Query branches of a DVD store where there are less than four categories of DVD copies.

Data Structureï¼Branch Table stores information of the branch stores. DVD Table stores titles and categories of DVDs, in which DVDs, like Transformers IV, are virtual data items instead of physical disks. DVDCopy Table stores physical DVD copies branch stores have. Note: DVDCopy Table is related to Branch Table through BranchID field, and to DVD Table through DVDID field. The following is part of the data:

Branch Tableï¼
Simplifies SQL-style Computations -Relational Computation on Groups - Image 1
DVD Tableï¼
Simplifies SQL-style Computations -Relational Computation on Groups - Image 2
DVDCopy Tableï¼
Simplifies SQL-style Computations -Relational Computation on Groups - Image 3

Descriptionï¼

1. Computed results should be certain records in Branch Table.

2. That the Status field of a record in DVDCopy shows Miss means the DVD is missing, and that a recordâs LastDateReturned field is empty means the DVD has been rented out but not returned. Obviously the DVDs that are missing and havenât been returned are outside of our computing objects and should be filtered away.

3. We should consider the situation that there may be certain branch stores whose information isnât included in DVDCopy Table, though it is rarely seen.

Solution:

1. Select valid, existing DVD copies the branch stores have from DVDCopy Table.

2. Group DVDCopy Table by BID. Each group will contain all DVD copies a branch store has.

3. Select the DVDs corresponding to the DVD copies each branch store has, and compute the number of categories to which these DVDs belong.

4. Select branch stores where the number of categories of existing DVDs is less than four. These branch stores are eligible.

5. Select branch stores which DVDCopy Table hasnât. They are also eligible.

6. Combine the two kinds of eligible branch stores.

Code written in esProc:
Simplifies SQL-style Computations -Relational Computation on Groups - Image 4
A1-A3ï¼Query data from three tables in the database. The three tables are made variables which are named respectively as Branch, DVD and DVDCopy. Computed results are as follows:
Simplifies SQL-style Computations -Relational Computation on Groups - Image 5
A4ï¼ Switch the DVDID field and BID field in DVDCopy Table to corresponding records in DVD Table and Branch Table respectively. Note: This step is the basis of object-style relational access, which requires the use of switch function. After computing, DVDCopy becomes as follows:
Simplifies SQL-style Computations -Relational Computation on Groups - Image 6
Fields in blue have corresponding records. Click one and you can see the details, as shown in the following figure:
Simplifies SQL-style Computations -Relational Computation on Groups - Image 7

Now we can perform object-style relational access only with the operator â.â. For instance, DVDCopy.(DVDID). (Category) represents the category of each DVD copy, and DVDCopy.(BID) gets the detailed information (complete record) about the branch store corresponding to each DVD copy.

A5ï¼=DVDCopy.select(Statues!="Miss" && LastDateReturend!=null). This line of code aims to filter away the DVD copies that are missing and that havenât been returned. After the code is executed, values of A5 are shown as follows:
Simplifies SQL-style Computations -Relational Computation on Groups - Image 8
A6ï¼=A5.group(BID) is to group data in A5 by BID, with each row representing all DVD copies a branch store has. The result is as follows:
Simplifies SQL-style Computations -Relational Computation on Groups - Image 9
Click the data in blue and youâll see members of each group:
Simplifies SQL-style Computations -Relational Computation on Groups - Image 10

It can be seen that group function only groups data, but doesnât summarize the data at the same time. In this point, it is different from the function for grouping in SQL. Sometimes, we need to further process the grouped data, rather than simply summarizing them. To do this, esProcâs group function is more convenient to use, as shown in the code in A7:

A7ï¼=A6.new( ~.BID:BonList, ~.(DVDID).id(Category).count():CatCount )

The above line of code computes the number of categories of DVD copies to which each branch store corresponds. new function can generate a new object A7 based on the data in A6. A7 has two columns: BonList and CatCount. BonList originates directly from column BID of the grouped data in A6, and CatCount originates from column DVDID of the grouped data. There are three steps to compute CatCount: ~.(DVDID) finds the DVD records corresponding to all DVD copies each branch store has; id(Category) removes repeated records of Category from these DVD records; and count() computes the number of categories. The result is as follows:
Simplifies SQL-style Computations -Relational Computation on Groups - Image 11

That is, branch store B002 has three categories of DVD copies, B003 also has three categories and B001 has four categories.

A8ï¼A7.select(CatCount<4).This line of code makes query to select branch stores whose CatCount is less than 4. The result is as follows:
Simplifies SQL-style Computations -Relational Computation on Groups - Image 12

The above branch stores that are in short supply are computed according to DVDCopy Table. But maybe some branch stores with serious supply shortage are not in the DVDCopy, such as the cases that all the DVD copies in the branch store has been rented out, or that the branch store hasnât any DVD copies. So these branch stores should also be counted. The code for this step is as follows:

A9ï¼=A8.(BonList) | (Branch \ A7.(BonList))

In the above code, operator â|â represents the union operation of two data sets (which can be replaced by union function); operator â\â represents the complement operation (which can be replaced by diff function). A8.(BonList), Branch and A7.(BonList) represent respectively branch stores with supply shortage in DVDCopy Table, all branch stores and branch stores which are included in DVDCopy Table. Their respective values are:
Simplifies SQL-style Computations -Relational Computation on Groups - Image 13
A9 computes the final result of this example. Its values are:
Simplifies SQL-style Computations -Relational Computation on Groups - Image 14

Or we can realize the computation indirectly. For instance, first compute âbranch stores that are not in short supplyâ, and then compute the complement of the result and Branch Table. The final result should be the same as that of A9.

Please note variables like A8 or Branch cannot be used to represent a data set in SQL because it doesnât support explicit set. Thus the simple code in the above have to be replaced by lines of complicated SQL statements.

Besides, esProc can be called by a Java program. The method is similar to that with which a Java program calls an ordinary database. The JDBC provided by esProc can be used to return a computed result of the form of ResultSet to Java main program. For more details, please refer to the related documents
During developing database applications, we are often faced with complicated SQL-style computations like relational computation on multilayered groups. As SQL requires summarizing during data grouping and it doesnât support object-style relational access, it is difficult to deal with these computations. We have to resort to an advanced technique of window functions nested with multilayered subqueries. esProc, however, can handle them more easily by supporting real grouping and visual object-style relational access.

In practice, there are a lot of cases in which relational computation on groups is needed, such as the one presented on the website http://forums.bit-tech.net/showthread.php?t=207052. Based on this practical example, we have designed a more common one for illustrating in detail how esProc works to realize the relational computation on groups.

Computing target: Query branches of a DVD store where there are less than four categories of DVD copies.

Data Structureï¼Branch Table stores information of the branch stores. DVD Table stores titles and categories of DVDs, in which DVDs, like Transformers IV, are virtual data items instead of physical disks. DVDCopy Table stores physical DVD copies branch stores have. Note: DVDCopy Table is related to Branch Table through BranchID field, and to DVD Table through DVDID field. The following is part of the data:

Branch Tableï¼

Simplifies SQL-style Computations -Relational Computation on Groups - Image 1

DVD Tableï¼

Simplifies SQL-style Computations -Relational Computation on Groups - Image 2

DVDCopy Tableï¼

Simplifies SQL-style Computations -Relational Computation on Groups - Image 3

Descriptionï¼

1. Computed results should be certain records in Branch Table.

2. That the Status field of a record in DVDCopy shows Miss means the DVD is missing, and that a recordâs LastDateReturned field is empty means the DVD has been rented out but not returned. Obviously the DVDs that are missing and havenât been returned are outside of our computing objects and should be filtered away.

3. We should consider the situation that there may be certain branch stores whose information isnât included in DVDCopy Table, though it is rarely seen.

Solution:

1. Select valid, existing DVD copies the branch stores have from DVDCopy Table.

2. Group DVDCopy Table by BID. Each group will contain all DVD copies a branch store has.

3. Select the DVDs corresponding to the DVD copies each branch store has, and compute the number of categories to which these DVDs belong.

4. Select branch stores where the number of categories of existing DVDs is less than four. These branch stores are eligible.

5. Select branch stores which DVDCopy Table hasnât. They are also eligible.

6. Combine the two kinds of eligible branch stores.

Code written in esProc:

Simplifies SQL-style Computations -Relational Computation on Groups - Image 4

A1-A3ï¼Query data from three tables in the database. The three tables are made variables which are named respectively as Branch, DVD and DVDCopy. Computed results are as follows:

Simplifies SQL-style Computations -Relational Computation on Groups - Image 5

A4ï¼ Switch the DVDID field and BID field in DVDCopy Table to corresponding records in DVD Table and Branch Table respectively. Note: This step is the basis of object-style relational access, which requires the use of switch function. After computing, DVDCopy becomes as follows:

Simplifies SQL-style Computations -Relational Computation on Groups - Image 6

Fields in blue have corresponding records. Click one and you can see the details, as shown in the following figure:

Simplifies SQL-style Computations -Relational Computation on Groups - Image 7

Now we can perform object-style relational access only with the operator â.â. For instance, DVDCopy.(DVDID). (Category) represents the category of each DVD copy, and DVDCopy.(BID) gets the detailed information (complete record) about the branch store corresponding to each DVD copy.

A5ï¼=DVDCopy.select(Statues!="Miss" && LastDateReturend!=null). This line of code aims to filter away the DVD copies that are missing and that havenât been returned. After the code is executed, values of A5 are shown as follows:

Simplifies SQL-style Computations -Relational Computation on Groups - Image 8

A6ï¼=A5.group(BID) is to group data in A5 by BID, with each row representing all DVD copies a branch store has. The result is as follows:

Simplifies SQL-style Computations -Relational Computation on Groups - Image 9

Click the data in blue and youâll see members of each group:

Simplifies SQL-style Computations -Relational Computation on Groups - Image 10

It can be seen that group function only groups data, but doesnât summarize the data at the same time. In this point, it is different from the function for grouping in SQL. Sometimes, we need to further process the grouped data, rather than simply summarizing them. To do this, esProcâs group function is more convenient to use, as shown in the code in A7:

A7ï¼=A6.new( ~.BID:BonList, ~.(DVDID).id(Category).count():CatCount )

The above line of code computes the number of categories of DVD copies to which each branch store corresponds. new function can generate a new object A7 based on the data in A6. A7 has two columns: BonList and CatCount. BonList originates directly from column BID of the grouped data in A6, and CatCount originates from column DVDID of the grouped data. There are three steps to compute CatCount: ~.(DVDID) finds the DVD records corresponding to all DVD copies each branch store has; id(Category) removes repeated records of Category from these DVD records; and count() computes the number of categories. The result is as follows:

Simplifies SQL-style Computations -Relational Computation on Groups - Image 11

That is, branch store B002 has three categories of DVD copies, B003 also has three categories and B001 has four categories.

A8ï¼A7.select(CatCount.This line of code makes query to select branch stores whose CatCount is less than 4. The result is as follows:

Simplifies SQL-style Computations -Relational Computation on Groups - Image 12

The above branch stores that are in short supply are computed according to DVDCopy Table. But maybe some branch stores with serious supply shortage are not in the DVDCopy, such as the cases that all the DVD copies in the branch store has been rented out, or that the branch store hasnât any DVD copies. So these branch stores should also be counted. The code for this step is as follows:

A9ï¼=A8.(BonList) | (Branch \ A7.(BonList))

In the above code, operator â|â represents the union operation of two data sets (which can be replaced by union function); operator â\â represents the complement operation (which can be replaced by diff function). A8.(BonList), Branch and A7.(BonList) represent respectively branch stores with supply shortage in DVDCopy Table, all branch stores and branch stores which are included in DVDCopy Table. Their respective values are:

Simplifies SQL-style Computations -Relational Computation on Groups - Image 13

A9 computes the final result of this example. Its values are:

Simplifies SQL-style Computations -Relational Computation on Groups - Image 14

Or we can realize the computation indirectly. For instance, first compute âbranch stores that are not in short supplyâ, and then compute the complement of the result and Branch Table. The final result should be the same as that of A9.

Please note variables like A8 or Branch cannot be used to represent a data set in SQL because it doesnât support explicit set. Thus the simple code in the above have to be replaced by lines of complicated SQL statements.

Besides, esProc can be called by a Java program. The method is similar to that with which a Java program calls an ordinary database. The JDBC provided by esProc can be used to return a computed result of the form of ResultSet to Java main program. For more details, please refer to the related documents

This blog is listed under Development & Implementations and Data & Information Management Community

Post a Comment

Please notify me the replies via email.

Important:
  • We hope the conversations that take place on MyTechLogy.com will be constructive and thought-provoking.
  • To ensure the quality of the discussion, our moderators may review/edit the comments for clarity and relevance.
  • Comments that are promotional, mean-spirited, or off-topic may be deleted per the moderators' judgment.
You may also be interested in
Awards & Accolades for MyTechLogy
Winner of
REDHERRING
Top 100 Asia
Finalist at SiTF Awards 2014 under the category Best Social & Community Product
Finalist at HR Vendor of the Year 2015 Awards under the category Best Learning Management System
Finalist at HR Vendor of the Year 2015 Awards under the category Best Talent Management Software
Hidden Image Url

Back to Top