MyPage is a personalized page based on your interests.The page is customized to help you to find content that matters you the most.


I'm not curious

Simplifies SQL-style Computations - In-group Computation

Published on 06 November 14
364
0
0

During developing the database applications, we often need to perform computations on the grouped data in each group. For example, list the names of the students who have published papers in each of the past three years; make statistics of the employees who have taken part in all previous training; select the top three days when each client gets the highest scores in a golf game; and the like. To perform these computations, SQL needs multi-layered nests, which will make the code difficult to understand and maintain. By contrast, esProc is better at handling this kind of in-group computation, as well as easy to integrate with Java and the reporting tool. We'll illustrate this through an example.

According to the database table SaleData, select the clients whose sales amount of each month in the year 2013 is always in the top 20. Part of the data of SalesData is as follows:
Simplifies SQL-style Computations - In-group Computation - Image 1

To complete the task, first select the sales data of the year of 2013, and then group the data by the month and, in each group, select the clients whose monthly sales amount is in the top 20. Finally, compute the intersection of these groups.

With esProc we can split this complicated problem into several steps and then get the final result. First, retrieve the data of 2013 from SaleData and group it by the month:
Simplifies SQL-style Computations - In-group Computation - Image 2

Note: The code for filtering in A2 can also be written in SQL.

It is the real grouping that esProc separates data into multiple groups. This is different from the case in SQL, whose group by command will compute the summary value of each group directly and won't keep the intermediate results of the grouping. After grouping, the data in A3 are as follows:
Simplifies SQL-style Computations - In-group Computation - Image 3
esProc will sorts the data automatically before grouping. Each group is a set of sales data. The data of March, for example, are as follows:
Simplifies SQL-style Computations - In-group Computation - Image 4

In order to compute every client's sales amount of each month, we need to group the data a second time by clients. In esProc, we just need to perform this step by looping the data of each month and group it respectively. A.(x) can be used to execute the loop on members of a certain group, with no necessity for loop code.

A4�=A3.(~group(Client))

In A4, the data of each month constitute a subgroup of each previous group after the second grouping:
Simplifies SQL-style Computations - In-group Computation - Image 5
At this point, the data of March are as follows:
Simplifies SQL-style Computations - In-group Computation - Image 6

It can be seen that each group of data in March contains the sales data of a certain client.

Please note ~ in the above code represents each member of the group, and the code written with ~ is called in-group computation code, like the above-mentioned ~.group(Client).

Next, select the clients whose rankings of each month are in the top 20 through the in-group computation:

A5�=A4.(~.top(-sum(Amount);20))

A6�=A5.(~.new(Client,sum(Amount):MonthAmount))

A5 computes the top 20 clients of each month in sales amount by looping each month's data. A6 lists the clients and their sales amount every month. The result of A6 is as follows:
Simplifies SQL-style Computations - In-group Computation - Image 7

Finally, list the field Client of each subgroup and compute the intersection of the subgroups:

A7�=A6.(~.(Client))

A8�=A7.isect()

A7 computes the top 20 clients of each month in sales amount. A8 computes the intersection of the field Clients of the twelve months. The result is as follows:
Simplifies SQL-style Computations - In-group Computation - Image 8

As can be seen from this problem, esProc can easily realize the in-group computation, including the second group and sort, on the structured data, make the solving way more visually, and display a clear and smooth data processing in each step. Moreover, the operations, like looping members of a group or computing intersection, become easier in esProc, which will reduce the amount of code significantly.

The method with which a Java program calls esProc is similar to that with which it calls an ordinary database. The JDBC provided by esProc can be used to return a computed result of the form of ResultSet to Java main program. For more details, please refer to the related documents
During developing the database applications, we often need to perform computations on the grouped data in each group. For example, list the names of the students who have published papers in each of the past three years; make statistics of the employees who have taken part in all previous training; select the top three days when each client gets the highest scores in a golf game; and the like. To perform these computations, SQL needs multi-layered nests, which will make the code difficult to understand and maintain. By contrast, esProc is better at handling this kind of in-group computation, as well as easy to integrate with Java and the reporting tool. We'll illustrate this through an example.

According to the database table SaleData, select the clients whose sales amount of each month in the year 2013 is always in the top 20. Part of the data of SalesData is as follows:

Simplifies SQL-style Computations - In-group Computation - Image 1

To complete the task, first select the sales data of the year of 2013, and then group the data by the month and, in each group, select the clients whose monthly sales amount is in the top 20. Finally, compute the intersection of these groups.

With esProc we can split this complicated problem into several steps and then get the final result. First, retrieve the data of 2013 from SaleData and group it by the month:

Simplifies SQL-style Computations - In-group Computation - Image 2

Note: The code for filtering in A2 can also be written in SQL.

It is the real grouping that esProc separates data into multiple groups. This is different from the case in SQL, whose group by command will compute the summary value of each group directly and won't keep the intermediate results of the grouping. After grouping, the data in A3 are as follows:

Simplifies SQL-style Computations - In-group Computation - Image 3

esProc will sorts the data automatically before grouping. Each group is a set of sales data. The data of March, for example, are as follows:

Simplifies SQL-style Computations - In-group Computation - Image 4

In order to compute every client's sales amount of each month, we need to group the data a second time by clients. In esProc, we just need to perform this step by looping the data of each month and group it respectively. A.(x) can be used to execute the loop on members of a certain group, with no necessity for loop code.

A4�=A3.(~group(Client))

In A4, the data of each month constitute a subgroup of each previous group after the second grouping:

Simplifies SQL-style Computations - In-group Computation - Image 5

At this point, the data of March are as follows:

Simplifies SQL-style Computations - In-group Computation - Image 6

It can be seen that each group of data in March contains the sales data of a certain client.

Please note ~ in the above code represents each member of the group, and the code written with ~ is called in-group computation code, like the above-mentioned ~.group(Client).

Next, select the clients whose rankings of each month are in the top 20 through the in-group computation:

A5�=A4.(~.top(-sum(Amount);20))

A6�=A5.(~.new(Client,sum(Amount):MonthAmount))

A5 computes the top 20 clients of each month in sales amount by looping each month's data. A6 lists the clients and their sales amount every month. The result of A6 is as follows:

Simplifies SQL-style Computations - In-group Computation - Image 7

Finally, list the field Client of each subgroup and compute the intersection of the subgroups:

A7�=A6.(~.(Client))

A8�=A7.isect()

A7 computes the top 20 clients of each month in sales amount. A8 computes the intersection of the field Clients of the twelve months. The result is as follows:

Simplifies SQL-style Computations - In-group Computation - Image 8

As can be seen from this problem, esProc can easily realize the in-group computation, including the second group and sort, on the structured data, make the solving way more visually, and display a clear and smooth data processing in each step. Moreover, the operations, like looping members of a group or computing intersection, become easier in esProc, which will reduce the amount of code significantly.

The method with which a Java program calls esProc is similar to that with which it calls an ordinary database. The JDBC provided by esProc can be used to return a computed result of the form of ResultSet to Java main program. For more details, please refer to the related documents

This blog is listed under Development & Implementations and Data & Information Management Community

Post a Comment

Please notify me the replies via email.

Important:
  • We hope the conversations that take place on MyTechLogy.com will be constructive and thought-provoking.
  • To ensure the quality of the discussion, our moderators may review/edit the comments for clarity and relevance.
  • Comments that are promotional, mean-spirited, or off-topic may be deleted per the moderators' judgment.
You may also be interested in
Awards & Accolades for MyTechLogy
Winner of
REDHERRING
Top 100 Asia
Finalist at SiTF Awards 2014 under the category Best Social & Community Product
Finalist at HR Vendor of the Year 2015 Awards under the category Best Learning Management System
Finalist at HR Vendor of the Year 2015 Awards under the category Best Talent Management Software
Hidden Image Url

Back to Top