Accessing Data : Joining data sets
 
Joining data sets
The capability to join data sets is a useful and easy way to combine data from two data sources. For example, you can combine data from two XML files, or combine data from a text file with data from a database table. Before joining data sets, you must create the individual data sets. For example, to combine data from an XML file with data from a text file, you must first create the XML data set and the text file data set.
Joining data sets is similar to joining tables in a database, described earlier in “Combining data from multiple tables,” but with the following two limitations:
*You can join only two data sets. In a database, you can join more than two tables.
*You can create only four types of joins: inner, left outer, right outer, and full outer.
The four types of joins you can use to join data sets yield the same results as the similarly-named database joins. The following list summarizes the function of each join type:
*Inner join returns rows from both data sets where the key values match.
*Left outer join returns all rows from the first data set, even if there are no matches in the second data set.
*Right outer join returns all rows from the second data set, even if there are no matches in the first data set.
*Full outer join returns all rows from both data sets, even if there are no matches in either data set.
Like the database joins, you must specify a column on which to join the two data sets. Joining two data sets creates a BIRT object called a joint data set. Just as you can with a regular data set, you can add computed columns and filters to a joint data set, and preview the results it returns. Once you understand the concepts of joining data sets, you can be creative about combining data from more than two sources, assuming that the data from the various sources relate in some way.
Although each joint data set can join only two data sets, you can use a joint data set as one or both of those data sets. For example, you can create joint data set A and joint data set B, then join both of them. Doing so, in effect, combines data from four data sets. Figure 4‑84 illustrates this concept.
As Figure 4‑84 also shows, each data set can return data from different types of sources. You could also use joint data sets to join multiple tables in a single database. For performance reasons, however, this technique is not recommended. Where possible, you should always join multiple tables through the SQL SELECT statement, as described earlier in this chapter.
You should create joint data sets only to:
*Combine data from disparate data sources.
*Combine data from non-relational data sources, such as XML or text files.