Applies To:
- ReportMiner Enterprise and Express (all versions)
- Centerprise Enterprise and Express (all versions)
Overview
A data region is a block of data that makes up the body of your source report within a Report Model. Setting up a data region is one of the first steps in mining data. Data regions are the backbone of Report Models: they tell ReportMiner where in the file to extract data from. Once you create a data region, you can add data fields to it, which specify exactly which data you wish to extract. Should you export your report document to Excel, these are the values that will be listed in their own cells. For more on data fields, see the data fields article.
Data regions are highlighted once selected. In the Default mode, they are gray. There are four main types of data regions: standard regions, header regions, footer regions, and append regions.
Pattern Matching
The program makes it possible to quickly create data regions using the pattern matching feature. A data region is not complete without matching a pattern to show which data needs to be included in that region. The orange line with blue dots above the report document is called the pattern matching line. Characters can be typed in here to select lines that either contain the same character (such as a specific letter or number), the same kind of character (see below), or the same phrase.
See the Properties section of this document for a breakdown of the pattern matching properties menu.
A row of pattern matching buttons sits above the report model. To use one, click the pattern matching line above the position in your report that the pattern applies to, and then click the relevant button.
They are as follows:
- “Match any alphabet” button: stands for any letter within your document.
- “Match any digit” button: similar to the “Match any alphabet” button, but represents any digit. For instance, to select lines that contain phone numbers, you would type “ÑÑÑ-ÑÑÑ-ÑÑÑÑ” into the pattern matching line (without the quotation marks).
- “Match any alphabet or digit” button: represents any letter or digit.
- “Match any non-blank character” button: represents any character that is not a tab, space, or other “blank” character, including the letters and digits covered by the buttons above.
- “Match any blank character” button: stands for any "invisible" character, such as a tab, return/enter, or space.
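As a rough illustration of how these character classes behave, the Python sketch below checks a line against a pattern placed at a given column. It is not ReportMiner's implementation: `WILDCARDS` and `line_matches` are hypothetical names, and only the Ñ (digit) and Ó (letter) symbols are taken from this article.

```python
# Illustrative only: maps a wildcard symbol to a test on a single character.
# "Ó" (any letter) and "Ñ" (any digit) are the symbols shown in this article.
WILDCARDS = {
    "Ó": str.isalpha,
    "Ñ": str.isdigit,
}


def line_matches(line, pattern, start=0):
    """Return True if `pattern` matches `line` beginning at column `start`."""
    segment = line[start:start + len(pattern)]
    if len(segment) < len(pattern):
        return False  # the line is too short to hold the whole pattern
    for char, symbol in zip(segment, pattern):
        test = WILDCARDS.get(symbol)
        if test is not None:
            if not test(char):
                return False  # wildcard class not satisfied
        elif char != symbol:
            return False  # literal character differs
    return True


print(line_matches("Call 555-123-4567", "ÑÑÑ-ÑÑÑ-ÑÑÑÑ", start=5))  # True
print(line_matches("Call 555-12A-4567", "ÑÑÑ-ÑÑÑ-ÑÑÑÑ", start=5))  # False
```

The remaining three buttons could be modeled the same way, for example with `str.isalnum` for letters or digits and `str.isspace` for blank characters.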
Example
In the report model below, the user needs to create a data region to extract all lines of data underneath “Book Title.” There are many ways to go about this; this example covers two. The data region has already been created here. To see how to create a data region from scratch, see Adding a Data Region below.
Each line has a date, and in each date there is a “/” character.
When you type “/” in the pattern matching line above where it appears in the lines, the program selects every line that has that character in that position.
Another way to select all the necessary information would be to use the “Ó” button, as shown below.
As all of the names begin at the same position in the column, the “Ó” button can be used to match any alphabetical character. All of our data is selected, and our data region has been created.
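The first approach above, matching a literal character at a fixed position, can be pictured with a short Python sketch. The report lines, the column number, and the `select_lines` helper are made up for illustration and do not come from the report shown in this article.

```python
# Hypothetical sample report: titles padded to 22 characters, then a date.
rows = [("Moby Dick", "03/14/2016"), ("Dracula", "04/02/2016")]
report = [f"{'Book Title':<22}Due Date"]
report += [f"{title:<22}{date}" for title, date in rows]
report.append("Total books: 2")


def select_lines(lines, char, column):
    """Return the lines that have `char` at index `column`."""
    return [line for line in lines if len(line) > column and line[column] == char]


# The first "/" of each date sits at index 24 (22 columns of title + "03").
selected = select_lines(report, "/", 24)
for line in selected:
    print(line)
```

Only the two data lines are selected: the heading has a letter at that position, and the summary line is too short to reach it.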
Data Region Properties
To configure the properties of a data region, right-click it and select Region Properties... from the context menu. The following properties are available:
Region Name is the name of your data region. If you do not change the name, it defaults to "Data." We recommend renaming your data region to avoid confusion.
Region Details lets you further customize your data region. The Region Details menu contains:
Region End Type changes how ReportMiner defines where your region ends.
- Line Count ends your region after a specified number of lines.
- Blank Line ends your region whenever a blank line is encountered.
- Last Field ends your data region at the last data field within it.
- Another Region Starts ends your data region where another begins. Use this for variable-length data regions; the region ends automatically when another one starts.
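To make the difference between the end types concrete, here is a minimal Python sketch that computes where a region would stop under three of them. The function name `region_end`, its parameters, and the sample lines are assumptions for illustration, not ReportMiner's API; Last Field is left out because it depends on the fields defined in the region.

```python
def region_end(lines, start, end_type, line_count=1, starts_other_region=None):
    """Return the index just past the last line of a region starting at `start`."""
    if end_type == "line_count":
        # Fixed height: stop after `line_count` lines (or at the end of the file).
        return min(start + line_count, len(lines))
    if end_type == "blank_line":
        def stop(line):
            return not line.strip()
    elif end_type == "another_region_starts":
        if starts_other_region is None:
            raise ValueError("starts_other_region is required for this end type")
        stop = starts_other_region
    else:
        raise ValueError(f"unsupported end type: {end_type!r}")
    # Variable height: scan forward until the stop condition is met.
    for i in range(start + 1, len(lines)):
        if stop(lines[i]):
            return i
    return len(lines)


lines = ["Order 1", "item a", "item b", "", "Order 2", "item c"]
print(region_end(lines, 0, "line_count", line_count=2))
print(region_end(lines, 0, "blank_line"))
print(region_end(lines, 0, "another_region_starts",
                 starts_other_region=lambda line: line.startswith("Order")))
```

With these sample lines, Line Count stops after two lines, Blank Line stops at the empty line, and Another Region Starts runs until the next line beginning with "Order".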
There is also a Region Properties panel directly above the report model.
Line Count is used to adjust how many lines long a data region is when in Line Count mode.
Pattern Count adjusts the number of patterns that ReportMiner matches on your data region. This is helpful if more than one pattern can be used to identify the beginning of your data region. Up to five patterns can be specified at a time.
Single or Multi-Column is an option that allows ReportMiner to scan multiple columns of a document, for example when a report is formatted in three newspaper-style columns.
If the "…" is clicked, the following properties window opens:
Starting Points indicate where your columns begin. To find the coordinates, click the point in the document where you want your data region to start. As shown below, the bottom right corner of the report model editor shows the coordinates for that point.
Automatically Calculate Columns is also within the Multi-Column Region Properties. This finds where the columns begin and end without needing a set of starting points.
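Conceptually, starting points tell the tool where to cut each line so that newspaper-style columns can be read one after another. The Python sketch below shows that idea only; `read_columns` and the sample page are hypothetical, and the way ReportMiner actually processes columns may differ.

```python
def read_columns(lines, starts):
    """Read a multi-column page column by column.

    `starts` holds the character index where each column begins.
    """
    bounds = list(starts) + [None]  # None means "to the end of the line"
    cells = []
    for left, right in zip(bounds, bounds[1:]):
        for line in lines:
            cell = line[left:right].rstrip()
            if cell:
                cells.append(cell)
    return cells


# Hypothetical three-column page, each column 12 characters wide.
rows = [("Apple", "Cherry", "Fig"), ("Banana", "Date", "Grape")]
page = [f"{a:<12}{b:<12}{c}" for a, b, c in rows]

print(read_columns(page, [0, 12, 24]))
```

The result lists the first column top to bottom, then the second, then the third, which is the reading order a reader would follow on a newspaper-style page.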
Delete Region lets you delete a data region.
Add Data Region lets you add a data region.
Append Data Region adds a region that captures part of a report that would otherwise be left out of the data region. For instance, in the report below, the group totals aren’t captured in the main data region.
In order to ensure that the totals are included, an append region needs to be added.
Now, the totals will be present in the exported report.
Example
Adding a Data Region
To add a data region to your report model, right-click the Record node and select "Add Data Region."
In order for ReportMiner to select a specific area to act as a data region, a pattern needs to be applied. Type the applicable phrase or select the correct "match any" button. For more on pattern matching, see the Pattern Matching section above.
Your data region is now complete and should be highlighted.
Adding a Header Region
Some files, such as medical reports and order receipts, contain headers. This text area repeats at the top of every page in the report, and can be extracted as a different region. In this case, our document's header contains the title, “PATIENT INFORMATION,” the hospital’s name, and the hospital’s address. Like the data captured by an append region, this might not be captured in a regular data region. Therefore, a header region is needed.
To create a header region, right-click the Record node and select "Add Header Region."
Since our header repeats, the easiest way to select the header region is to type a word or phrase as your method of pattern matching. As every page on this report starts with a header that says, "PATIENT INFORMATION," you can use that phrase to select the first line of the header in the pattern matching line.
If your headers are not perfectly aligned on your report, be sure to click the "Floating Pattern" box.
To change how many lines tall your header region is, adjust the Line Count field.
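The difference between a fixed-position match and a floating one can be sketched in Python as follows. This assumes that "Floating Pattern" means the phrase may appear anywhere on the line rather than only at the column where it was typed; the `phrase_matches` helper is hypothetical.

```python
def phrase_matches(line, phrase, column=0, floating=False):
    """Check whether `phrase` identifies `line` as the start of a region."""
    if floating:
        # Floating: the phrase may appear at any position on the line.
        return phrase in line
    # Fixed: the phrase must begin exactly at `column`.
    return line.startswith(phrase, column)


aligned = "PATIENT INFORMATION"
shifted = "   PATIENT INFORMATION"

print(phrase_matches(aligned, "PATIENT INFORMATION"))                 # True
print(phrase_matches(shifted, "PATIENT INFORMATION"))                 # False
print(phrase_matches(shifted, "PATIENT INFORMATION", floating=True))  # True
```

A header that is indented on some pages is missed by the fixed match but still found by the floating one.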
Adding Footer Regions
To add a footer region, right-click the Record node and select "Add Footer Region."
As the footer repeats, the easiest way to select the footer region is to type a word or phrase as your method of pattern matching.
Adding Append Regions
Some reports have data that isn’t captured in an existing data region, but needs to be included with each record in the exported data. In the sheet below, the library name isn’t a part of the captured data.
In order to capture this data, select your highest record node and right-click. Select “Add Append Region…” from the drop-down menu.
Once your append region is added, you can define it via pattern matching or the auto-create function as you would any data region. Once a data field has also been added, the data is included with every record.
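The effect of an append region on the exported data can be pictured with a few lines of Python. This is only a model of the outcome, not of how ReportMiner works internally; the record contents, the library name, and the `attach_append_fields` helper are invented for illustration.

```python
def attach_append_fields(records, append_fields):
    """Return new records that each carry the append-region values.

    If a key exists in both, the value from the record itself is kept.
    """
    return [{**append_fields, **record} for record in records]


# Hypothetical records from the main data region and one append-region value.
records = [{"title": "Moby Dick"}, {"title": "Dracula"}]
append_fields = {"library": "Central Library"}

for row in attach_append_fields(records, append_fields):
    print(row)
```

Every exported row now has a `library` value alongside its own fields, which mirrors the library name being repeated for each record.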
Creating Multi-Column Regions
Astera offers the “Multi-Column” data region for data that is formatted in multiple columns.
To create a multi-column data region, right-click your Record node and select “Add Data Region…” as you would for a regular region.
Now, check the “Multi-Column” box.
Another bar will appear under the pattern-matching bar. This is for setting column boundaries. Click on the second bar above where your first column starts. A black line should appear. Make sure that this is flush with the left side of your characters. Repeat for each column start point. If your line is in the wrong place, click on the line to delete it.
You can also adjust the number of columns and the column margin by clicking the "..." next to the Multi-Column option.
From here, pattern-match as normal to create your data region.
Related Sections
- Container and Overlapping Container Regions
- Data Fields
- Pattern Matching