Pages

Building Dimensions and Loading Data in Hyperion Essbase

Building Dimensions and Loading Data

An Essbase database contains dimensions, members, and data values.

Loading data is the process of adding data values to an Essbase database from a data source, such as a spreadsheet or SQL database. If the data source is not perfectly formatted, you need a rules file to load the data values.

Building dimensions is the process of loading dimensions and members to an Essbase database outline by using a data source and a rules file. You can also use Outline Editor to add dimensions and members manually.

This post describes the components involved in loading data values and loading dimensions and members—data sources and rules files.

Some rules file options and data source requirements vary for aggregate storage databases.

Process for Data Loading and Dimension Building

To load data values or dimensions and members into an Essbase database, follow these steps:

1. Set up the data source.

If you are not using a rules file, you must set up the data source outside Essbase.

2. If necessary, set up the rules file.

3. Perform the data load or dimension build.

Data Sources

Data sources contain the information that you want to load into the Essbase database. A data source can contain:

Data values

Information about members, such as member names, member aliases, formulas and consolidation properties

Generation and level names

Currency name and category

Data storage properties

Attributes

UDAs

Supported Data Sources

Essbase supports the following types of data sources:

Text files (flat files) from text backups or external sources

SQL data sources

Essbase export files (export files do not need a rules file to load)

Microsoft Excel files with the .XLS extension, Version 4.0 and higher

Microsoft Excel files, Version 5.0 and higher (loaded only as client artifacts or files in the file system)

Spreadsheet audit log files

Note:

If you are using Administration Services Console to load data or build an outline, spreadsheet files are not supported if the Administration Server is installed on a computer with a UNIX operating system.

To avoid rules file, data load, and dimension build errors, remove formatting in Microsoft Excel data source files; for example, in Excel set color to “Automatic” and “No Fill,” and remove font settings such as bold and italic.

Items in a Data Source

As illustrated in Figure A, a data source is composed of records, fields, and field delimiters.

A record is a structured row of related fields.

A field is an individual value.

A delimiter indicates that a field is complete and that the next character in the record starts another field.

Essbase reads data sources starting at the top and proceeding from left to right.

Figure A:

As illustrated in Figure B, data sources can contain dimension fields, member fields, member combination fields, and data fields.

Figure B:

Dimension fields identify the dimensions of the database, such as Market. Use dimension fields to tell Essbase the order of the dimensions in the data source. In Figure B, for example, the dimension fields are Market, Product, Year, Measures, and Scenario. Fields in the Market column, such as Texas, are members of the Market dimension, and fields in the Product column, such as 100-10, are members of the Product dimension. Although you can set dimension fields in the data source, usually you define dimension fields in the rules file.

Member fields identify the members or member combinations of the specified dimensions. Use member fields to tell Essbase to which members to map new data values, or which members to add to the outline. In Figure B, for example, Texas, 100-10, Jan, Sales, and Actual are all member fields.

Data fields contain the numeric data values that are loaded into the intersections of the members of the database. Each data value must map to a dimension intersection.

In Data Sources Figure B, for example, 42 is the data value that corresponds to the intersection of Texas, 100-10, Jan, Sales, and Actual. You can specify information in the header and in an individual record. In the following example, 100 is the data value that corresponds to the intersection of Jan, Actual, Cola, East, Sales and 200 is the data value that corresponds to the intersection of Jan, Actual, Cola, West, Sales.

Jan, Actual

Cola East Sales 100

Cola West Sales 200

Cola South Sales 300

Data fields are used only for data loading; dimension builds ignore data fields.

Valid Dimension Fields

In a data load, if the data source does not identify every dimension in the Essbase database, the rules file must identify the missing dimensions. For example, the Sample Basic database has a dimension for Year. If several data sources arrive with monthly numbers from different regions, the month itself may not be specified in the data sources. You must specify the month in the data source header or the rules file.

A dimension field must contain a valid dimension name. If you are not performing a dimension build, the dimension must already exist in the database. If you are performing a dimension build, the dimension name can be new, but the new name must be specified in the rules file.

Valid Member Fields

A member field can contain the name of a valid member or an alias. In Figure B for example, Texas and Ohio are valid members of the Market dimension. A blank member field inherits the member name from the previous record. Essbase must know how to map each member field of the data source to a member of the database.

In order to be valid, a member field must meet the following criteria:

The member field must contain or inherit a valid member name or member property.

If you are not performing a dimension build, the member must already exist in the outline. If you are performing a dimension build, the member can be new.

Either the data source or the rules file must specify which dimension each member field maps to.

A member field can map to a single member name, such as Jan (which is a member of the Year dimension), or to a member combination, such as Jan, Actual (which are members of the Year and Scenario dimensions).

Member names that contain the same character as the file delimiter must be surrounded by double quotation marks. For example, if the data source is delimited by spaces, make sure that a member containing spaces, such as “New York,” is surrounded by double quotation marks. If you are performing a data load without a rules file, member names containing some other characters must also be surrounded by quotation marks. When a rules file is not used, blank dimension and member fields are valid. When Essbase encounters a blank dimension or member field while loading data without a rules file, it uses the last dimension or member name encountered for that dimension or member column.

Note:

While it processes each record in a data source for a data load, Essbase does not check to ensure a member specified in a member field belongs to the dimension specified for the dimension field. Essbase loads the data value to the data cell identified by the member combination in the record. In Figure B, for example, if the second record reversed Jan and Sales (Texas, ‘100-10’, Sales, Jan, Actual, 42), Essbase would load 42 to the correct data cell. The exception is for fields in the rules file set as dimension reference method.

Valid Data Fields

If you are performing a dimension build, you can skip this section. Data fields are ignored during a dimension build. Either the data source or the rules file must contain enough information for Essbase to determine where to put each data value. A data field contains the data value for its intersection in the database. In Figure B, for example, 42 is a data field. It is the dollar sales of 100-10 (Cola) in Texas in January.

In a data field, Essbase accepts numbers and their modifiers, with no spaces or separators between them, and the text strings #MI and #MISSING.

If the data source contains a member field for every dimension and one field that contain data values, you must define the field that contains data values as a data field in the rules file. To read the following data source into the Sample Basic database, for example, define the last field as a data field.

Jan Cola East Sales Actual 100

Feb Cola East Sales Actual 200

If the data source contains blank fields for data values, replace them with #MI or #MISSING. If there is no value in the data field (or the value is #MISSING), Essbase does not change the existing data value in the database. Essbase does not replace current values with empty values.

Valid Delimiters

You must separate fields from each other with delimiters. If you are loading data without a rules file, you must use spaces to delimit fields.

If you are using a rules file, delimiters can be any of the following:

Tabs (default)

Spaces

New lines

Carriage returns

Commas

Extra Delimiters Without a Rules File

In data sources that are loaded without a rules file, Essbase ignores extra delimiters. In the following example, the fields are separated by spaces. Essbase ignores the extra spaces between the fields.

East Cola Actual Jan Sales 10

East Cola Actual Feb Sales 21

East Cola Actual Mar Sales 30

Extra Delimiters with a Rules File

In data sources that are loaded with a rules file, Essbase reads extra delimiters as empty fields. For example, if you try to use a rules file to load the file below into the Sample Basic database, the load fails. Essbase reads the extra comma between East and Cola in the first record as an extra field. Essbase then puts Cola into Field 3. In the next record, however, Cola is in Field 2. Essbase expects Cola to be in Field 3 and stops the data load.

East,, Cola, Actual, Jan, Sales, 10

East, Cola, Actual, Feb, Sales, 21

East, Cola, Actual, Mar, Sales, 30

To solve the problem, delete the extra delimiter from the data source.

Valid Formatting Characters

Essbase views some characters in the data source as formatting characters only. For that reason, Essbase ignores the following characters:

== Two or more equal signs, such as for double underlining

-- Two or more minus signs, such as for single underlining

_ _ Two or more underscores

== Two or more IBM PC graphic double underlines (ASCII character 205)

_ _ Two or more IBM PC graphic single underlines (ASCII character 196)

Ignored fields do not affect the data load or dimension build.

For example, Essbase ignores the equal signs in the following data source and loads the other fields normally.

East Actual "100-10"

Sales Marketing

===== ========

Jan 10 8

Feb 21 16

Rules Files

Rules define operations that Essbase performs on data values or on dimensions and members when it processes a data source. Use rules to map data values to an Essbase database or to map dimensions and members to an Essbase outline.

Rules are stored in rules files. A rules file defines which build method to use, whether data values or members are sorted or in random order, and how to transform data values or members before loading them. It is best to create a separate rules file for each dimension.

Essbase reads the data values or members in the data source, changes them based on the rules in the rules file, and loads the changed data values into the database and the changed members into the outline. Essbase does not change the data source. You can re-use a rules file with any data source that requires the same set of rules.

After you create a dimension build rules file, you may want to automate the process of updating dimensions.

Situations That Do and Do Not Need a Rules File

You need a rule file if the data source does not map perfectly to the database or if you are performing any of the following tasks:

Loading data from a SQL data source

Building dimensions

Adding new dimensions and members to the database

Changing existing dimensions and members in the database

Changing the data in any way, including the following:

Ignoring fields or strings in the data source

Changing the order of fields by moving, joining, splitting, or creating fields

Mapping the data in the data source to the database by changing strings

Changing the data values in the data source by scaling data values or by adding data values to existing data values in the data source

Setting header records for missing values

Rejecting an invalid record and continuing the data load

You do not need a rules file if you are performing a data load and the data source maps perfectly to the database.

Note:

If you are using a rules file, each record in the rules file must have the same number of fields.

Data Sources That Do Not Need a Rule File

If you are performing a dimension build, skip this section.

If a data source contains all information required to load the data values in it into the database, you can load the data source directly. This kind of load is called a free-form data load.

To load a data value successfully, Essbase must encounter one member from each dimension before encountering the data value. For example, in Figure B, Essbase loads the data value 42 into the database with the members Texas, 100-10, Jan, Sales, and Actual. If Essbase encounters a data value before a member of each dimension is specified, it stops loading the data source.

To map perfectly, a data source must contain all of the following and nothing else:

One or more valid members from each dimension. A member name must be enclosed in quotation marks if it contains any of the following:

Spaces

Numeric characters (0–9)

Dashes (minus signs, hyphens)

Plus signs

& (ampersands)

If you are performing a data load without a rules file, when Essbase encounters an invalid member field, it stops the data load. Essbase loads all fields read before the invalid field into the database, resulting in a partial load of the data values.

One or more valid data values

If the data source contains blank fields for data values, replace the blank fields with #MI or #MISSING. Otherwise, the data values may not load correctly.

Valid delimiters.

The fields in the data source must be formatted in an order that Essbase understands. The simplest way to format a record is to include a member from each dimension and a data field, as illustrated below:

Sales "100-10" Ohio Jan Actual 25

Sales "100-20" Ohio Jan Actual 25

Sales "100-30" Ohio Jan Actual 25

A data source that is not correctly formatted will not load. You can edit the data source using a text editor and fix the problem. If you find that you must perform many edits (such as moving Data Sources That Do Not Need a Rules File several fields and records), consider using a rules file to load the data source.

The following sections describe more complicated ways to format free-form data sources:

“Formatting Ranges of Member Fields”

“Formatting Columns”

Formatting Ranges of Member Fields

If you are performing a dimension build, skip this section.

You can express member names as ranges within a dimension. For example, Sales and COGS form a range in the Measures dimension. Ranges of member names can handle a series of values.

A data source can contain ranges from more than one dimension at a time. In the example below, Jan and Feb form a range in the Year dimension and Sales and COGS form a range in the Measures dimension.

Actual Texas Sales COGS

Jan Feb Jan Feb

"100-10" 98 89 26 19

"100-20" 87 78 23 32

Notice that Sales is defined for the first two columns and COGS for the last two columns.

The following sections describe additional types of ranges:

“Setting Ranges Automatically”

“Handling Out of Range Data Values”

“Interpreting Duplicate Members in a Range”

“Reading Multiple Ranges”

Setting Ranges Automatically

If you are performing a dimension build, skip this section.

When Essbase encounters two or more members from the same dimension with no intervening data fields, it sets up a range for that dimension. The range stays in effect until Essbase encounters another member name from the same dimension, at which point Essbase replaces the range with the new member or new member range.

The following example contains a range of Jan to Feb in the Year dimension. It remains in effect until Essbase encounters another member name, such as Mar. When Essbase encounters Mar, the range changes to Jan, Feb, Mar.

Texas Sales

Jan Feb Mar

Actual "100-10" 98 89 58

"100-20" 87 78 115

Handling Out of Range Data Values

If you are performing a dimension build, skip this section.

When Essbase encounters a member range, it assumes that there is a corresponding range of data values. If the data values are not in the member range, the data load stops. Essbase loads any data fields read before the invalid field into the database, resulting in a partial load of the data.

The following example contains more data fields than member fields in the defined range of members. The data load stops when it reaches the 10 data field. Essbase loads the 100 and 120 data fields into the database.

Cola Actual East

Jan Feb

Sales 100 120 10

COGS 30 34 32

Interpreting Duplicate Members in a Range

If you are performing a dimension build, skip this section.

Structure ranges in the source data so that Essbase interprets them correctly. If a member appears more than once in a range, Essbase ignores the duplicates.

The following file contains duplicate members and two ranges: Actual to Budget and Sales to COGS:

Cola East

Actual Budget Actual Budget

Sales Sales COGS COGS

Jan 108 110 49 50

Feb 102 120 57 60

Essbase ignores the duplicate members. The members that Essbase ignores have a line through them in the following example:

Cola East

Actual Budget Actual Budget

Sales Sales COGS COGS

Jan 108 110 49 50

Feb 102 120 57 60

For Actual, the first member of the first range, Essbase maps data values to each member of the second range (Sales and COGS). Essbase then proceeds to the next value of the first range, Budget, similarly mapping values to each member of the second range. As a result, Essbase interprets the file as shown below:

Cola East

Actual Budget Actual Budget

Sales Sales COGS COGS

Jan 108 110 49 50

Feb 102 120 57 60

Reading Multiple Ranges

If you are performing a dimension build, skip this section.

As Essbase scans a file, it processes the most recently encountered range first when identifying a range of data values. The example above contains two ranges: Actual and Budget and Sales and COGS. While reading the file from left to right and top to bottom, Essbase encounters the Actual and Budget range first and the Sales and COGS range second. Because the Sales and COGS range is encountered second, Essbase puts data fields in the Sales and COGS part of the database first.

Formatting Columns

If you are performing a dimension build, skip this section.

Files can contain columns of fields. Essbase supports loading data from symmetric columns or asymmetric columns.

Symmetric Columns

If you are performing a dimension build, skip this section. Dimension builds require a rules file.

Symmetric columns have the same number of members under them. In the following example, each dimension column has one column of members under it. For example, Product has one column under it (100-10 and 100-10) and Market has one column under it (Texas and Ohio).

Product Measures Market Year Scenario

100-10" Sales Texas Jan Actual 112

"100-10" Sales Ohio Jan Actual 145

The columns in the following file are also symmetric, because Jan and Feb have the same number of members under them:

Jan Feb

Actual Budget Actual Budget

"100-10" Sales Texas 112 110 243 215

"100-10" Sales Ohio 145 120 81 102

Asymmetric Columns

If you are performing a dimension build, skip this section.

Asymmetric columns have different numbers of members under them. In the following example, the Jan and Feb columns are asymmetric because Jan has two columns under it (Actual and Budget) and Feb has only one column under it (Budget):

Jan Jan Feb

Actual Budget Budget

"100-10" Sales Texas 112 110 243

"100-10" Sales Ohio 145 120 81

If a file contains asymmetric columns, label each column with the appropriate member name.

The example above is valid because the Jan label is now over both Actual and Budget. It is clear to Essbase that both of those columns map to Jan.

The following example is not valid because the column labels are incomplete. The Jan label must appear over both the Actual and Budget columns.

Jan Jan Feb

Actual Budget Budget

"100-10" Sales Texas 112 110 243

"100-10" Sales Ohio 145 120 81

0 comments:

© 2010 Datawarehousing Support | Home | Disclaimer | Privacy Policy