Weigh the benefits and drawbacks of the five most popular ways of importing non-SAP data into your BW system.
Key Concept
Getting external data into BW is usually more challenging than retrieving data from SAP systems. A variety of different methods exists, such as flat files; DB Connect; Universal Data Connect; third-party extraction, transformation, and load tools; and Business Planning and Simulation/Integrated Planning.
Getting data into a data warehouse is usually the most time-consuming activity on a project. Typically, teams spend up to 80 percent of the effort in data warehouse projects on data acquisition.
However, SAP has changed the rules for getting R/3 data into BW with the provision of business content (ready-to-go extractors, InfoSources, InfoCubes, and queries). This means that you can throw traditional data warehousing time frames out the window. One of my clients was recently astonished when I managed to demonstrate a series of sales reports containing data from his development R/3 system on the same day that the BW system became available.
Importing data from R/3 into BW is only part of the story. Most BW implementations integrate non-SAP data, so it’s crucial to know the available data extraction methods.
Last year, I finished a BW pilot using R/3 as the primary data source and my client wanted to know his options for getting a range of non-SAP data into his shiny new BW system. After brainstorming with some colleagues, we easily came up with more than 20 possible ways. Admittedly, some of these were quite creative, but this illustrates how many options exist. I’m going to concentrate on five of the most useful here:
• Flat files
• DB Connect
• Universal Data Connect (UD Connect)
• Third-party extraction, transformation, and load (ETL) tools
• Business Planning and Simulation (BPS) and Integrated Planning
Other techniques include Business Application Programming Interfaces (BAPIs), SAP Exchange Infrastructure (XI), and Computer-Aided Test Tool (CATT) scripts. In my experience, the five techniques I’ll discuss are responsible for more than 90 percent of non-R/3 data loads into BW. In my next article, I will explain how to get data out of SAP BW for interfacing purposes.
Challenges
Working with non-SAP data has a number of challenges. It generally means no business content exists on the extractor side (some Oracle Financials business content is an exception). Non-SAP data is a poor fit to other elements of business content such as InfoObjects. Therefore, you have to build everything from the ground up, which can be time consuming.
This is because every field needs a corresponding custom InfoObject in BW. Modeling these can be challenging when you have numerous attributes and complex relationships between InfoObjects via compounding. You then need to differentiate master data from transaction data and build load routines accordingly. In addition, automating the end-to-end load scenario is generally much more challenging than extracting data from other SAP systems. As a rule, expect more load failures due to factors outside of your control such as connectivity between systems.
Data quality is often a problem area too. Data in SAP systems is actually very clean when compared to many other systems. So more data cleansing is generally necessary to remove unacceptable characters in the data, insert missing values, and convert data into uppercase characters.
Technically, you also need to choose from a range of data transfer and loading techniques, all with associated pros and cons (Table 1). What might be the easiest solution to implement could prove the hardest to support and maintain, or might be the least robust.
Requirement |
Flat file |
DB Connect |
UD Connect |
Third-party ETL tools |
BPS |
Why? |
One-off or infrequent load |
X |
|
|
|
|
Minimal configuration/ setup required |
Large data volumes |
X |
X |
|
X |
|
UD Connect and BPS perform worse than other methods in this case |
Master data and transaction data |
X |
X |
X |
X |
|
BPS currently only able to load transaction data |
Extraction from RDBMS |
|
X |
X |
X |
|
Impossible with flat files or BPS |
Cost |
None |
None |
Some drivers can be costly |
Tools tend to be expensive |
None |
Impossible with flat files or BPS |
|
Table 1 |
When to use each data transfer method |
Flat File
It would be a safe bet to assume that the ubiquitous flat file is the most popular method for getting data into BW. You can use flat files in any version of BW. At the simple end of the spectrum, this might involve just typing the data into a suitable format in a text editor and loading the resulting flat file into BW. I’ve certainly done this more than a few times.
Where more data is involved, or an element of data manipulation is required prior to loading the data, you can use spreadsheets to store and maintain data to load into BW. Spreadsheets can also prove invaluable for generating large amounts of test data (up to 65,536 rows in most versions of Microsoft Excel). When the data is ready for BW, simply export it to a comma-separated value (CSV) file or a fixed-width file. Then, point your InfoPackage to this file. Figure 1 shows flat file data previewed in BW, which is useful for resolving data issues prior to loading the data into BW.

Figure 1
Preview of flat file data in BW
These first two examples (typing or manipulating the data) are useful for periodic loads of relatively static data such as master data or texts. However, as they have a manual overhead, they are not really suitable for regular loads into BW.
A fully automated flat-file scenario might involve another system exporting a flat file to its local disk, and then invoking an FTP file transfer to the target BW machine (from where BW loads the file). Your Basis team must provide the necessary FTP account and a suitable file system. You can also achieve the same result using shared network drives, which is easy on Windows platforms but a bit more involved with Unix.
It is obviously much faster to load a file that resides on the BW server than one that is sitting on your local machine. In addition, when loading files from the application server, you can schedule them to load in the background, whereas immediate is the only option for loading files from your local machine. In terms of file format, fixed-width files load faster than delimited ones, although their preparation tends to be more demanding.
In summary, flat files are the most common way of getting non-SAP data into BW, but you may reach a point when managing large numbers of flat files becomes problematic. Read on for more maintainable solutions.
DB Connect
DB Connect was first introduced in BW 3.0 and enables BW to extract data from other relational databases that SAP products run on. Currently the list is Oracle, Microsoft SQL Server, IBM DB2, and platform-specific variants (DB4 and DB6), Informix, and SAP DB/MaxDB.
To make DB Connect work, you typically need your Basis team to set up the initial connection in BW. You also need input from a technical contact responsible for the source database to provide connectivity and logon information.
The main technical prerequisite for using DB Connect is for the BW server to install the relevant client connectivity software and associated SAP database shared library (DBSL). If your BW system runs on Oracle and you want to access another Oracle database within your organization, this is straightforward. However, if you run Oracle and want to access data from an IBM DB2 database, then you need to install the relevant IBM client software and SAP database libraries on your BW server. This can become even more complicated if your BW system is Unix based and you want to access Microsoft SQL Server. Microsoft does not provide a Unix client for SQL Server, so you can only do this by having at least one Windows-based application server.
The actual BW connection to the source database is straightforward to set up. Ideally you should create a dedicated database user on the source database for extraction purposes. This makes it possible to restrict what the BW extract user can access in the source database and also reduces the number of tables and views that you see in DB Connect (Figure 2).

Figure 2
DB Connect table and view selection in BW
Once you have an active connection, DB Connect behaves similarly to other BW source systems with restrictions and limitations that you need to know. Firstly, table or view names in the source system need to be uppercase and fewer than 26 characters in length. Secondly, the names of columns in the table or view are limited to 16 characters and can only contain letters, numbers, and underscores.
You might think that this would be a major problem, but the way around this is to create new database views on the source database to shorten table names and rename fields into a BW-friendly format. For example, if the table that you want to extract is customer_purchase_history_table (31 characters), you would create a view called BW_CUST_HIST_TRAN (17 characters, uppercase) to enable BW to successfully access this view. Create this view at the database level in the source database using the relevant SQL statement such as create view
BW_CUST_HIST_TRAN
as select FIELD1, FIELD 2 FROM customer_purchase_history_table
. The exact syntax varies by database. Once the fields have the proper naming convention, you can generate DataSources from source fields (Figure 3).

Figure 3
Generate DataSources from source fields
Database views are also invaluable for performing basic data manipulation such as converting the contents of a field to uppercase, performing substring operations (to provide the second to sixth characters of a field), and converting data types such as a native date-time field into a date or time format that BW can handle such as YYYYMMDD.
DB Connect performs very fast. In my experience, it’s faster than extracting data from SAP systems, but this obviously depends on where your source database is located and network performance. The majority of problems I’ve experienced with DB Connect usually relate to data quality, such as non-permitted characters and data type conversion errors. You should plan extra time to account for the increased difficulty of working with non-SAP data.
UD Connect
UD Connect was introduced in BW 3.5 via BW’s Java 2 Platform, Enterprise Edition (J2EE) engine. It allows BW to access potentially hundreds of other platforms and databases.
Four types of connectors are currently available: XML for Analysis (XMLA), Object Linking and Embedding for Databases (OLE DB) for online analytical processing (OLAP) (ODBO), Java Database Connectivity (JDBC), and SAP Query. Of these, JDBC is the most useful in my opinion, so I will concentrate on it. You can find a list of available drivers on the Sun Web site: https://servlet.java.sun.com/ products/jdbc/drivers. Some of these are free, but others can be costly, so investigate licensing issues before pursuing the UD Connect route.
The connection from BW to the source database can be tricky to establish. Normally this involves a fair degree of trial and error because each JDBC driver tends to have quirks and lacks helpful error messages. For example, my first JDBC connection to a SQL Server database took nearly two days to establish. Perform the configuration via the J2EE Administrator. This is a new process for most Basis and BW developers and requires three steps to install a new driver:
Step 1. Install the relevant driver.
Step 2. Create reciprocal Remote Function Call (RFC) destinations from J2EE to BW and vice versa.
Step 3. Provide relevant driver-specific information such as host name, IP address, and logon details for the source database. Figure 4 shows a configured JDBC connection to a SQL Server database.

Figure 4
A configured JDBC connection to a SQL Server database
After you’ve performed the initial configuration, you can extract data into BW via a UD Connect DataSource that you created from the communication structure. Once you generate and assign this DataSource to the communication structure, it behaves like any other BW DataSource. Similar limitations to DB Connect apply, such as field and column names. You probably need to create a number of database views on the source database to facilitate the extraction into BW.
You can use UD Connect DataSources in two ways. Firstly, you can load data into BW in the traditional way: for example, to ODS objects, InfoCubes, and master data. Secondly, you can also define remote InfoCube scenarios where BW extracts the data at query execution time. UD Connect data loads typically perform five to 10 times slower than equivalent DB Connect loads. This means that UD Connect might not be suitable for very large volumes.
Third-Party ETL Tools
A number of ETL tool vendors have adapted their products to work with BW releases 2.0 to 3.5. Some of the better known ones include WebSphere DataStage from IBM (previously Ascential), PowerCenter from Informatica, Cognos 8 Business Intelligence Data Integration from Cognos, dynaSight from Arcplan, and BusinessObjects Data Integrator (formerly Acta) from Business Objects (Figure 5). To use third-party source systems in BW 3.x and SAP NetWeaver 2004s, normally you have to create RFC destinations from BW to an RFC server running on the third-party ETL server.

Figure 5
BusinessObjects Data Integrator data flow showing the source, transformation, and BW target
These tools usually work in a similar way. First, they extract data from the source database. Then, the tool stages the data in an intermediate database before loading it to the target system. Transformations can occur at any point in the data flow, although you will usually want to store an untransformed copy of the data before significantly manipulating it.
The tools I have worked with use an RFC connection to BW and an SAP third-party BAPI for data loading. SAP has enhanced this BAPI so that third-party tools can push data into BW, rather than BW operating as a pull-only mechanism. These tools are usually very fast and able to handle large datasets. However, they can be expensive.
Your organization may want to consider an ETL tool if:
- You have a large number of external data feeds into BW
- You need to load data from mainframe or other legacy applications on a regular basis
- You need to process high volumes of data
- BW is only a small component of your enterprise data warehouse (EDW) and you want to manage all of your ETL logic in one place
The main disadvantages of using ETL tools with BW are:
- You have to support and monitor another tool
- You introduce another point of failure into your data loads
The other potential pitfall of this kind of tool is the high degree of consolidation within the ETL tool market. Many of the products are now no longer in the hands of the founders. So you might find yourself buying one tool today, but discovering another company that may not be as committed to the product supports it tomorrow.
BPS/Integrated Planning
BPS was part of Strategic Enterprise Management (SEM) until BW 3.5, when SAP integrated it into core BW functionality. In SAP NetWeaver 2004s, the planning engine has changed again to become Integrated Planning, although the BPS functionality is still available. I’ll concentrate here on BPS, although most of the principles remain unchanged in Integrated Planning.
BPS manual planning functionality lets users load data directly into transactional InfoCubes in BW. While the functionality generally supports a planning process (e.g., cost center managers uploading costs planned for next year), you may use this functionality for other purposes.
BPS manual planning supports a number of mechanisms for uploading data such as SAPGUI, Excel, and the Web. The Excel option is more popular with users because you can cut and paste, whereas the Web-based front end is more suited to lower volumes of data. See Figure 6 for an example of the Web-based BPS front end.

Figure 6
BPS Web layout
One common BPS scenario is a semiautomatic interface whereby a power user takes data from another system and puts it into an Excel format, and then pastes this into a BPS Excel-based layout. From here, you can upload the data to a transactional InfoCube in BW. You can automate the process via Excel Visual Basic for Applications (VBA) macros. Many users have write access to BW transactional InfoCubes, which have a more sophisticated locking mechanism than basic InfoCubes. The locking mechanism allows users to edit different slices of data simultaneously (e.g., different company codes).
Although the idea of users editing the data in the InfoCube seems dangerous, it is entirely safe because BW writes the data into a special request that all users share. You can delete this request to roll back the data in case of any errors. However, once you load the data, it is possible to have an export DataSource (an SAP standard way of creating a DataSource on an InfoCube) on the transactional InfoCube for loading data into other BW objects such as ODS, master data, and InfoCubes. In this way, you can load plan or transaction data into BW and make it available to the rest of your BW applications.
Mike Curl
Mike Curl is technical director and principal consultant for Bluefin Solutions, a UK-based SAP partner. Mike has more than 10 years’ consultancy experience, covering a wide range of SAP technologies and industries. Mike is a recognized BW expert who has been working with BW since its earliest versions. He helps organizations define and implement their BW strategies, and he also has a particular interest and track record in optimizing the performance of high-volume BW systems.
You may contact the author at mike.curl@bluefinsolutions.com.
If you have comments about this article or publication, or would like to submit an article idea, please contact the editor.