These five data modeling tips help you create a comprehensive mySAP CRM data warehouse, using SAP Business Information Warehouse functionality to bring together data from disparate sources.
Key Concept
Modeling CRM data differs in several respects from modeling other data in SAP Business Information Warehouse (BW). For example, mySAP CRM has a closer interface to BW than other transaction systems such as R/3, allowing for real-time updating. It also gets data back from BW so that actions can be initiated based on the results; this closed-loop scenario requires you to think about how your data will move to and from your CRM system. Finally, CRM analytics aren't "pure" analytics (drilling down to find a root cause). Instead, they derive information, trends, and patterns from historical data.
When looking at your overall mySAP CRM implementation efforts, you cannot overlook the reporting and analytics requirements for monitoring and measuring your key customer metrics. While you can get or derive certain core reports directly within mySAP CRM, the vast majority of your reporting and analytic time and efforts will be in SAP Business Information Warehouse (BW).
SAP BW 3.5 and mySAP CRM have a unique relationship: real-time updating of data that gives you immediate visibility into your CRM transactions. This means that you can execute standard reports or conduct data mining activities and know that you have the most up-to-date information available.
To take advantage of this relationship between BW and mySAP CRM, you need to model your BW system accordingly. The data models you create (including keys and dimensions) in BW determine the flexibility you have for creating your standard reports, data mining, and ad hoc analytics. It is important that you, the CRM expert, understand the possibilities and know what to ask for when defining your requirements to the BW team. (This assumes that you have a separate team to handle BW configuration.) Being able to speak their language and frame your requests around the functionality that they need to configure helps ensure a successful implementation.
I learned the following five tips for modeling BW for mySAP CRM from my extensive experience implementing SAP reporting and analytic solutions, ranging from standard BW implementations to those supporting CRM and custom analytics.
Before you set up your data modeling, though, you have to determine how you want to use and report on the data. For example, an organization that is focused on data mining and predictive analytics will model its overall system differently than an organization that is focused on trend analysis. Similarly, an organization that anticipates having hundreds of millions to billions of rows of data in its reporting and analytic system (like a consumer products company that captures point-of-sale data) will design its solution differently than a smaller organization that is not dealing with those volume levels. Therefore, the key to delivering your reporting and analytics is understanding your organization’s needs in this area, knowing what BW delivers as a foundation, and customizing your BW system so that it complements mySAP CRM.
Once you have identified how you will use your data — what questions you need to answer, what decisions to make based on this information, what reports to generate, where your data comes from and its quantity — you can then model the data to develop the reporting solution. This is when the following tips can help you to meet your goals, including better and faster reports and analytics that enable you to turn CRM data into actionable information.
1. Unify/homogenize your CRM data — from mySAP CRM and non-SAP systems — within BW. A key to any reporting and analytics solution is the ability to deliver a single view of the truth that spans all of your data sources. If you want to report on it together, you have to bring it together. You do not want to make key decisions based on partial or incomplete information. This is equally true with your CRM analytics. Therefore, if you have more than one system (multiple CRM instances, multiple R/3s, Master Data Management [MDM], non-SAP solutions) that you need to access to get your full set of data, you need to create a mechanism to bring that data together in BW. You need to know that customer 12345 in system A is the same as customer ABCDE in system B.
It is critical that you identify any commonalities in your data and bring them together in BW if you want the best overall picture possible. It is equally critical that you identify any conflicts in the metadata definitions of objects (alpha vs. numeric values, field lengths) and address them in your unification process.
Assuming that you are not using MDM functionality to identify and create the relationships among your data, you need to bring together your data when it comes from multiple sources. The first step is to determine if the data values in the disparate systems have a 1:1 correlation. If you can categorically state that customer 12345 in system A is always the same as customer ABCDE in system B, then you are able to relate them in BW via the following options:
- Create a new attribute in the InfoObject (0CUSTOMER in my example) that contains an alternate or reference customer number. Give your disparate entries a common code that will serve as your key when reporting or performing other analytics. Update each record (customer master record) with the reference number. As this would likely be a manually intensive effort (depending on how many records are involved), you may wish to consider other options.
- Write the decoding rules into the program generating the extraction (or flat file) from the SAP or non-SAP system. With this option you won’t have to worry about handling the relationship within the BW system — the data goes into BW “clean.”
- Create a lookup/reference table that holds the decoding relationships. This is similar to creating a pseudo-MDM solution that records the relationships among your customers from different systems or instances. Reference this table in the transfer rules or update rules for the BW loading process (Figure 1 and Figure 2). Remember that if you load this new attribute during the transfer rules cycle, it retains this definition for all references within BW (and then back to mySAP CRM). If you load it via an update rule, it applies only to the InfoCube that the update rule feeds. If you load it to master data, it is referenced throughout the system.
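The lookup/reference table option can be sketched conceptually as follows. This is not ABAP transfer-rule code; it is a minimal Python illustration of the idea, and the system names, customer numbers, and field names are hypothetical.

```python
# Pseudo-MDM reference table: (source system, local customer) -> unified customer.
# In a real BW load, a routine in the transfer rules would perform this lookup.
CUSTOMER_XREF = {
    ("SYS_A", "12345"): "C-0001",
    ("SYS_B", "ABCDE"): "C-0001",   # same real-world customer as 12345 in SYS_A
    ("SYS_A", "67890"): "C-0002",
}

def apply_xref(record: dict) -> dict:
    """Transfer-rule-style routine: populate the reference-customer attribute."""
    key = (record["source_system"], record["customer"])
    # Fall back to the local number if no mapping exists, so the load never fails
    record["ref_customer"] = CUSTOMER_XREF.get(key, record["customer"])
    return record

loaded = [apply_xref(r) for r in [
    {"source_system": "SYS_A", "customer": "12345", "revenue": 100.0},
    {"source_system": "SYS_B", "customer": "ABCDE", "revenue": 250.0},
]]
# Both records now carry the same unified key and can be reported together
```

The same lookup could equally sit in the extraction program (the second option above); the only difference is where in the pipeline the harmonization happens.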

Figure 1
Standard Transfer Structure/Transfer Rules screen

Figure 2
Within Transfer rules or Update rules, you can define a Constant, Formula, or Routine to populate your new attribute
Data mining has long been a goal for many BW users — the ability to use the system to drive out information beyond a standard report or query. This is even truer for CRM analysts who want the ability to work with data to drive out information that they can respond to proactively.
If you aren’t able to identify an easy, systemic way to relate your information to allow for consolidated reporting, you either have to determine a way to make this happen or accept that you will not be able to have true consolidated reporting.
In this case, you may have to bring your reporting up to a different level so that you can view a single picture of the truth. For example, if you cannot harmonize your customer numbers, you may want to start your reporting and analytics at a segment or region level instead of at the customer level. Similarly, you may want to report sales by product group or category instead of by individual product. However, your ideal goal is to create a rule for harmonization. Once that is done and you determine the mapping scheme, you can use any of the options listed above to define your relationships within the system.
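Reporting at a higher level when customer numbers cannot be harmonized can be sketched as follows. This is an illustrative Python rollup, assuming both systems share a common segment attribute; the records and field names are hypothetical.

```python
# Roll disparate customer records up to a shared segment attribute,
# giving one consistent picture without customer-level harmonization.
from collections import defaultdict

records = [
    {"system": "A", "customer": "12345", "segment": "Retail", "sales": 100.0},
    {"system": "B", "customer": "ABCDE", "segment": "Retail", "sales": 250.0},
    {"system": "B", "customer": "XYZ99", "segment": "Wholesale", "sales": 80.0},
]

sales_by_segment = defaultdict(float)
for rec in records:
    # Customer numbers conflict across systems, but segment is comparable
    sales_by_segment[rec["segment"]] += rec["sales"]
```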
2. Build a multilayered data model to ensure you can report against high volumes of data in BW (Figure 3). A common issue faced in the reporting and analytics of CRM data is the system response time with queries. This is partly due to the volume of information and data that mySAP CRM generates and passes to BW. You can implement specific designs within your data model to work with high volumes of data.

Figure 3
Set up a multilayered data model
A multilayered model can expedite your query response time. Instead of one large InfoCube, this type of model allows you to report at multiple layers and drill down to the lower level data. It also means that you are not immediately querying against huge volumes of data, or very broad datasets, but rather starting at a higher level with the ability to drill into whatever detail you require. Figure 3 shows an example of this type of data model.
You must work with the many components of the system to build your solution. You create Operational Data Stores (ODSs) to contain your highly detailed transaction records. My example has an ODS for sales contracts and another for sales orders. Using ODSs for this level of detail is a recommended best practice. While you must account for the size of the ODSs in your hardware and database planning, this helps give you better query performance overall. Additionally, you can define key master data (like customer) as a reportable object — giving you the ability to easily and quickly execute statistical queries and analyses against your key master data.
The ODSs then "feed" an InfoCube. This InfoCube (or series of InfoCubes) is aggregated, but carries along the key identifier (usually a document number such as the sales contract or sales order number) so that you retain the link back to the ODSs.
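The ODS-to-InfoCube step can be sketched as follows. This is a conceptual Python illustration, not BW update-rule code; the document numbers and field names are hypothetical.

```python
# A detailed ODS holds line items; the InfoCube holds data aggregated to
# document level but keeps the document number as the link back to the detail.
from collections import defaultdict

ods_sales_orders = [  # line-item detail, as it would sit in the ODS
    {"doc_no": "SO-1", "item": 10, "customer": "C-0001", "amount": 40.0},
    {"doc_no": "SO-1", "item": 20, "customer": "C-0001", "amount": 60.0},
    {"doc_no": "SO-2", "item": 10, "customer": "C-0002", "amount": 75.0},
]

cube = defaultdict(float)
for line in ods_sales_orders:
    # doc_no is retained in the cube key so a query can jump back to the ODS
    cube[(line["doc_no"], line["customer"])] += line["amount"]
```

A query against the cube sees one summarized row per document and, via the document number, can drill through to the individual line items held in the ODS.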
You can use BW’s report/report interface to create connections between the objects (the InfoCubes and the ODSs), as shown in Figure 4. You can either set this up so that any query from the InfoCube can jump to certain destinations or you can define this on a query-by-query level. The configuration of the report/report interface is a task normally done by BW administrators and needs to be performed directly in the production instance of your BW solution. This is not a difficult configuration activity.

Figure 4
Jump to another query, an R/3 transaction, an ABAP report, or Web address using the report/report interface
3. Logically partition your data model. One of the most common issues encountered as a BW solution grows is decreasing query performance. Queries that used to run quickly seem to get slower and slower as time progresses. This is usually because of the growth of the underlying InfoCubes and their corresponding tables. Add to that the fact that you are likely storing years’ worth of data and only querying the most recent months.
As time progresses you create a situation in which you have, for example, five years’ worth of data in your InfoCube. Capturing and retaining this much information is not a bad thing; it gives you the ability to look at history, gauge trends, and do in-depth data mining. However, you are now asking your query to deal with (potentially) sorting through hundreds of millions of records that no longer apply to your general queries. Hence the slower response time.
You can mitigate this issue by creating a data model that is logically partitioned. A logical partition is not database partitioning (which is another helpful task but needs to be performed by your DBAs and differs based on the underlying database of your system), nor is it fact table partitioning (which I will get into in a minute). Logical partitioning says that instead of having one InfoCube with five years’ worth of information — for example, 400,000,000 rows — you break it into a bunch of little InfoCubes and bring them together with a MultiProvider.
Note
Remember that the report/report interface allows you to define “jumps” from within one query out to another query — within the same InfoProvider or out to a completely different InfoProvider. You define what results pass through to the new query. (Define this so that you are jumping into a query that has this result as a key or selection criteria.) With this method, you have the detail available for reporting, but you don’t have to load it all into the InfoCube and therefore cache it into your initial query result.
Note
As of today and the current versions of the SAP software, a MultiProvider, which is a virtual structure, is a union of the data from the associated basic InfoCubes. In the next version of BW (SAP NetWeaver BI 7.0, which begins Ramp-Up in October 2005), your MultiProviders can be a join. This can make your reports perform much better and eliminates the need for custom solutions that you may be working with to compress data to replicate that join.
The trick is that you want to break this down into small, logical InfoCubes that fall in line with the type of querying and analytics that your users will execute. For example, you won’t want to create your little InfoCubes based on ABC classification if your queries are keyed on customer number and span a rolling 12-month cycle.
Typically, people define their logical partitions based on year and then bring their model together with a MultiProvider. The system only hits the tables needed to answer each particular query, therefore bypassing potentially hundreds of millions of rows of data. You can bring it all together anytime you want by querying across all years, but for your normal queries, you will see an improved query response time.
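The year-based logical partitioning pattern can be sketched as follows. This is a conceptual Python illustration of the MultiProvider union, with hypothetical data; a real MultiProvider is configured in BW, not coded.

```python
# One small "InfoCube" per year, plus a MultiProvider-style union that
# touches only the cubes a given query actually needs.
cubes_by_year = {
    2003: [{"customer": "C-0001", "sales": 100.0}],
    2004: [{"customer": "C-0001", "sales": 150.0}],
    2005: [{"customer": "C-0002", "sales": 75.0}],
}

def multiprovider_query(years):
    """Union the per-year cubes, skipping the years the query does not ask for."""
    rows = [row for y in years for row in cubes_by_year.get(y, [])]
    return sum(r["sales"] for r in rows)

recent = multiprovider_query([2005])            # reads one small cube
all_years = multiprovider_query(cubes_by_year)  # full history when you want it
```

The normal query restricted to the current year never touches the older cubes, which is exactly why response times improve.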
4. Use InfoCube (fact table) partitioning (if your database supports it) to break up the tables that the InfoCubes sit on into smaller tables. This is the next step in query performance improvements. InfoCube (fact table) partitioning is different from the logical partitioning discussed above. With fact table partitioning, you actually break up the fact table of your InfoCube from one very large table (even with a logically partitioned data model in place) into a group of small tables.
By setting up fact table partitioning, you are further refining the result set that your query must sort through before returning results. The system processes the query jobs concurrently against each partition (like a bunch of little queries instead of one big one).
Setting this up is easy. You simply go into the basic InfoCube you want to work with (in edit mode) and from the menu bar select Extras>Partitioning (Figure 5).

Figure 5
Select Extras>Partitioning from within the Edit mode of your basic InfoCube
This launches a pop-up selection screen that you use to define your partitions (Figure 6). You select from the time characteristics that you have in your cube; you can only partition based on a time characteristic (in practice, 0CALMONTH or 0FISCPER). Based on the characteristic you pick, you define the range (starting and ending) and the maximum number of partitions.

Figure 6
Select time characteristic and define range of named partitions
Let’s say that you select 0CALMONTH as your delimiter for this partition, as in my example. Let’s also say that this InfoCube holds older sales data, from 1998 through 2002. Monthly partition ranges could be 01/1998 to 12/2002. This would give you one partition per month. You would actually create 62 little fact tables: one for each month (60), plus one for anything that falls before 01/1998 and one for anything that may fall after 12/2002. It’s as easy as that. However, here are a couple of “gotchas.”
First, you have to understand how users will query against this data. Remember, you can only partition the fact table based on time characteristics so this solution may not help response time for all users or situations. An example of the type of query that is helped by this is, “What were my profits for this quarter versus this same quarter last year?” An example of a query that is not helped by this is, “Who are the early adopters of product X across all regions?” Queries that are not time-based have to cross all the partitions, whereas queries that are time-based only need to hit those partitions required to answer the question.
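The pruning behavior described above can be sketched as follows. This is a conceptual Python illustration of partition pruning, not database code; the partition contents are hypothetical.

```python
# Fact table partitioned by 0CALMONTH: a time-restricted query scans only
# the named partitions; a query without a time restriction scans them all.
partitions = {  # calmonth -> fact rows
    "2002/01": [{"customer": "C-0001", "profit": 10.0}],
    "2002/02": [{"customer": "C-0001", "profit": 12.0}],
    "2002/03": [{"customer": "C-0002", "profit": 9.0}],
}

def query(customer, calmonths=None):
    """Return (profit, rows scanned) for a customer, pruning by month if given."""
    keys = calmonths if calmonths is not None else list(partitions)
    scanned = [row for k in keys for row in partitions[k]]
    profit = sum(r["profit"] for r in scanned if r["customer"] == customer)
    return profit, len(scanned)

profit, rows_scanned = query("C-0001", calmonths=["2002/01"])  # pruned scan
profit_all, rows_all = query("C-0001")                         # full scan
```

The time-restricted query reads a single partition, while the unrestricted one must cross all of them, which mirrors the two example queries above.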
Next, only certain databases support this function (Figure 7). Most installations run on supported databases, but if your database is not supported, then you are not able to use this function.

Figure 7
Databases that currently do and don’t support fact table partitioning
The final, and perhaps biggest, gotcha with fact table partitioning is that you can only do this (at present) when your InfoCube is empty. This means that either you have to determine this before you load any data into the InfoCube (which works fine for those just starting) or you have to empty the InfoCube, set your partitions, and reload (which could take a very long time). This can be a difficult choice for those who are currently dealing with performance issues and want to use this functionality. Optimizations are coming to this function in the new version of BW, however.
5. Work with aggregates. Aggregates are another good way to increase the performance of your reporting and analytics. Think of aggregates as mini-InfoCubes or data marts. They are defined at the InfoCube level and hold presummarized data. You can create them manually or let the system propose them for you (based on how users are actually using the data). A query first checks whether an aggregate is available to satisfy the request. If the system does not find one, it reads the underlying InfoCube and performs the calculations.
The best aggregates are created after users have executed and navigated through queries and created ad hoc queries. That is because the system can then understand how data is used and can best propose optimizations. You can set aggregates up based on particular queries or for cubes in general, depending on your needs and the needs of your users.
Note
The tools and techniques I’ve described will continue to expand and mature in future releases of both BW and mySAP CRM. New features on the immediate horizon include process chain integration, writing into flat files and InfoCubes, write-backs to SAP operational systems, variables, enhanced transformation with a formula library, and prognosis capabilities.
If you do use aggregates, you will find that while they improve query performance, they take up considerable space. Additionally, your aggregates have to be updated with new data, just like loading your cubes, so your overall data loading process may take longer. To set this up you need a combination of BW data modeling skills and an understanding of data querying and analysis.
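The aggregate lookup behavior can be sketched as follows. This is a conceptual Python illustration of how a presummarized aggregate answers matching queries while others fall back to the base cube; the structures and data are hypothetical.

```python
# Base InfoCube detail plus one aggregate presummarized by customer.
base_cube = [
    {"customer": "C-0001", "product": "P-1", "sales": 40.0},
    {"customer": "C-0001", "product": "P-2", "sales": 60.0},
    {"customer": "C-0002", "product": "P-1", "sales": 75.0},
]

# The aggregate is rolled up at load time, which is why loading takes longer
aggregate_by_customer = {}
for row in base_cube:
    aggregate_by_customer[row["customer"]] = (
        aggregate_by_customer.get(row["customer"], 0.0) + row["sales"])

def sales_query(group_by):
    """Answer from the aggregate when it fits; otherwise read the base cube."""
    if group_by == "customer":
        return dict(aggregate_by_customer)   # small, presummarized read
    result = {}
    for row in base_cube:                    # fall back to the full cube
        result[row[group_by]] = result.get(row[group_by], 0.0) + row["sales"]
    return result
```

A query by customer is satisfied from the small aggregate; a query by product cannot use it and must read the base cube, which is why aggregates should mirror how users actually navigate.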
Analysis Process Designer and Data Mining
In a subsequent article, I’ll explain how to use the Analysis Process Designer (APD) and data mining to improve your CRM analytics. The APD, which is broadly available starting with BW 3.5, allows you — on the fly — to create new relationships between existing data within BW.
Data mining provides the ability to use the system to drive out information beyond a standard report or query. This allows CRM analysts to derive information that they can respond to proactively.
Penny Silvia
Penny Silvia is managing director, SAP Analytics, for Deloitte Consulting. She is part of the cross-sector executive leadership team for the SAP service line at Deloitte Consulting, responsible for developing service offerings and go-to-market strategies for SAP Analytics and SAP HANA-based solutions including SAP S/4HANA and SAP BW/4HANA. This includes both internally and externally focused activities, sales enablement, solution development, client strategies, and thought leadership.
You may contact the author at psilvia@deloitte.com.