As BW transformations usually involve data packages of 50,000 data records or more, it is not advisable to perform individual SELECT statements within the LOOP in ABAP routines as this would result in database selection for every single data record in the data package. Even if you have a rapid in-memory database such as SAP HANA, this does not automatically mean that millions of data records can be read in an instant. This is in because of the overhead for each instance of access. It is better to save the relevant data with one or a few SELECT statements in internal tables before the transformation LOOP, and then read out these tables using the READ statement. Hashed tables are best used here.
Whether you then perform multiple SELECT statements per data package (1)…
SELECT
key_field1
key_field2
key_field3
group_field
lookup_field
FROM dbTab INTO TABLE itab_group1
FOR ALL ENTRIES IN result_package
WHERE key_field1 = result_package-key_field1
AND key_field2 = result_package-key_field2
AND key_field3 = result_package-key_field3
AND group_field = 'GR1'.
“If necessary, implement sorting / aggregation / reduction of fields on itab_group1
SELECT
key_field1
key_field2
key_field3
group_field
lookup_field
FROM dbTab INTO TABLE itab_group2
FOR ALL ENTRIES IN result_package
WHERE key_field1 = result_package-key_field1
AND key_field2 = result_package-key_field2
AND key_field3 = result_package-key_field3
AND group_field = 'GR2'.
“If necessary, implement sorting / aggregation / reduction of fields from itab-group1
… or use a SELECT/ENDSELECT loop (2) to populate multiple internal tables is not of significance in terms of speed.
SELECT
key_field1
key_field2
key_field3
group_field
lookup_field
FROM dbTab INTO ls_all
FOR ALL ENTRIES IN result_package
WHERE key_field1 = result_package-key_field1
AND key_field2 = result_package-key_field2
AND key_field3 = result_package-key_field3
AND ( group_field = 'GR1'
OR group_field = 'GR2')
AND NOT lookup_date = '00000000'.
CASE ls_rest_plus-group_field.
WHEN 'GR1'.
IF itab_group1 IS INITIAL.
"From this point, the field with the group is no longer important => we therefore only transfer the remaining lines
MOVE-CORRESPONDING ls_all TO ls_group.
INSERT ls_group INTO TABLE itab_group1.
ELSE.
READ TABLE itab_group1
WITH TABLE KEY key_field1 = ls_group-key_field1
key_field2 = ls_group-key_field2
key_field3 = ls_group-key_field3
ASSIGNING <fs_group>.
IF sy-subrc = 0.
"If necessary, implement update of line from itab_group1
ELSE.
MOVE-CORRESPONDING ls_all TO ls_group.
INSERT ls_group INTO TABLE itab_group1.
ENDIF.
ENDIF.
WHEN OTHERS.
"As above, but from itab_group2
ENDCASE.
CLEAR ls_all.
CLEAR ls_group.
UNASSIGN <fs_group>.
ENDSELECT.
For a DTP with a package size of 200,000 and nearly 16 million data records to load, (1) has taken us 43m 39s and (2) 42m 25s. We have read from a table with nearly 190 million data records and have had to create a total of 5 groups of data in internal tables, which then had to be searched in the transformation LOOP for the lookup. And yes, these loads were implemented on BW on HANA. J
To compare: were we to read from the database for the 5 lookups within the LOOP, we would need c. 5 hours for a load with the same amount of data.
The comparative speed of methods (1) and (2) is also explored in the theory that the commonly poor reputation of SELECT/ENDSELECT loops is not justified. While these loops do have certain disadvantages, they are in performance terms not necessarily slower than SELECT INTO statements. They are in fact to be preferred when they allow the development of simpler and clearer programming logic. The disadvantages of SELECT/ENDSELECT loops are less relevant for in-memory databases.
If you are also careful only to select those data records and fields that you later require, you will generally ensure high-performance transformations. The implementation work is definitely worthwhile!