Updated: Jul 4, 2022
Incremental (delta) load has always been a challenging task in ETL between SAP and Azure Data Factory. To address this common problem, a new connector has been introduced that streamlines access to SAP data from within Azure services such as Synapse Analytics.
The new connector uses SAP Operational Data Provisioning (ODP). SAP offers a vast list of ODP data sources from SAP ECC applications.
In the current setup, there are several SAP connectors: SAP BW via MDX, SAP BW Open Hub, SAP Table, SAP ECC and SAP HANA. These are all helpful for batch processing, where each request carries a full data volume that is always treated as new data and is not compared with the data from earlier requests. Incremental loads, however, often need to check the data already available and load only the new records, which reduces the overhead on resources and the need for custom manual processes.
To overcome this problem, the SAP CDC connector has been introduced, which connects to SAP tables, SAP BW InfoProviders and S/4HANA. The connector runs on the SAP system and converts the data into data packages in Operational Delta Queues (ODQ) that Azure Data Factory can consume. An Azure copy activity can then use the SAP CDC connector to extract the SAP data and drop it into Azure Data Lake (ADLS Gen2) or Azure Blob storage.
This in turn reduces data load times and improves data availability in line with business SLAs.
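To make the delta queue idea more concrete, here is a small conceptual sketch in Python. It is only a toy model and not SAP's ODQ implementation or any ADF API: the class ToyDeltaQueue, its methods and the subscriber name "adf_pipeline" are all hypothetical names made up for illustration. The point it shows is that the queue remembers how far each subscriber has read, so every fetch returns only what is new for that subscriber.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDeltaQueue:
    """Conceptual stand-in for a delta queue (hypothetical, not SAP's ODQ)."""
    packages: list = field(default_factory=list)   # change records in arrival order
    positions: dict = field(default_factory=dict)  # subscriber name -> next unread index

    def publish(self, record: dict) -> None:
        # The source system appends every new or changed record.
        self.packages.append(record)

    def fetch_delta(self, subscriber: str) -> list:
        # Return only what this subscriber has not read yet, then advance its pointer.
        start = self.positions.get(subscriber, 0)
        delta = self.packages[start:]
        self.positions[subscriber] = len(self.packages)
        return delta


queue = ToyDeltaQueue()
queue.publish({"order": 1, "status": "NEW"})
queue.publish({"order": 2, "status": "NEW"})

first_run = queue.fetch_delta("adf_pipeline")   # first fetch: everything so far
queue.publish({"order": 1, "status": "SHIPPED"})
second_run = queue.fetch_delta("adf_pipeline")  # later fetch: only the new change
```

In this sketch the consumer never has to compare old and new data itself; the queue keeps that state on the consumer's behalf.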
The new SAP CDC connector can be used on:
SAP ECC Table
S/4HANA Table
ABAP CDS View
BW InfoProvider
The traditional approach of identifying a timestamp column in a dataset and writing custom logic to extract delta records is no longer required, because the ODQ delivers only the newly added or changed records to ADF.
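For comparison, the traditional timestamp approach usually looks something like the sketch below. This is a minimal illustration in Python, assuming an in-memory list of rows with a hypothetical changed_at column; the function name extract_delta and the sample data are invented for the example and do not come from any SAP or ADF library. It shows the custom logic (filtering by a stored watermark and then moving the watermark forward) that every pipeline had to maintain on its own.

```python
from datetime import datetime


def extract_delta(rows, watermark):
    """Return the rows changed after the watermark, plus the new watermark."""
    changed = [r for r in rows if r["changed_at"] > watermark]
    # Keep the old watermark when nothing changed, otherwise move it forward.
    new_watermark = max((r["changed_at"] for r in changed), default=watermark)
    return changed, new_watermark


# Hypothetical source rows with a timestamp column.
rows = [
    {"id": 1, "changed_at": datetime(2022, 7, 1, 9, 0)},
    {"id": 2, "changed_at": datetime(2022, 7, 2, 9, 0)},
    {"id": 3, "changed_at": datetime(2022, 7, 3, 9, 0)},
]

# Watermark stored by the previous run (assumed value).
last_watermark = datetime(2022, 7, 1, 12, 0)
delta, last_watermark = extract_delta(rows, last_watermark)
```

This only works when the source has a reliable change timestamp, and the watermark has to be stored somewhere between runs, which is exactly the bookkeeping the delta queue takes over.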
Below are ready-made pipeline templates that can be used.