Transcription

IBM Information Management SoftwareFront coverSmarter BusinessDynamic Information with IBMInfoSphere Data Replication CDCLog-based for real-time high volumereplication and scalabilityHigh throughput replication withintegrity and consistencyProgramming-freedata integrationAlec BeatonAnzar NoorJudy ParkesBill Shubinibm.com/redbooksChuck BallardMark KetchieFrank KetelaarsDeepak RangaraoWim Van Tichelen

International Technical Support OrganizationSmarter Business: Dynamic Information with IBMInfoSphere Data Replication CDCMarch 2012SG24-7941-00

Note: Before using this information and the product it supports, read the information in“Notices” on page ix.First Edition (March 2012)This edition applies to Version 6.5 of IBM InfoSphere Change Data Capture (product number5724-U70). Copyright International Business Machines Corporation 2012. All rights reserved.Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADPSchedule Contract with IBM Corp.

ContentsNotices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ixTrademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiThe team who wrote this book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiiNow you can become a published author, too! . . . . . . . . . . . . . . . . . . . . . . . . xviComments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xviiStay connected to IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xviiChapter 1. Introduction and overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Optimized data integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.2 InfoSphere architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4Chapter 2. InfoSphere CDC: Empowering information management. . . . . 92.1 The need for dynamic data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102.2 Data delivery methods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112.3 Providing dynamic data with InfoSphere CDC . . . . . . . . . . . . . . . . . . . . . 122.3.1 InfoSphere CDC architectural overview . . . . . . . . . . . . . . . . . . . . . . 142.3.2 Reliability and integrity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16Chapter 3. Business use cases for InfoSphere CDC . . . . . . . . . . . . . . . . . 193.1 InfoSphere CDC techniques for transporting changed data . . . . . . . . . . . 203.1.1 Option 1: Database staging. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213.1.2 Option 2: Message queue (MQ) based integration . . . . . . . . . . . . . . 223.1.3 Option 3: File-based integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233.1.4 Option 4: InfoSphere DataStage Direct Connect . . . . . . . . . . . . . . . 243.2 Data warehousing and business intelligence . . . . . . . . . . . . . . . . . . . . . . 243.2.1 Active data warehousing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243.2.2 Slowly changing dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263.3 Consolidation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283.3.1 Consolidation: Sample implementation 1 . . . . . . . . . . . . . . . . . . . . . 293.3.2 Consolidation: Sample implementation 2 . . . . . . . . . . . . . . . . . . . . . 303.4 Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313.4.1 Distribution: Sample implementation 1 . . . . . . . . . . . . . . . . . . . . . . . 323.4.2 Distribution: Sample implementation 2 . . . . . . . . . . . . . . . . . . . . . . . 333.5 Database migration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353.5.1 Database migration: Sample implementation . . . . . . . . . . . . . . . . . . 353.6 Application integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373.6.1 Application integration: Sample implementation 1 . . . . . . . . . . . . . . 37 Copyright IBM Corp. 2012. All rights reserved.iii

3.6.2 Application integration: Sample implementation 2 . . . . . . . . . . . . . . 383.6.3 Application integration: Sample implementation 3 . . . . . . . . . . . . . . 393.7 Integration with master data management . . . . . . . . . . . . . . . . . . . . . . . . 403.7.1 Integration with master data management: Sample implementation 413.8 Integration with IBM Information Server . . . . . . . . . . . . . . . . . . . . . . . . . . 433.8.1 Integration with IBM Information Server: Sample implementation . . 433.9 Operational business intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453.9.1 Operational business intelligence: Sample implementation . . . . . . . 45Chapter 4. Solution topologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474.1 Unidirectional replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484.2 Cascading replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494.3 Bidirectional replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 524.4 Consolidation replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 544.5 Data distribution replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574.6 Hub-and-Spoke replication with propagation . . . . . . . . . . . . . . . . . . . . . . 584.7 Destination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624.7.1 JMS Message Queue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 634.7.2 Flat files. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 644.7.3 DataStage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674.7.4 Web services. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71Chapter 5. InfoSphere CDC features and functionality . . . . . . . . . . . . . . . 755.1 Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 775.1.1 Column functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 785.1.2 Journal control fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 805.1.3 Joining. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 805.1.4 User exits for customizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 825.1.5 Considerations for using transformational functionality. . . . . . . . . . . 835.2 Replication modes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 835.2.1 Refresh . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 835.2.2 Continuous mirroring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 875.2.3 Scheduled end (net change) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 885.3 Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 895.3.1 Row level. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 895.3.2 Column level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 905.4 Apply methods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 915.4.1 Standard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 915.4.2 LiveAudit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 925.4.3 Adaptive Apply . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 935.4.4 Summarization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 935.4.5 Row consolidation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 945.4.6 Soft deletes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95ivSmarter Business: Dynamic Information with IBM InfoSphere Data Replication CDC

5.4.7 Custom apply methods (user exits). . . . . . . . . . . . . . . . . . . . . . . . . . 965.4.8 Flat files. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 965.4.9 DataStage direct connect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 975.4.10 JMS message queues. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 985.5 Conflict detection and resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99Chapter 6. Understanding the architecture. . . . . . . . . . . . . . . . . . . . . . . . 1036.1 Component overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1046.1.1 InfoSphere CDC instances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1076.1.2 Interoperability between the InfoSphere CDC components . . . . . . 1106.2 Management Console fundamentals. . . . . . . . . . . . . . . . . . . . . . . . . . . . 1116.2.1 Access Manager Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1126.2.2 Configuration Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1146.2.3 Monitoring Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1176.2.4 InfoSphere CDC API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1196.2.5 Access Server fundamentals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1206.3 The InfoSphere CDC engine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1206.3.1 Bookmarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1206.3.2 The InfoSphere CDC Linux, UNIX, and Windows engine . . . . . . . . 1226.3.3 The InfoSphere CDC for System i engine. . . . . . . . . . . . . . . . . . . . 1316.3.4 The InfoSphere CDC for z/OS engine. . . . . . . . . . . . . . . . . . . . . . . 1326.4 Communications between source and target . . . . . . . . . . . . . . . . . . . . . 1346.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135Chapter 7. Environmental considerations . . . . . . . . . . . . . . . . . . . . . . . . 1377.1 Globalization with InfoSphere CDC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1387.1.1 Time zone considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1387.1.2 Encoding conversions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1477.2 Firewall configurations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1497.2.1 How InfoSphere CDC uses TCP/IP . . . . . . . . . . . . . . . . . . . . . . . . 1497.2.2 Firewalls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1527.2.3 InfoSphere CDC in a firewalled network environment . . . . . . . . . . 1537.2.4 Configuring source port restrictions . . . . . . . . . . . . . . . . . . . . . . . . 1547.2.5 Troubleshooting CDC connection issues . . . . . . . . . . . . . . . . . . . . 1567.3 Log retention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1587.3.1 Log retention general guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . 1587.3.2 Log retention platform-specific guidelines . . . . . . . . . . . . . . . . . . . . 1597.4 Remote processing capabilities. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1627.4.1 Remote source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1637.4.2 Remote target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1647.4.3 Remote source and target. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1657.4.4 Log shipping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1667.5 Using InfoSphere CDC in resilient environments . . . . . . . . . . . . . . . . . . 168Contentsv

7.5.1 InfoSphere CDC reachability: Virtual IP . . . . . . . . . . . . . . . . . . . . . 1697.5.2 InfoSphere CDC binary files and metadata for the Linux, UNIX, andWindows engine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1707.5.3 InfoSphere CDC on a shared volume . . . . . . . . . . . . . . . . . . . . . . . 1717.5.4 InfoSphere CDC on separate nodes with a shared database . . . . . 1737.5.5 InfoSphere CDC on separate servers with separate databases . . . 1757.5.6 System i resilient environments . . . . . . . . . . . . . . . . . . . . . . . . . . . 1787.5.7 z/OS / Sysplex and InfoSphere CDC in resilient environments . . . 1867.6 Change management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1907.6.1 Understanding InfoSphere CDC bookmarks . . . . . . . . . . . . . . . . . . 1917.6.2 Change Management sample environment . . . . . . . . . . . . . . . . . . 1937.6.3 DDL changes in a service window . . . . . . . . . . . . . . . . . . . . . . . . . 1937.6.4 DDL changes without a service window . . . . . . . . . . . . . . . . . . . . . 207Chapter 8. Performance analysis and design considerations . . . . . . . . 2118.1 High volume between two systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2128.1.1 Latency and throughput . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2128.1.2 InfoSphere CDC architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2138.2 Identification of potential bottlenecks . . . . . . . . . . . . . . . . . . . . . . . . . . . 2148.3 Performance monitoring in InfoSphere CDC environments . . . . . . . . . . 2158.3.1 Performance monitoring using the Management Console . . . . . . . 2158.3.2 System monitoring tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2158.4 Using workflow for performance issues. . . . . . . . . . . . . . . . . . . . . . . . . . 2168.5 Installation considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2178.5.1 Silent installations and instance creation . . . . . . . . . . . . . . . . . . . . 2178.6 Design considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2218.6.1 Using multiple parallel subscriptions . . . . . . . . . . . . . . . . . . . . . . . . 2218.6.2 Using multiple InfoSphere CDC instances . . . . . . . . . . . . . . . . . . . 2238.6.3 Using an n-tiered architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2248.6.4 Using cascading replication to spread the workload . . . . . . . . . . . . 2268.6.5 Continuous scraping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227Chapter 9. Customization and automation . . . . . . . . . . . . . . . . . . . . . . . . 2319.1 Options for managing InfoSphere CDC. . . . . . . . . . . . . . . . . . . . . . . . . . 2329.2 Management Console GUI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2329.3 Management Console commands. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2339.3.1 Common uses for the Management Console commands . . . . . . . . 2349.3.2 Compiling Management Console command scripts . . . . . . . . . . . . 2349.4 InfoSphere CDC engine commands (CLI). . . . . . . . . . . . . . . . . . . . . . . . 2379.4.1 Running commands for the Linux, UNIX, and Windows engine . . . 2379.4.2 Running CL commands for System i. . . . . . . . . . . . . . . . . . . . . . . . 2389.4.3 Running console commands for IBM System z . . . . . . . . . . . . . . . 2389.4.4 Sample scripts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239viSmarter Business: Dynamic Information with IBM InfoSphere Data Replication CDC

9.4.5 Checking an InfoSphere CDC engine and subscriptions activity . . 2469.4.6 Removing obsolete database logs . . . . . . . . . . . . . . . . . . . . . . . . . 2579.5 InfoSphere CDC API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2629.5.1 Development environment setup . . . . . . . . . . . . . . . . . . . . . . . . . . 2639.5.2 Contents of the api.jar file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2639.5.3 Connecting to and managing the Access Server . . . . . . . . . . . . . . 2709.5.4 Connecting to the data stores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2779.5.5 Configuring InfoSphere CDC replication . . . . . . . . . . . . . . . . . . . . . 2839.5.6 Creating a subscription . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2849.5.7 Procedure for mapping tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2879.5.8 Table mapping example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2899.5.9 Procedure for removing mapped tables . . . . . . . . . . . . . . . . . . . . . 3039.5.10 Table mapping removal example . . . . . . . . . . . . . . . . . . . . . . . . . 3049.5.11 Row and column filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3099.5.12 Derived columns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3139.5.13 Encoding conversions (before and after Version 6.5) . . . . . . . . . . 3159.5.14 Operations and user exits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3229.5.15 Common procedures (updating table definitions) . . . . . . . . . . . . . 3259.5.16 Deploying subscription changes and considerations . . . . . . . . . . 3389.5.17 Starting, stopping, and monitoring subscriptions . . . . . . . . . . . . . 3399.5.18 Monitoring latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3469.5.19 Monitoring event logs using the API . . . . . . . . . . . . . . . . . . . . . . . 3599.6 Monitoring and integration with external monitoring solutions. . . . . . . . . 3659.6.1 Components to monitor. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3669.6.2 InfoSphere CDC instance activity . . . . . . . . . . . . . . . . . . . . . . . . . . 3669.6.3 Subscription activity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3679.6.4 Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3679.6.5 Latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3699.7 User exits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3699.7.1 Common uses for user exits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3709.7.2 User exit programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3719.7.3 Derived expression user exits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3739.7.4 Table and row-level user exits . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3839.7.5 Subscription-level (unit of work) . . . . . . . . . . . . . . . . . . . . . . . . . . . 4029.7.6 Java user exit for flat file custom formatter . . . . . . . . . . . . . . . . . . . 4129.7.7 Notifications. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425Appendix A. Single scrape events and errors . . . . . . . . . . . . . . . . . . . . . 431Single scrape error events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433Appendix B. Additional material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 435Locating the web material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 435Using the web material. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436Contentsvii

How to use the web material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437Abbreviations and acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445Related publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449Online resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449Help from IBM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451viiiSmarter Business: Dynamic Information with IBM InfoSphere Data Replication CDC

NoticesThis information was developed for products and services offered in the U.S.A.IBM may not offer the products, services, or features discussed in this document in other countries. Consultyour local IBM representative for information on the products and services currently available in your area.Any reference to an IBM product, program, or service is not intended to state or imply that only that IBMproduct, program, or service may be used. Any functionally equivalent product, program, or service thatdoes not infringe any IBM intellectual property right may be used instead. However, it is the user'sresponsibility to evaluate and verify the operation of any non-IBM product, program, or service.IBM may have patents or pending patent applications covering subject matter described in this document.The furnishing of this document does not give you any license to these patents. You can send licenseinquiries, in writing, to:IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.The following paragraph does not apply to the United Kingdom or any other country where suchprovisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATIONPROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS ORIMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimerof express or implied warranties in certain transactions, therefore, this statement may not apply to you.This information could include technical inaccuracies or typographical errors. Changes are periodically madeto the information herein; these changes will be incorporated in new editions of the publication. IBM maymake improvements and/or changes in the product(s) and/or the program(s) described in this publication atany time without notice.Any references in this information to non-IBM Web sites are provided for convenience only and do not in anymanner serve as an endorsement of those Web sites. The materials at those Web sites are not part of thematerials for this IBM product and use of those Web sites is at your own risk.IBM may use or distribute any of the information you supply in any way it believes appropriate withoutincurring any obligation to you.Information concerning non-IBM products was obtained from the suppliers of those products, their publishedannouncements or other publicly available sources. IBM has not tested those products and cannot confirmthe accuracy of performance, compatibility or any other claims related to non-IBM products. Questions onthe capabilities of non-IBM products should be addressed to the suppliers of those products.This information contains examples of data and reports used in daily business operations. To illustrate themas completely as possible, the examples include the names of individuals, companies, brands, and products.All of these names are fictitious and any similarity to the names and addresses used by an actual businessenterprise is entirely coincidental.COPYRIGHT LICENSE:This information contains sample application programs in source language, which illustrate programmingtechniques on various operating platforms. You may copy, modify, and distribute these sample programs inany form without payment to IBM, for the purposes of developing, using, marketing or distributing applicationprograms conforming to the application programming interface for the operating platform for which thesample programs are written. These examples have not been thoroughly tested under all conditions. IBM,therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. Copyright IBM Corp. 2012. All rights reserved.ix

TrademarksIBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International BusinessMachines Corporation in the United States, other countries, or both. These and other IBM trademarkedterms are marked on their first occurrence in this information with the appropriate symbol ( or ),indicating US registered or common law trademarks owned by IBM at the time this information waspublished. Such trademarks may also be registered or common law trademarks in other countries. A currentlist of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtmlThe following terms are trademarks of the International Business Machines Corporation in the United States,other countries, or both:AIX AS/400 Cognos DataMirror DataStage DB2 Universal Database DB2 Distributed Relational DatabaseArchitecture DRDA eServer IBM iCluster IMS Informix InfoSphere iSeries Language Environment LiveAudit MVS RACF Redbooks Redbooks (logo) Smarter Planet System i System p System z Tivoli Transformation Server WebSphere z/OS The following terms are trademarks of other companies:Netezza, and N logo are trademarks or registered trademarks of IBM International Group B.V., an IBMCompany.Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States,other countries, or both.Java, and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or itsaffiliates.UNIX is a registered trademark of The Open Group in the United States and other countries.Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, IntelSpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or itssubsidiaries in the United States and other countries.Linux is a trademark of Linus Torvalds in the United States, other countries, or both.Other company, product, or service names may be trademarks or service marks of others.xSmarter Business: Dynamic Information with IBM InfoSphere Data Replication CDC

PrefaceA new generation of smarter products, embedded with increasingly sophisticatedsoftware and instrumentation, able to connect to other devices, and respondintelligently to user needs, is transforming the way the world works. Theseproducts are the building blocks of smarter solutions in every industry, fromhospitals to manufacturing to the energy grid.IBM Smarter Planet software solutions are primarily event-driven. The triggerfor an action is not typically a direct user query or an administrative command. Itis an event triggered by an IT system, an automated business process, or aphysical sensor, such as the reading of an RFID tag or an anomaly detected inthe flow of electricity in a smart energy grid. Although event-driven systems canbe created based on software passing messages (using message-orientedmiddleware or an enterprise service bus), the simplest and most flexible way toachieve the same result is to monitor changes to data from existing applications.This technique, using IBM InfoSphere Change Data Capture (InfoSphere CDC)technology, requires no changes to the application generating the event. Thistechnique allows the consumer of events to use a single and unified changetracking interface.In today's demanding environment, businesses expect real-time access topersonalized information and instantaneous updates, setting a new level ofexpectations within organizations. To make more informed business decisions,better serve customers, and increase operational efficiencies, organizationsneed to be aware of changes to key data as they occur. These changes must beimmediately delivered to the people and processes that need to act upon them.This ability to sense and respond to data changes is fundamental to DynamicWarehousing, Master Data Management, and other key initiatives. How wouldyou tie all these independent systems together and process the immense dataflow requirements while using your existing IT infrastructure?In this IBM Redbooks publication, we provide examples of InfoSphere DataReplication change data capture technology being used to achieve a wide varietyof business purposes. We also demonstrate how to implement this technology tobest use your existing infrastructure and achieve the scalability you require. Copyright IBM Corp. 2012. All rights reserved.xi

In this book, we introduce you to InfoSphere Data Replication, which iscomposed of a number of IBM technologies. One of those technologies is IBMInfoSphere Change Data Capture (InfoSphere CDC). We provide an overview ofInfoSphere CDC and position it within the InfoSphere architecture.In addition, we describe the various InfoSphere CDC topologies that can be usedto develop

InfoSphere Data Replication CDC Chuck Ballard Alec Beaton Mark Ketchie Anzar Noor Frank Ketelaars Judy Parkes Deepak Rangarao Bill Shubin Wim Van Tichelen Log-based for real-time high volume replication and scalability High throughput replication with integrity and