Goldengate 12.2 Heartbeat Table

http://blog.perftuning.com/oracle-goldengate-12-2-heartbeat-table-issue-resolved/

Oracle GoldenGate 12.2 Heartbeat Table Issue Resolved

Monitoring lag in GoldenGate has always been an important part of monitoring GoldenGate. Lags are reported in several ways. When using the ggsci command lag only the latest lag is reported with a 1 second resolution providing the last reported lag. This isn’t accurate and does not provide a history for the Oracle GoldenGate 12.2 Heartbeat. In the past, GoldenGate implementers have created heartbeat tables manually.

In Oracle GoldenGate 12.2 a built-in heartbeat table feature has been added. This heartbeat table allows for more accurate heartbeats and heartbeat history. It works by creating an artificial transaction every minute that contains timing information that is used for heartbeats. The heartbeat tables are accessed via views that provide accurate lag data and lag history.

Recently I setup a pair of Oracle databases with Oracle GoldenGate 12.2. I setup these systems in hope of testing the new Heartbeattable feature that was introduced in GoldenGate version 12.2. The internal heartbeat mechanism is a great improvement in that it provides automatic and accurate lag times and includes a lag history.

I implemented GoldenGate 12.2 between two Oracle 11.2.0.4 databases using standard parameter files that I typically use as a starting point for GoldenGate projects or testing. In addition, database connectivity was set up using the GoldenGate credentialstore. In this case, for testing I have set up GoldenGate to replicate the HR example tables. Unfortunately the heartbeat mechanism failed to work. This blog entry describes the issue that I had and potential solutions.

Configuring GoldenGate

In order to configure replication I used the following GLOBALS and extract parameter files:

GLOBALS

GGSCHEMA ggadmin

EXT1HR

———————————-

— Local extract for HR schema

————————————
Extract EXT1HR
SETENV (NLS_LANG = AMERICAN_AMERICA.AL32UTF8
USERIDALIAS ggadm
ReportCount Every 30 Minutes, Rate
Report at 01:00
ReportRollover at 01:15
DiscardFile dirrpt/EXT1HR.dsc, Append
DiscardRollover at 02:00 On Sunday
DDL INCLUDE MAPPED
TRANLOGOPTIONS EXCLUDEUSER ggadmin
GETTRUNCATES
ExtTrail dirdat/la
Table HR.*;

At first look I thought that this should work. The GLOBALS parameter GGSCHEMA ggadmin is required for both DDL replication and the heartbeattable. Unfortunately this turned out to be part of the problem.

The pump parameter file was configured for passthrough as shown here:

PUMP1HR
————————————
— Pump extract for HR schema
————————————
Extract PUMP1HR
PASSTHRU
USERIDALIAS ggadm
ReportCount Every 1000 Records, Rate
Report at 01:00
ReportRollover at 01:15
DiscardFile dirrpt/PUMP1HR.dsc, Append
DiscardRollover at 02:00 ON SUNDAY
RmtHost target, MgrPort 7809
RmtTrail dirdat/ra
Table HR.*;

The initial extract and pump were registered using an obey file containing the following commands:

dblogin useridalias ggadm
add extract EXT1HR, tranlog, begin now
add exttrail dirdat/la, extract EXT1HR, megabytes 100
add extract PUMP1HR, exttrailsource dirdat/la
add rmttrail dirdat/ra, extract PUMP1HR, megabytes 100

On the target GoldenGate 12.2 was setup using the following GLOBALS and replciat parameter files:

GLOBALS
GGSCHEMA ggadmin
REP1HR

————————————
— Replicat for HR schema
————————————
replicat REP1HR
SETENV (NLS_LANG = AMERICAN_AMERICA.AL32UTF8)
USERIDALIAS ggadm
BATCHSQL
HandleCollisions
— Only one of these is used at a time
AssumeTargetDefs
ReportCount Every 30 Minutes, Rate
Report at 01:00
ReportRollover at 01:15

DiscardFile dirrpt/REP1HR.dsc, Append
DiscardRollover at 02:00 ON SUNDAY

Map HR.COUNTRIES, Target HR.COUNTRIES ;
Map HR.DEPARTMENTS, Target HR.DEPARTMENTS ;
Map HR.EMPLOYEES, Target HR.EMPLOYEES ;
Map HR.JOBS, Target HR.JOBS ;
Map HR.JOB_HISTORY, Target HR.JOB_HISTORY ;
Map HR.LOCATIONS, Target HR.LOCATIONS ;
Map HR.REGIONS, Target HR.REGIONS ;

This replicat was registered using an obey file that contained the following commands:

dblogin USERIDALIAS ggadm
add checkpointtable ggadmin.rep1hr_chkpt
add replicat REP1HR, exttrail dirdat/ra, checkpointtable ggadmin.rep1hr_chkpt

All of the GoldenGate processes were started and test transactions run. Once GoldenGate was verified to be running correctly it was time to set up the 12.2 heartbeat table and see if it worked.

Setting up the Built-In Heartbeat Table

The heartbeat table was setup first on the target system and then on the source and the results of that are shown here:

ADD HEARTBEATTABE on target

GGSCI > add heartbeattable

2016-02-18 10:00:13 INFO OGG-14001 Successfully created heartbeat seed table [“GG_HEARTBEAT_SEED”].
2016-02-18 10:00:13 INFO OGG-14032 Successfully added supplemental logging for heartbeat seed table [“GG_HEARTBEAT_SEED”].
2016-02-18 10:00:13 INFO OGG-14000 Successfully created heartbeat table [“GG_HEARTBEAT”].
2016-02-18 10:00:13 INFO OGG-14033 Successfully added supplemental logging for heartbeat table [“GG_HEARTBEAT”].
2016-02-18 10:00:13 INFO OGG-14016 Successfully created heartbeat history table [“GG_HEARTBEAT_HISTORY”].
2016-02-18 10:00:13 INFO OGG-14023 Successfully created heartbeat lag view [“GG_LAG”].
2016-02-18 10:00:13 INFO OGG-14024 Successfully created heartbeat lag history view [“GG_LAG_HISTORY”].
2016-02-18 10:00:13 INFO OGG-14003 Successfully populated heartbeat seed table with [ORCLS].
2016-02-18 10:00:13 INFO OGG-14004 Successfully created procedure [“GG_UPDATE_HB_TAB”] to update the heartbeat tables.
2016-02-18 10:00:13 INFO OGG-14017 Successfully created procedure [“GG_PURGE_HB_TAB”] to purge the heartbeat history table.
2016-02-18 10:00:13 INFO OGG-14005 Successfully created scheduler job [“GG_UPDATE_HEARTBEATS”] to update the heartbeat tables.
2016-02-18 10:00:13 INFO OGG-14018 Successfully created scheduler job [“GG_PURGE_HEARTBEATS”] to purge the heartbeat history table.

ADD HEARTBEATTABLE on source

GGSCI > add heartbeattable

2016-02-18 10:02:36 INFO OGG-14001 Successfully created heartbeat seed table [“GG_HEARTBEAT_SEED”].
2016-02-18 10:02:37 INFO OGG-14032 Successfully added supplemental logging for heartbeat seed table [“GG_HEARTBEAT_SEED”].
2016-02-18 10:02:37 INFO OGG-14000 Successfully created heartbeat table [“GG_HEARTBEAT”].
2016-02-18 10:02:37 INFO OGG-14033 Successfully added supplemental logging for heartbeat table [“GG_HEARTBEAT”].
2016-02-18 10:02:37 INFO OGG-14016 Successfully created heartbeat history table [“GG_HEARTBEAT_HISTORY”].
2016-02-18 10:02:37 INFO OGG-14023 Successfully created heartbeat lag view [“GG_LAG”].
2016-02-18 10:02:37 INFO OGG-14024 Successfully created heartbeat lag history view [“GG_LAG_HISTORY”].
2016-02-18 10:02:37 INFO OGG-14003 Successfully populated heartbeat seed table with [ORCLP].
2016-02-18 10:02:37 INFO OGG-14004 Successfully created procedure [“GG_UPDATE_HB_TAB”] to update the heartbeat tables.
2016-02-18 10:02:37 INFO OGG-14017 Successfully created procedure [“GG_PURGE_HB_TAB”] to purge the heartbeat history table.
2016-02-18 10:02:37 INFO OGG-14005 Successfully created scheduler job [“GG_UPDATE_HEARTBEATS”] to update the heartbeat tables.
2016-02-18 10:02:37 INFO OGG-14018 Successfully created scheduler job [“GG_PURGE_HEARTBEATS”] to purge the heartbeat history table.

Once the heartbeat table was created it should have been an easy matter to go to the target system and query the ggadmin.gg_heartbeat and ggadmin.gg_heartbeat_history tables to see the automated heartbeats. Unfortunately at this point there were no rows in these tables. It would take a little bit of investigation in order to determine what the issue was.

Debugging the Issue

Since there are a number of parts to the heartbeat table including the extract, pump and replicat I had to decide where to start. I knew that the heartbeat mechanism took advantage of the replication that was currently configured in order to move its heartbeat information from the source to the target. I also believed that it was using GoldenGate replication.

In order to see if anything was moving in the trail file I used the GoldenGate stats command against the extract. I only saw the transactions that I had run for my own testing of the replication. This led me to believe that it was a problem at the source side. I also ran logdump and looked at the trail files and I saw no “heartbeat” records in the trail.

I then looked in the database at the database schedule to see if the GG_UPDATE_HEARTBEATS job was running. It was and it was updating the GG_HEARTBEAT_SEED HEARTBEAT_TIMESTAMP column. So, the scheduler job was running and the column was being updated so there was nothing in the trail file, so it was most likely an issue with replication.

Looking back at the extract parameter file it became apparent that this might be related to the parameter TRANLOGOPTIONS EXCLUDEUSER ggadmin parameter. So, I commented out that parameter and suddenly the GG_HEARTBEAT and GG_HEARTBEAT_HISTORY table began populating on the target side. In addition, after a while a stats command against the extract showed updates to the GG_HEARTBEAT_SEED table.

*** Daily statistics since 2016-02-18 00:00:00 ***

Total inserts 0.00

Total updates 603.00

Total deletes 0.00

Total discards 0.00

Total operations 603.00

In addition the gg_lag_history view showed the data that I was looking for:

column heartbeat_received_ts format a30
column incoming_path format a40
column incoming_timestamp format 9.99999999

select heartbeat_received_ts, incoming_path, incoming_lag from gg_lag_history;

18-FEB-16 11.41.41.893670 AM EXT1HR ==> PUMP1HR ==> REP1HR 4.868161000

18-FEB-16 11.42.42.944752 AM EXT1HR ==> PUMP1HR ==> REP1HR 5.925239000

18-FEB-16 11.43.41.993427 AM EXT1HR ==> PUMP1HR ==> REP1HR 4.970112000

18-FEB-16 11.44.42.041501 AM EXT1HR ==> PUMP1HR ==> REP1HR 5.021219000

18-FEB-16 11.45.43.091402 AM EXT1HR ==> PUMP1HR ==> REP1HR 6.065255000

18-FEB-16 11.46.42.140396 AM EXT1HR ==> PUMP1HR ==> REP1HR 5.121159000

18-FEB-16 11.47.41.189519 AM EXT1HR ==> PUMP1HR ==> REP1HR 4.136698000

18-FEB-16 11.48.43.240700 AM EXT1HR ==> PUMP1HR ==> REP1HR 6.215541000

18-FEB-16 11.49.43.289275 AM EXT1HR ==> PUMP1HR ==> REP1HR 6.268965000

18-FEB-16 11.50.42.338042 AM EXT1HR ==> PUMP1HR ==> REP1HR 5.312789000

18-FEB-16 11.51.42.386426 AM EXT1HR ==> PUMP1HR ==> REP1HR 5.368242000

18-FEB-16 11.52.42.435722 AM EXT1HR ==> PUMP1HR ==> REP1HR 5.412282000

18-FEB-16 11.53.43.487633 AM EXT1HR ==> PUMP1HR ==> REP1HR 6.470509000

18-FEB-16 11.54.42.538333 AM EXT1HR ==> PUMP1HR ==> REP1HR 5.523112000

18-FEB-16 11.55.42.588896 AM EXT1HR ==> PUMP1HR ==> REP1HR 5.563893000

This was now successful.

Problem Recap

Because I was setting up for bi-directional replication and using the ggadmin user for both source and target I had configured the excludeuser parameter. This keeps GoldenGate from re-replicating replicated transactions by ignoring transactions submitted by the ggadmin user. This was fine for normal transactions but I didn’t expect that the heartbeat transactions would be excluded as well.

The excludeuser ggadmin caused updates to the GG_HEARTBEAT_SEED table to not be replicated. You cannot exclude the ggadmin user in the extract. The heartbeat schema is defined by the GGSCHEMA parameter in the GLOBALS file. In addition, the GGSCHEMA parameter defines the schema for DDL replication. This causes a bit of a problem for bi-directional replication when you want to use the ggadmin user for both extract and replicat.

Solutions

I thought about a number of different solutions to this problem and consulted some of my colleagues. We decided that the best approach to this problem was to simply use a different Oracle database user for extract and replicat. This would allow us to still maintain the same GGSCHEMA user for heartbeat and DDL replication. The new user account used for the replicat would be excluded in the TRANLOGOPTIONS EXCLUDEUSER parameter and everything should work well.

This would allow the ggadmin user at the source to submit both DDL and heartbeats, since you can only have one setting for GGSCHEMA which both must use. The different user that is used for replicating back to the source will be excluded via TRANLOGOPTIONS EXCLUDEUSER .

I really would like to see the GoldenGate developers take a look at this and internally allow replication of the GoldenGate lag tables to be excluded from the excludeuser option.

Managing Heartbeat Data

As seen above the heartbeat table is created via the ADD HEARTBEATTABLE command within ggsci. By default a heartbeat is generated every minute, retained for 30 days then purged. The frequency of the heartbeat, the history retention and how often the purge process runs is configurable. This is done via the ALTER HEARTBEATTABLE command.

Viewing Heartbeat Data

Viewing the heartbeat table is done via the two heartbeat views; GG_LAG and GG_LAG history. These views provide information on lags for each set of ext -> pump -> replicat that is configured. This information as well as the history is valuable for monitoring the performance of the GoldenGate configuration.

I have implemented viewing these tables via the following scripts. The scripts and output are shown here:

lag.sql

col local_database format a10
col current_local_ts format a30
col remote_database format a10
col incoming_path format a30
col incoming_lag format 999,999.999999

select local_database, current_local_ts, remote_database, incoming_path, incoming_lag from gg_lag;

Output of lag.sql

TARGET on GG16B:ggadmin > @lag

LOCAL_DATA CURRENT_LOCAL_TS REMOTE_DAT INCOMING_PATH INCOMING_LAG

———- —————————— ———- ——————————

ORCLS 19-FEB-16 08.23.40.715171 PM ORCLP EXT1HR ==> PUMP1HR ==> REP1HR 5.848900

Lag_history.sql
set pagesize 100
col local_database format a10
col heartbeat_received_ts format a30
col remote_database format a10
col incoming_path format a32
col incoming_lag format 999,999.999999

select local_database, heartbeat_received_ts, remote_database, incoming_path, incoming_lag from gg_lag_history;

Output of lag_history.sql

ORCLS 19-FEB-16 08.30.40.678817 PM ORCLP EXT1HR ==> PUMP1HR ==> REP1HR 3.656291

ORCLS 19-FEB-16 08.31.41.702019 PM ORCLP EXT1HR ==> PUMP1HR ==> REP1HR 4.681604

ORCLS 19-FEB-16 08.32.41.724873 PM ORCLP EXT1HR ==> PUMP1HR ==> REP1HR 4.700153

ORCLS 19-FEB-16 08.33.41.747616 PM ORCLP EXT1HR ==> PUMP1HR ==> REP1HR 4.729306

ORCLS 19-FEB-16 08.34.41.770055 PM ORCLP EXT1HR ==> PUMP1HR ==> REP1HR 4.748563

ORCLS 19-FEB-16 08.35.41.793370 PM ORCLP EXT1HR ==> PUMP1HR ==> REP1HR 4.769183

ORCLS 19-FEB-16 08.36.41.816100 PM ORCLP EXT1HR ==> PUMP1HR ==> REP1HR 4.798405

ORCLS 19-FEB-16 08.37.41.839139 PM ORCLP EXT1HR ==> PUMP1HR ==> REP1HR 4.817937

The output of the lag history can be used to monitor lags over long periods of time and be used for alerting and monitoring. The lag history can be imported into a spreadsheet and graphed as well.

GGLagGraph

Sobre Alexandre Pires

ORACLE OCS Goldengate Specialist, OCE RAC 10g R2, OCP 12C, 11g, 10g , 9i e 8i - Mais de 25 anos de experiência na área de TI. Participei de projetos na G&P alocado na TOK STOK, EDINFOR alocado na TV CIDADE "NET", 3CON Alocado no PÃO DE AÇUCAR, DISCOVER alocado na VIVO, BANCO IBI e TIVIT, SPC BRASIL, UOLDIVEO alocado no CARREFOUR e atualmente na ORACLE ACS atendendo os seguintes projetos: VIVO, CLARO, TIM, CIELO, CAIXA SEGUROS, MAPFRE, PORTO SEGURO, SULAMERICA, BRADESCO SEGUROS, BANCO BRADESCO, BASA, SANTANDER, CNJ, TSE, ELETROPAULO, EDP, SKY, NATURA, ODEBRESHT, NISSEI, SICREDI, CELEPAR, TAM, TIVIT, IBM, SMILES, CELEPAR, SERPRO,OKI,BANCO PAN, etc
Nota | Esse post foi publicado em GOLDENGATE e marcado , , , , , , , , , . Guardar link permanente.

Deixe uma resposta

Preencha os seus dados abaixo ou clique em um ícone para log in:

Logotipo do WordPress.com

Você está comentando utilizando sua conta WordPress.com. Sair / Alterar )

Imagem do Twitter

Você está comentando utilizando sua conta Twitter. Sair / Alterar )

Foto do Facebook

Você está comentando utilizando sua conta Facebook. Sair / Alterar )

Foto do Google+

Você está comentando utilizando sua conta Google+. Sair / Alterar )

Conectando a %s