Replication stops and is time consuming to restart

9 posts in Replication Last posting was on 2008-09-02 08:49:27.0Z
Tom Arleth Posted on 2008-07-31 09:57:41.0Z
From: "Tom Arleth" <ta@ascott.dk>
Newsgroups: Advantage.Replication
Subject: Replication stops and is time consuming to restart
Date: Thu, 31 Jul 2008 11:57:41 +0200
Lines: 140
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_001E_01C8F304.A2CC0EC0"
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2900.5512
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5512
NNTP-Posting-Host: 62.242.32.56
Message-ID: <48918c0e@solutions.advantagedatabase.com>
X-Trace: 31 Jul 2008 03:55:26 -0700, 62.242.32.56
Path: solutions.advantagedatabase.com!solutions.advantagedatabase.com!62.242.32.56
Xref: solutions.advantagedatabase.com Advantage.Replication:294
Article PK: 1134149

Hi,

We use replication at 4 remote sites in Greenland to maintain up-to-date central copies of the decentralized databases. The 4 sites have thin and apparently also unstable lines to Nuuk, where the central copies reside.
Every week or so one of the replications hangs with an error like:
Error 7200:  AQE Error:  State = HY000;   NativeError = 7057;  [Extended Systems][Advantage SQL][ASA] Error 7057:  The record update failed.  The key value produced from this record was not unique, and an index for the current table has the UNIQUE property.  The key value supplied for AESOPERATIONLOG:ID is not unique. 
EntryID=332394, Subscription=central_backup_sub, Table=AESOPERATIONLOG, Record=67302, SQL=INSERT INTO "AESOPERATIONLOG" ("ID", "DATETIME", "USER_NAME", "APP_NAME", "TABLE_NAME", "OPERATION", "KEY1", "KEY1_VALUE", "__CLIENTID", "ADD_INFO") VALUES ( ?, ?, ?, ?, ?, ?, ?, ?, ?, ? )
When I open the table AESOPERATIONLOG at the receiver end with the Architect, there is no record with the offending record ID, and a "delete from AESOPERATIONLOG where ID=67302" succeeds with 0 affected records.
If, however, I try to insert a record with that ID, I receive an error about violating a uniqueness constraint. So the record is not there, but still it IS there "a little".
The only way I have found around the problem is to:
1) Break the replication connection by supplying wrong credentials at the sender side (just to make sure)
2) Modify the table properties by altering the key from autoinc to integer (or vice versa) - then I can insert a record with the "forbidden" key value and delete it again
3) Reverse the modification of the table properties
4) Restore the replication connection by supplying the right credentials
5) Possibly repeat 1-4 on another table in case the replication record was part of a replication transaction
 
Is there something I should do differently to avoid the errors or ease the repair?
 

--
Kind regards
Tom Arleth
Ascott Software Danmark A/S


Mark Wilkins Posted on 2008-08-01 14:57:08.0Z
From: "Mark Wilkins" <mark@no.email>
Newsgroups: Advantage.Replication
References: <48918c0e@solutions.advantagedatabase.com>
Subject: Re: Replication stops and is time consuming to restart
Date: Fri, 1 Aug 2008 08:57:08 -0600
Lines: 230
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_03D9_01C8F3B4.94579610"
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2900.3138
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3198
NNTP-Posting-Host: 10.24.38.228
Message-ID: <489322d9@solutions.advantagedatabase.com>
X-Trace: 1 Aug 2008 08:51:05 -0700, 10.24.38.228
Path: solutions.advantagedatabase.com!solutions.advantagedatabase.com!10.24.38.228
Xref: solutions.advantagedatabase.com Advantage.Replication:295
Article PK: 1134150

Hi Tom,
 
I'm not completely sure I understand all the steps, but I think you are confusing the unique ID with the record number.  That error message portion that has "Record=67302" is actually referring to the physical record number of the source table that contains the problematic data.  In other words, if you look at that physical record in AESOPERATIONLOG on the source database, you should be able to see the ID that is causing the problem.  I suppose that if the ID field is an auto-increment value and there have never been any deleted records, then the physical record number and the ID value would be the same, but that generally is not the case. 
 
You could also turn on the option that logs the data values with the error message.  I believe it is off by default for security reasons.  You can get to that option in the subscription properties (in Advantage Data Architect).  The Advanced tab has a checkbox "Log Data for Failed Replication Updates".  That will cause the failure error message to include the data associated with the INSERT statement.
 
Note that using an auto-increment field with replication can be problematic.  If the ID field is auto-increment and someone appends a record at the target table independent of the replication operation, then it will produce an auto-increment value that will likely conflict with a future replicated value.  The help file topic titled "Auto-Updating Fields" discusses some of this.
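
The auto-increment conflict described above can be sketched with a toy Python model. This is only an illustration, not Advantage code: the class `AutoIncTable` and its methods are hypothetical names, and the real server's handling of auto-increment values may differ in detail.

```python
class AutoIncTable:
    """Toy table with an auto-increment ID and a unique index on it (hypothetical)."""

    def __init__(self):
        self.rows = {}      # ID -> payload
        self.next_id = 1    # next auto-increment value

    def append(self, payload):
        """Local append: the table itself picks the next auto-increment ID."""
        new_id = self.next_id
        self.rows[new_id] = payload
        self.next_id += 1
        return new_id

    def insert_replicated(self, row_id, payload):
        """Replicated insert: the ID comes from the source and must be unique."""
        if row_id in self.rows:
            raise ValueError(f"key {row_id} is not unique")
        self.rows[row_id] = payload
        self.next_id = max(self.next_id, row_id + 1)


source = AutoIncTable()
target = AutoIncTable()

# Normal replication: the target receives the ID generated at the source.
first = source.append("op A")
target.insert_replicated(first, "op A")

# Someone appends directly at the target, consuming the next ID there.
target.append("local edit")

# The source now generates the same ID, so replicating it fails.
second = source.append("op B")
try:
    target.insert_replicated(second, "op B")
    conflict = False
except ValueError:
    conflict = True
```

In this sketch the independent append at the target uses up the ID that the source hands out next, which is why the later replicated insert is rejected.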

Mark Wilkins
Advantage R&D


Tom Arleth Posted on 2008-08-04 06:19:51.0Z
From: "Tom Arleth" <ta@ascott.dk>
Newsgroups: Advantage.Replication
References: <48918c0e@solutions.advantagedatabase.com> <489322d9@solutions.advantagedatabase.com>
Subject: Re: Replication stops and is time consuming to restart
Date: Mon, 4 Aug 2008 08:19:51 +0200
Lines: 320
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_002F_01C8F60A.E3E32840"
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2900.5512
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5512
NNTP-Posting-Host: 62.242.32.56
Message-ID: <48969f0f@solutions.advantagedatabase.com>
X-Trace: 4 Aug 2008 00:17:51 -0700, 62.242.32.56
Path: solutions.advantagedatabase.com!solutions.advantagedatabase.com!62.242.32.56
Xref: solutions.advantagedatabase.com Advantage.Replication:296
Article PK: 1134151

Hi Mark,
 
You are somewhat right. I was confusing the unique ID with the record number, but they are identical in my case, since the ID is an auto-inc and there are never any deletes (the table is a log of the operations in the database).
It is not a problem for me to find where the record _should_ be at the receiver (it is always the last in the table). The problem is that the record is not there if I try to delete it, yet IS there if I (or the replication sender) try to insert it again.
It seems like the record is "reserved" for insertion by the replication as part of a transaction with a number of other records pending, and then the connection between the sender and receiver is lost. That leaves the sender trying to resend the entire transaction (which was not ack'ed by the receiver) and leaves the receiver waiting for the rest of the records in the transaction, unwilling to allow a new insert of the same "main" record.
The question is: Is there a better way for me to untie the knot than restructuring the table back and forth between auto-inc and integer for the ID field? Or, of course, even better, to avoid the problem altogether?
 
PS: The problem with somebody else messing up the auto-inc values at the receiver is not likely, since the only users with write access to the database are the replication user and ADSSYS (me). The receiver is basically a central backup of a decentralized server.

--
Kind regards
Tom Arleth
Ascott Software Danmark A/S


Mark Wilkins Posted on 2008-08-04 21:31:08.0Z
From: "Mark Wilkins" <mark@no.email>
Newsgroups: Advantage.Replication
References: <48918c0e@solutions.advantagedatabase.com> <489322d9@solutions.advantagedatabase.com> <48969f0f@solutions.advantagedatabase.com>
Subject: Re: Replication stops and is time consuming to restart
Date: Mon, 4 Aug 2008 15:31:08 -0600
Lines: 192
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_0568_01C8F647.1DA96210"
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2900.3138
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3198
NNTP-Posting-Host: 10.24.38.228
Message-ID: <489773b0@solutions.advantagedatabase.com>
X-Trace: 4 Aug 2008 15:25:04 -0700, 10.24.38.228
Path: solutions.advantagedatabase.com!solutions.advantagedatabase.com!10.24.38.228
Xref: solutions.advantagedatabase.com Advantage.Replication:297
Article PK: 1134152

Hi Tom,
 
It sounds like the index file (unique index on the ID field) is somehow getting the new key value in it, but the table is not.  That would explain the 7057 error that occurs on the insert but no record is deleted.  The unique index violation check happens before any updates are made, but the DELETE statement would not actually find the record.  I am unable to come up with an explanation for that situation.  I can cause it manually on free tables by renaming index files, removing records, renaming files again, etc.  But it is not at all clear to me how it could happen in normal operations. 
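
To make the symptom concrete, here is a minimal Python sketch of a table whose unique index holds a key that has no matching record. It is a toy model under stated assumptions: `ToyTable` and `UniqueViolation` are hypothetical names, and the real table and index files are of course not Python lists and sets.

```python
class UniqueViolation(Exception):
    """Stand-in for Advantage error 7057 (key value not unique)."""


class ToyTable:
    """Toy model: rows live in one structure, unique-index keys in another."""

    def __init__(self):
        self.rows = []            # stands in for the table file
        self.index_keys = set()   # stands in for the unique index on ID

    def insert(self, row_id):
        # The uniqueness check consults the index before anything is written.
        if row_id in self.index_keys:
            raise UniqueViolation(row_id)
        self.rows.append(row_id)
        self.index_keys.add(row_id)

    def delete(self, row_id):
        # Only rows that really exist in the table can be found and deleted.
        matches = self.rows.count(row_id)
        if matches:
            self.rows = [r for r in self.rows if r != row_id]
            self.index_keys.discard(row_id)
        return matches


table = ToyTable()
table.insert(67301)
table.index_keys.add(67302)      # simulate the stray key: index entry, no record

deleted = table.delete(67302)    # finds no record to delete
try:
    table.insert(67302)          # rejected by the unique index
    insert_failed = False
except UniqueViolation:
    insert_failed = True
```

Here the delete affects 0 rows while the insert is still refused, which matches the "not there but still there" behavior reported in the thread.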
 
If you could send us a copy of the target table that is getting the 7057 error as well as the error log files (ads_err.*) at the target location, it might help us understand what is going on.  If you do, you can send them to advantage@ianywhere.com attn: Mark Wilkins, and I can look at them. 
 
If you are using the destination as a backup dataset, one way to avoid this problem would be to simply change the ID index at the target to a non-unique index.  That would eliminate the 7057 error, and the INSERTs would succeed.  However, that is still just hiding the source of the problem.  Also, if the problem is actually that somehow an extra key is ending up in the index file, you could "fix" individual occurrences of the problem by reindexing the table (rather than doing the restructure operation).

Mark Wilkins
Advantage R&D
 


Tom Arleth Posted on 2008-08-05 11:15:10.0Z
From: "Tom Arleth" <ta@ascott.dk>
Newsgroups: Advantage.Replication
References: <48918c0e@solutions.advantagedatabase.com> <489322d9@solutions.advantagedatabase.com> <48969f0f@solutions.advantagedatabase.com> <489773b0@solutions.advantagedatabase.com>
Subject: Re: Replication stops and is time consuming to restart
Date: Tue, 5 Aug 2008 13:15:10 +0200
Lines: 233
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_000D_01C8F6FD.500E4390"
X-Newsreader: Microsoft Outlook Express 6.00.2900.5512
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5512
NNTP-Posting-Host: 62.242.32.56
Message-ID: <489835c8@solutions.advantagedatabase.com>
X-Trace: 5 Aug 2008 05:13:12 -0700, 62.242.32.56
Path: solutions.advantagedatabase.com!solutions.advantagedatabase.com!62.242.32.56
Xref: solutions.advantagedatabase.com Advantage.Replication:298
Article PK: 1134154

Hi Mark,
 
I'll download a (live)backup so I can get the target table (should I free it from the database or will an export make do?). Would the ads_err.* files from the sender (in addition to those from the target) be of any help too?

--
Kind regards
Tom Arleth
Ascott Software Danmark A/S
 


Mark Wilkins Posted on 2008-08-05 13:28:16.0Z
From: "Mark Wilkins" <mark@no.email>
Newsgroups: Advantage.Replication
References: <48918c0e@solutions.advantagedatabase.com> <489322d9@solutions.advantagedatabase.com> <48969f0f@solutions.advantagedatabase.com> <489773b0@solutions.advantagedatabase.com> <489835c8@solutions.advantagedatabase.com>
Subject: Re: Replication stops and is time consuming to restart
Date: Tue, 5 Aug 2008 07:28:16 -0600
Lines: 94
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_05AD_01C8F6CC.D3CD2F60"
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2900.3138
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3198
NNTP-Posting-Host: 10.24.38.228
Message-ID: <48985404@solutions.advantagedatabase.com>
X-Trace: 5 Aug 2008 07:22:12 -0700, 10.24.38.228
Path: solutions.advantagedatabase.com!solutions.advantagedatabase.com!10.24.38.228
Xref: solutions.advantagedatabase.com Advantage.Replication:299
Article PK: 1134153

Hi Tom,
 
If the table is not encrypted, just a simple file copy is the best thing (copy the .adt, .adi, and .adm if there is one).  Exporting it would "correct" any problems.  If the table is encrypted, I will need the table password as well.  Decrypting it on your end would probably cause a reindex, which would fix the problem.
 
The error logs from the sender might also be useful.

Mark Wilkins
Advantage R&D
 


Tom Arleth Posted on 2008-08-29 12:50:50.0Z
From: "Tom Arleth" <ta@ascott.dk>
Newsgroups: Advantage.Replication
References: <48918c0e@solutions.advantagedatabase.com> <489322d9@solutions.advantagedatabase.com> <48969f0f@solutions.advantagedatabase.com> <489773b0@solutions.advantagedatabase.com> <489835c8@solutions.advantagedatabase.com> <48985404@solutions.advantagedatabase.com>
Subject: Re: Replication stops and is time consuming to restart
Date: Fri, 29 Aug 2008 14:50:50 +0200
Lines: 123
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_00A0_01C909E6.A45A7710"
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2900.5512
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5579
NNTP-Posting-Host: 62.242.32.56
Message-ID: <48b7efa4@solutions.advantagedatabase.com>
X-Trace: 29 Aug 2008 06:46:28 -0700, 62.242.32.56
Path: solutions.advantagedatabase.com!solutions.advantagedatabase.com!62.242.32.56
Xref: solutions.advantagedatabase.com Advantage.Replication:302
Article PK: 1134157

Hi Mark,
 
I have just sent the files to you by email.

--
Kind regards
Tom Arleth
Ascott Software Danmark A/S
 


Mark Wilkins Posted on 2008-08-29 18:17:25.0Z
From: "Mark Wilkins" <mark@no.email>
Newsgroups: Advantage.Replication
References: <48918c0e@solutions.advantagedatabase.com> <489322d9@solutions.advantagedatabase.com> <48969f0f@solutions.advantagedatabase.com> <489773b0@solutions.advantagedatabase.com> <489835c8@solutions.advantagedatabase.com> <48985404@solutions.advantagedatabase.com> <48b7efa4@solutions.advantagedatabase.com>
Subject: Re: Replication stops and is time consuming to restart
Date: Fri, 29 Aug 2008 12:17:25 -0600
Lines: 200
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_06F6_01C909D1.323133F0"
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2900.3138
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3350
NNTP-Posting-Host: 10.24.38.228
Message-ID: <48b83bc3@solutions.advantagedatabase.com>
X-Trace: 29 Aug 2008 12:11:15 -0700, 10.24.38.228
Path: solutions.advantagedatabase.com!solutions.advantagedatabase.com!10.24.38.228
Xref: solutions.advantagedatabase.com Advantage.Replication:303
Article PK: 1134158

Hi Tom,
 
The error log in the target server has quite a few 7009 errors, which are file write errors.  They are preceded by OS error code 33.  The following information is from the Windows error header file winerror.h:
 
//  The process cannot access the file because
//  another process has locked a portion of the file.
//
#define ERROR_LOCK_VIOLATION             33L
 
Some process appears to be locking portions of a number of different tables.  I would suspect some kind of backup software.  This is preventing Advantage from writing to various files and is causing problems.  It might also be due to virus scanning software, but that seems less likely to me; I am not aware of virus scanners locking portions of files like that to cause write problems. 
 
I am pretty sure what is happening is the following:
 
The source server is replicating a transaction to the target.  The "33" error occurs at some point in the transaction and prevents Advantage from writing something (possibly a record in a table).  When this happens the error is sent back to the source, and the source server will then attempt to roll back the transaction.  During the rollback, the same error (33) is occurring when the target tries to undo the transaction updates.  At this point, it has to abort the transaction - it can neither go forward to commit nor backward to rollback.  The server logs a 9057 at this time (it isn't really an internal error, though, so the 9000 class error is not terribly accurate).
 
One side effect of this type of situation is that an extra key can be added to an index file without the corresponding record.  Because the rollback could not be completed, that key does not get removed.  And I'm certain it is this key that causes the 7057 errors that are stopping replication.  The next replication attempt is probably re-trying that transaction which tries to insert the key that could not be rolled back.
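
The sequence described above (write fails, rollback fails too, key is left behind) can be sketched as a small Python simulation. Everything here is hypothetical and for illustration only: `FlakyFile`, `LockViolation`, and `replicate_insert` are made-up names, and the order of operations (index key first, then the record) is an assumption, not a statement about Advantage internals.

```python
class LockViolation(Exception):
    """Stand-in for Windows OS error 33 (ERROR_LOCK_VIOLATION)."""


class FlakyFile:
    """Set-like 'file' that starts refusing writes after a given number of them."""

    def __init__(self, writes_before_lock):
        self.items = set()
        self.writes_left = writes_before_lock

    def _write(self):
        # Simulate an outside process locking the file after a few writes.
        if self.writes_left <= 0:
            raise LockViolation(33)
        self.writes_left -= 1

    def add(self, item):
        self._write()
        self.items.add(item)

    def remove(self, item):
        self._write()
        self.items.discard(item)


def replicate_insert(index_file, table_file, key):
    """Apply one replicated insert; return 'committed', 'rolled back' or 'aborted'."""
    try:
        index_file.add(key)    # assumed order: index key first ...
        table_file.add(key)    # ... then the record itself
        return "committed"
    except LockViolation:
        try:
            index_file.remove(key)   # try to undo the index update
            return "rolled back"
        except LockViolation:
            return "aborted"         # neither commit nor rollback is possible


index_file = FlakyFile(writes_before_lock=1)   # accepts the key, then gets locked
table_file = FlakyFile(writes_before_lock=0)   # locked from the start
outcome = replicate_insert(index_file, table_file, 67302)
```

After the aborted transaction the simulated index still contains 67302 while the simulated table does not, so a retry of the same insert would hit the unique-index check.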
 
In the two tables you sent, I can see that there is an extra key in each one.  If you download the utility at http://devzone.advantagedatabase.com/CodeCentral/Project.aspx?ProjID=120, it can be used to verify the index information.  In one table you sent, it reports one extra key. 
 
Once this occurs, you can resolve the immediate problem by reindexing the problem table at the target server.  To prevent it from occurring again, it will be necessary to resolve the 7009 (OS error 33) problem, which means figuring out what software is causing the problem.
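
As a rough illustration of why reindexing clears the problem, the following Python sketch compares the keys in an index with the IDs actually present in the table and then rebuilds the index from the records. The functions `find_extra_keys` and `reindex` are hypothetical and only model the idea; they are not the verification utility mentioned above or any Advantage API, and the ID values are example data.

```python
def find_extra_keys(record_ids, index_keys):
    """Return index keys that have no matching record in the table."""
    return sorted(set(index_keys) - set(record_ids))


def reindex(record_ids):
    """Rebuild the index purely from the records that really exist."""
    return set(record_ids)


record_ids = [67299, 67300, 67301]             # rows actually in the table
index_keys = {67299, 67300, 67301, 67302}      # index with one stray key

extra_before = find_extra_keys(record_ids, index_keys)
index_keys = reindex(record_ids)
extra_after = find_extra_keys(record_ids, index_keys)
```

Because the rebuilt index is derived only from existing records, the stray key disappears and a later insert of that ID is no longer rejected. The underlying lock problem still has to be fixed to keep it from recurring.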
 

Mark Wilkins
Advantage R&D
 


Tom Arleth Posted on 2008-09-02 08:49:27.0Z
From: "Tom Arleth" <ta@ascott.dk>
Newsgroups: Advantage.Replication
References: <48918c0e@solutions.advantagedatabase.com> <489322d9@solutions.advantagedatabase.com> <48969f0f@solutions.advantagedatabase.com> <489773b0@solutions.advantagedatabase.com> <489835c8@solutions.advantagedatabase.com> <48985404@solutions.advantagedatabase.com> <48b7efa4@solutions.advantagedatabase.com> <48b83bc3@solutions.advantagedatabase.com>
Subject: Re: Replication stops and is time consuming to restart
Date: Tue, 2 Sep 2008 10:49:27 +0200
Lines: 240
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_0020_01C90CE9.9309B510"
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2900.5512
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5579
NNTP-Posting-Host: 62.242.32.56
Message-ID: <48bcfd11@solutions.advantagedatabase.com>
X-Trace: 2 Sep 2008 02:45:05 -0700, 62.242.32.56
Path: solutions.advantagedatabase.com!solutions.advantagedatabase.com!62.242.32.56
Xref: solutions.advantagedatabase.com Advantage.Replication:304
Article PK: 1134159

Hi Mark,
 
Thanks. I'll go hunting for the software causing the problem - it may well be that the backup is not configured to avoid the folders with the database tables. Until I find the guilty process at least I have a greatly improved way of fixing the problem. 

--
Kind regards
Tom Arleth
Ascott Software Danmark A/S