Details
Type: Bug
Resolution: Fixed
Priority: Major
Description
The parameter fullSyncTimeout_secs, used in scheduleAndAwaitFullSyncOfGroup, is hard-coded to 5 minutes (300 seconds), whereas the parameter coordinationTimeout_secs, used in lockForFullSyncWhenNoIncrementalIsUnderway and lockForIncrementalProvisioningWhenNoFullSyncIsUnderway, is user-configurable.
When very large groups (tens of thousands of users) are added or deleted, PSPNG schedules a FullSync. Although we can set a high value for coordinationTimeout_secs, the FullSync still times out after 300 seconds because fullSyncTimeout_secs is hard-coded, so the sync is never performed.
See error below:
300 seconds,clog=clog #1879294 / ChangeLog type: group: updateGroup,group=XXXXXXXXXXXXXXXX]
2018-12-05 16:01:02,616: [DefaultQuartzScheduler_Worker-10] WARN PspChangelogConsumerShim.processChangeLogEntries(111) - - Provisioning summary: Summary: 998 successes/1 failures. (998 successful entries will be retried because they follow a failure in the queue.) First error was: FullSync timed out after 300 seconds
2018-12-05 16:01:02,617: [DefaultQuartzScheduler_Worker-10] ERROR ChangeLogHelper.processRecords(286) - - Did not get all the way through the batch! -1 != 1880292
2018-12-05 16:01:02,621: [DefaultQuartzScheduler_Worker-10] ERROR GrouperLoaderJob.runJob(485) - - Error on job: CHANGE_LOG_consumer_licensing
java.lang.RuntimeException: Error in loader job: null, check logs: Summary: 998 successes/1 failures. (998 successful entries will be retried because they follow a failure in the queue.) First error was: FullSync timed out after 300 secondsDid not get all the way through the batch! -1 != 1880292
at edu.internet2.middleware.grouper.app.loader.GrouperLoaderJob.runJob(GrouperLoaderJob.java:474)
at edu.internet2.middleware.grouper.app.loader.GrouperLoaderJob.execute(GrouperLoaderJob.java:345)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
2018-12-05 16:01:02,627: [DefaultQuartzScheduler_Worker-10] ERROR GrouperLoaderJob.execute(348) - - Error running up job
java.lang.RuntimeException: Error in loader job: null, check logs: Summary: 998 successes/1 failures. (998 successful entries will be retried because they follow a failure in the queue.) First error was: FullSync timed out after 300 secondsDid not get all the way through the batch! -1 != 1880292
at edu.internet2.middleware.grouper.app.loader.GrouperLoaderJob.runJob(GrouperLoaderJob.java:474)
at edu.internet2.middleware.grouper.app.loader.GrouperLoaderJob.execute(GrouperLoaderJob.java:345)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
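One way the hard-coded limit described above could be made configurable is sketched below. This is a minimal illustration, not PSPNG's actual code: the class FullSyncTimeoutSketch, the methods resolveTimeoutSecs and awaitWithTimeout, and the use of a plain java.util.Properties object with the key "fullSyncTimeout_secs" are all assumptions made for the example. The only details taken from this report are the parameter name and the 300-second default.

```java
import java.util.Properties;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: class and method names are illustrative, not PSPNG's API.
public class FullSyncTimeoutSketch {

    // Assumed property key; the real configuration key may differ.
    static final String TIMEOUT_KEY = "fullSyncTimeout_secs";

    // Matches the 5-minute value that the report says is hard-coded.
    static final int DEFAULT_TIMEOUT_SECS = 300;

    // Returns the configured timeout, or the 300-second default when unset.
    static int resolveTimeoutSecs(Properties config) {
        String raw = config.getProperty(TIMEOUT_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_TIMEOUT_SECS;
        }
        int value = Integer.parseInt(raw.trim());
        if (value <= 0) {
            throw new IllegalArgumentException(TIMEOUT_KEY + " must be positive: " + value);
        }
        return value;
    }

    // Runs the task on a worker thread and waits at most timeoutSecs for it.
    static <T> T awaitWithTimeout(Callable<T> fullSync, int timeoutSecs) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<T> future = executor.submit(fullSync);
            try {
                return future.get(timeoutSecs, TimeUnit.SECONDS);
            } catch (TimeoutException e) {
                // Ask the worker to stop, then report the timeout to the caller.
                future.cancel(true);
                throw new RuntimeException(
                        "FullSync timed out after " + timeoutSecs + " seconds", e);
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        Properties config = new Properties();
        config.setProperty(TIMEOUT_KEY, "1800"); // e.g. 30 minutes for very large groups
        int timeoutSecs = resolveTimeoutSecs(config);
        String result = FullSyncTimeoutSketch.<String>awaitWithTimeout(() -> "done", timeoutSecs);
        System.out.println(result + " within a " + timeoutSecs + "s limit");
    }
}
```

Keeping 300 seconds as the fallback would preserve the current behaviour for sites that do not set the property, while sites with very large groups could raise it alongside coordinationTimeout_secs.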