Details
-
Improvement
-
Resolution: Unresolved
-
Minor
-
None
-
None
Description
When we've experienced performance problems in our Loader jobs, we found it hard to understand the timing breakdown, eg, Reading the Source Group Data, Reading the Membership Data, Resolving subjects, etc.
I migrated some PSPNG tools – ProgressMonitor, MDC, some date utilities, dynamic log levels – and sprinkled their use throughout the Loader (at least the SQL_GROUP_LIST flow).
After I create this Jira, I'll create a branch to hold this. I know we haven't generally embraced logging in some of these ways, or that PSPNG's MDC ideas haven't been generalized, so this Jira and Branch are really just a bookmark and example.
Some example log messages from running this at GT:
grouper-api;grouper_error.log;;;2019-10-04 04:07:32,997: [DefaultQuartzScheduler_Worker-8] INFO ProgressMonitor.logProgressIfNecessary(154) [why:;who:loader/;what:SQL_GROUP_LIST__etc:gatech:loaderjobs:affiliations2__e49b881e07234a959fc04ae2fe57e782/;step:reading db/] - - Loader: Fetching data Progress: 1850620 in 30.001s (3700000 items/min overall, 3700000 items/min recently).
grouper-api;grouper_error.log;;;2019-10-04 04:07:51,475: [DefaultQuartzScheduler_Worker-8] INFO ProgressMonitor.completelyDone(110) [why:;who:loader/;what:SQL_GROUP_LIST__etc:gatech:loaderjobs:affiliations2__e49b881e07234a959fc04ae2fe57e782/;step:reading db/] - - Loader: Fetching data Completed (Fetched all data): 3638327 items in 48.479s (4502972.8 items/minute)
This is promoted to INFO because the group changed:
grouper-api;grouper_error.log;;;2019-10-04 04:33:20,960: [pool-1-thread-323] INFO GrouperLoaderType.syncOneGroupMembership(3132) [why:;who:loader/;what:SQL_GROUP_LIST__etc:gatech:loaderjobs:EntitlementLoader__5d0f9ad9785b4f41bfb336864e1cee42/;step:metadata fetch/] - - gt:ref:affiliations:employee@dlpe done syncing membership, processed 495 records. Total members: 495, inserts: 1, deletes: 0