* Memory management in fhSQL() and ADODB
- Mark1834
- Megastar
- Posts: 2644
- Joined: 27 Oct 2017 19:33
- Family Historian: V7
- Location: South Cheshire, UK
Memory management in fhSQL() and ADODB
This is a spin-off from memory management in the Ancestry Sync plugin, but is more general so I've started a new thread.
The OP has kindly been testing various refinements of the memory management to improve robustness with very large datasets. Most of the memory usage is in building and comparing large tables, and introducing additional garbage collection to these steps was relatively ineffective as Lua already handles tables very well, so there was little to collect.
The plugin is structured with an orchestrating function that controls when tables are built and compared, and the best strategy was to have no garbage collection during these individual processes, but to ensure that memory was immediately freed once the tables were finished with, either when they went out of scope, or by deliberately setting them to nil. Another advantage was that relatively few collections were required, so impact on performance was minimal.
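A minimal sketch of that pattern, for anyone following along. This is not the plugin's actual code: `log_memory`, `build_table`, `compare_tables` and `run` are hypothetical names, and the table contents are placeholders. It only shows the shape of the approach: no collection inside the build and compare steps, and one full collection as soon as a table is finished with.

```lua
-- Hypothetical helpers illustrating the pattern; not taken from the plugin.
local memory_log = {}

-- Record the Lua heap size (in KB) against a step label.
local function log_memory(step)
  local kb = collectgarbage("count")
  memory_log[#memory_log + 1] = { step = step, kb = kb }
  return kb
end

-- Stand-in for building a large lookup table from project data.
local function build_table(n)
  local t = {}
  for i = 1, n do
    t[i] = { id = i, name = "Individual " .. i }
  end
  return t
end

-- Stand-in for a comparison step: counts ids present in both tables.
local function compare_tables(a, b)
  local seen, matches = {}, 0
  for _, rec in ipairs(a) do seen[rec.id] = true end
  for _, rec in ipairs(b) do
    if seen[rec.id] then matches = matches + 1 end
  end
  return matches
end

-- Orchestrator: no collection inside the build/compare steps,
-- one full collection as soon as each table is finished with.
local function run(n)
  collectgarbage("collect")
  log_memory("start")

  local main = build_table(n)
  log_memory("main built")

  local matches
  do
    local other = build_table(n)   -- only lives inside this block
    matches = compare_tables(main, other)
    log_memory("compared")
  end
  collectgarbage("collect")        -- 'other' is out of scope here
  log_memory("other released")

  main = nil                       -- deliberate release
  collectgarbage("collect")
  log_memory("main released")
  return matches
end

local matches = run(20000)
for _, entry in ipairs(memory_log) do
  print(string.format("%-16s %10.1f KB", entry.step, entry.kb))
end
```

Note that `collectgarbage("count")` only reports memory owned by the Lua state, so anything held by an external library would not show up in a log like this.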
However, there is one issue that I cannot resolve. The test plugin writes a log of its memory use during the various steps, and the first attachment shows a typical report from my largest test dataset of 65k individuals and around half a million facts. The first time I call it, everything is well behaved. You can see it build a large initial table, then additional tables that are garbage collected when finished with, and finally release the main table memory. However, if I call it a second time, it will fail partway through with the error shown in the second attachment (although sometimes it just gives a much simpler message to the effect of “out of memory”). This error appears to arise in the Luacom/fhSQL library rather than my script. Up until the error, memory usage is virtually identical to the first run. Once it has failed like this, I have to close and restart FH to restore normal behaviour; just rerunning the plugin is not sufficient. A smaller test dataset of 25k individuals and 325k facts runs repeatedly (at least ten times) without any issues.
My qualitative hypothesis is that fhSQL() (or most probably the ADODB object it creates) somehow has its own memory space that is not being cleared properly between runs, but we’re very much getting into the nitty-gritty plumbing here that is above my pay grade!
Any suggestions for how to resolve this? I don’t know if fhSQL() was written by FHUG contributors or purely in-house by CP. I’ll raise it with them if we can’t resolve it here, but I suspect there may not be much they can do about it if it’s a limitation in a non-FH control or library.
Mark Draper
- ColeValleyGirl
- Megastar
- Posts: 5643
- Joined: 28 Dec 2005 22:02
- Family Historian: V7
- Location: Cirencester, Gloucestershire
- Contact:
Re: Memory management in fhSQL() and ADODB
fhSQL is just a wrapper on Luacom/ADODB, plus some code to navigate the result set created by a select operation.
If you wanted to try to extract more info about the error, https://www.fhug.org.uk/forum/viewtopic ... 32#p121932 has some code that might help.
Helen Wright
ColeValleyGirl's family history
- tatewise
- Megastar
- Posts: 28921
- Joined: 25 May 2010 11:00
- Family Historian: V7
- Location: Torbay, Devon, UK
- Contact:
Re: Memory management in fhSQL() and ADODB
I don't think I can add any constructive ideas, but sympathise with those memory issues.
Over the years, as large Projects have raised memory issues with my plugins, I have compiled a ragbag of workarounds.
Usually, with the cooperation of users, I hit upon a suitable solution that does not affect run time too much.
I can't be sure, but like Mark, I think that rerunning the same plugin sometimes escalates the memory issues.
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
- Mark1834
- Megastar
- Posts: 2644
- Joined: 27 Oct 2017 19:33
- Family Historian: V7
- Location: South Cheshire, UK
Re: Memory management in fhSQL() and ADODB
Thanks - I tried using a local copy of fhSQL() and turning off the luacom abort as described so I could inspect the error, but all I succeeded in doing was generating an alternative error message.
I suspect the plugin just ignored the error and carried on, but fell over and triggered this more generic-looking error message.
I think there is a clue in the detailed trace above, where it describes "unknown error 7" at line 382, which is in the fhSQL select() function. That appears to be programmer-speak for "it's gone wrong, but not in any way that I anticipated".
I'll raise it with CP and see what they say.
Mark Draper
- Jane
- Site Admin
- Posts: 8543
- Joined: 01 Nov 2002 15:00
- Family Historian: V7
- Location: Somerset, England
- Contact:
Re: Memory management in fhSQL() and ADODB
A good way to debug into fhSQL is simply to copy the code into the top of your test project rather than including it; that way you can debug down into it.
Jane
My Family History : My Photography "Knowledge is knowing that a tomato is a fruit. Wisdom is not putting it in a fruit salad."
Re: Memory management in fhSQL() and ADODB
I have a couple programs that are real hog slaughters that I had to give up. I have less ambitious versions, but someday...
what I learned though ...
At your final executable statement before the plugin ends (the return or end), set a breakpoint. It is the only one you need.
Go to options and check everything to be visible.
When you hit the final statement, however you want to do it, get the list of everything visible, then write statements to set them to nil (iup is especially dirty),
and call collectgarbage('collect'), even twice.
Do this even when your program runs correctly.
Can you now run two runs before it fails?
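A rough sketch of how that clean-up could be automated for globals, rather than reading the list out of the debugger by hand. This is a variation on the idea, not something taken from an existing plugin: `snapshot_globals`, `release_new_globals`, `tblResults` and `strStatus` are all made-up names, and it does nothing for locals or upvalues, which still have to go out of scope or be set to nil individually.

```lua
-- Snapshot the names of the globals that exist before the plugin body runs.
local function snapshot_globals()
  local names = {}
  for name in pairs(_G) do names[name] = true end
  return names
end

-- Set every global created since the snapshot to nil, then collect twice.
-- Returns the names that were cleared, sorted, for logging.
local function release_new_globals(before)
  local doomed = {}
  for name in pairs(_G) do
    if not before[name] then doomed[#doomed + 1] = name end
  end
  for _, name in ipairs(doomed) do _G[name] = nil end
  table.sort(doomed)
  collectgarbage("collect")
  -- Objects with a finalizer are only freed in the cycle after the
  -- finalizer has run, hence the second pass.
  collectgarbage("collect")
  return doomed
end

local before = snapshot_globals()

-- Stand-in for the plugin body: two globals that would otherwise linger.
tblResults = { 1, 2, 3 }
strStatus = "done"

local cleared = release_new_globals(before)
print("cleared: " .. table.concat(cleared, ", "))
```

The snapshot has to be taken before any libraries are loaded if their globals are meant to be cleared as well, and after them if they are meant to survive.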
I had some heated discourse with cp, because I feel when I am out of a plugin, no matter what, they should reclaim all memory to zero, and end lua altogether, so it starts anew every plugin. I also said that when they throw up a window that says... plugin has been changed on disk, do you want to reload, they should clear all breakpoints automatically rather than sticking the debug point in some tacky chunk of memory, as they do.
FH V.6.2.7 Win 10 64 bit
- Mark1834
- Megastar
- Posts: 2644
- Joined: 27 Oct 2017 19:33
- Family Historian: V7
- Location: South Cheshire, UK
Re: Memory management in fhSQL() and ADODB
Thanks Ron, I don't think the issue is conventional plugin memory, as ADODB fails with successive large retrievals even if collectgarbage('count') indicates that memory use is under control.
The other variation I tried last night was to create a separate connection for each retrieval, then close it and garbage collect as soon as it was finished with, but it didn't make any difference.
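For reference, the shape of that experiment was roughly as below. This is a sketch only: `retrieve`, `open_stub` and `read_stub` are illustrative names, and the connection is a stub table so that the snippet runs outside FH. In a real plugin, `open_connection` would presumably wrap `luacom.CreateObject("ADODB.Connection")` followed by opening it with the appropriate connection string, which is not shown here.

```lua
-- Run one retrieval on its own connection and always close it afterwards.
-- 'open_connection' is any function returning an object with Execute and
-- Close methods; 'read_rows' copies the recordset into a plain Lua table.
local function retrieve(open_connection, sql, read_rows)
  local conn = open_connection()
  local ok, result = pcall(function()
    local rs = conn:Execute(sql)
    local rows = read_rows(rs)
    rs:Close()
    return rows
  end)
  conn:Close()                 -- closed whether or not the query failed
  conn = nil
  collectgarbage("collect")
  collectgarbage("collect")
  if not ok then error(result, 0) end
  return result
end

-- Stub standing in for ADODB so the sketch runs outside Family Historian.
local open_count, close_count = 0, 0
local function open_stub()
  open_count = open_count + 1
  return {
    Execute = function(self, sql)
      return { data = { "row for " .. sql }, Close = function() end }
    end,
    Close = function(self) close_count = close_count + 1 end,
  }
end

-- Copy the stub recordset into an ordinary table.
local function read_stub(rs)
  local rows = {}
  for i, v in ipairs(rs.data) do rows[i] = v end
  return rows
end

local individuals = retrieve(open_stub, "SELECT 1", read_stub)
local facts = retrieve(open_stub, "SELECT 2", read_stub)
print(#individuals, #facts, open_count, close_count)
```

As noted, doing this with the real ADODB connection did not cure the failure, so the sketch documents the experiment rather than a fix.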
Mark Draper
Re: Memory management in fhSQL() and ADODB
well that post went in the bit bucket...
As Manuel of Fawlty Towers says: E-Ven-TU-ally
short and unexplained this time... but create them as weak tables.
https://www.lua.org/pil/17.html
https://stigmax.gitbook.io/lua-guide/co ... eak-tables
I believe you are resurrecting the result sets, because the gc can't know what you are doing when you re-reference them.
https://www.lua.org/wshop18/Ierusalimschy.pdf
Anyway, it's threaded. You cannot know when the objects are finalized as the gc travels to the root object, therefore
collectgarbage()
collectgarbage()
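A minimal sketch of what a weak table does, going by the Programming in Lua chapter linked above. The names `cache`, `result_set` and `count_entries` are illustrative only, and whether this helps with the ADODB objects is untested here; it just shows that a weak-valued table does not by itself keep a result set alive.

```lua
-- A cache whose values are held weakly: the cache alone does not keep
-- a result set alive once the rest of the program has dropped it.
local cache = setmetatable({}, { __mode = "v" })

-- Count the entries currently present in a table.
local function count_entries(t)
  local n = 0
  for _ in pairs(t) do n = n + 1 end
  return n
end

local result_set = { rows = { "a", "b", "c" } }  -- strong reference
cache["individuals"] = result_set

collectgarbage("collect")
print("while referenced:", count_entries(cache))

result_set = nil             -- drop the only strong reference
collectgarbage("collect")
collectgarbage("collect")
print("after release:", count_entries(cache))
```

With an ordinary (strong) table in place of `cache`, the entry would survive both collections until it was removed from the table explicitly.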
FH V.6.2.7 Win 10 64 bit
- Mark1834
- Megastar
- Posts: 2644
- Joined: 27 Oct 2017 19:33
- Family Historian: V7
- Location: South Cheshire, UK
Re: Memory management in fhSQL() and ADODB
Yes, I’d looked at both of those references, but got a bit lost in the technical jargon of the pdf in particular!
It probably needs some structured experiments working directly with luacom/ADODB rather than as part of a big plugin that’s doing lots of other things as well. Maybe that’s a job for a rainy day, but it’s a niche application of a niche plugin, so isn’t necessarily at the top of the to-do list.
CP replied very quickly, and it’s been “referred to the developers” for review, but (perfectly reasonably) it won’t be top of their list either.
Mark Draper