Introduction
Some Plugins perform intensive repetitive operations, which on a large database in excess of 10,000 Individuals may take a long time or need large amounts of memory.
This article suggests how these resources can be minimised by using a few simple techniques. If the Plugin run time is measured in minutes rather than seconds, then even a 10% saving becomes significant.
Global v Local
As a general rule local variables are more efficient than global variables, but there are exceptions.
When a function requires a lookup table of constants, such as below, then it is faster if it is global.
TblLookup = { A=1,E=2,I=3,O=4,U=5 }
This is because the global table is only created once, whereas a local table is created every time the function is called.
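The effect can be measured with a small timing sketch such as the following. It is only an illustration: the names VowelGlobal, VowelLocal and TimeIt and the call count are assumptions made up for this example, and the actual timings depend on the machine and Lua version.

```lua
-- Hypothetical timing sketch; function names and call count are illustrative assumptions
TblLookup = { A=1,E=2,I=3,O=4,U=5 }              -- Global table is created once only

function VowelGlobal(strChar)
    return TblLookup[strChar] or 0               -- Reuses the existing global table
end

function VowelLocal(strChar)
    local tblLookup = { A=1,E=2,I=3,O=4,U=5 }    -- New table is built on every call
    return tblLookup[strChar] or 0
end

function TimeIt(fncTest,intCalls)                -- Returns elapsed CPU seconds and a checksum
    local numStart = os.clock()
    local intSum = 0
    for i = 1, intCalls do
        intSum = intSum + fncTest("E")
    end
    return os.clock() - numStart, intSum
end

local numGlobal = TimeIt(VowelGlobal,1000000)
local numLocal  = TimeIt(VowelLocal, 1000000)
print(string.format("Global table: %.3f s   Local table: %.3f s",numGlobal,numLocal))
```

Both functions return the same values, so any difference in the printed times comes from rebuilding the table on each call.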
It can also help to define local variables that reference global variables, especially an indexed table entry. e.g.
local tblCode = TblLookup
local intNumb = IntNumber
local tblMode = TblMode[intNumb]
Where the same table lookup or other complex operation is required multiple times, then assign the result to a local variable and use it multiple times instead. e.g.
if tblCode[strA] > 1 and tblCode[strA] < intLast then
    intSum = intSum + tblCode[strA]
end
becomes
local intCode = tblCode[strA] -- Look up the value once
if intCode > 1 and intCode < intLast then
    intSum = intSum + intCode -- Use the local variable three times
end
Some of the above techniques are illustrated in the Soundex (code snippet) examples, where the Global Variable Version runs about 5 times faster than the Local Variable Version, and the Function Prototype Version takes things one step further.
Although a local function within another function clarifies its scope of use, a global function is faster. This is probably because the global function is defined only once, whereas a local function is redefined every time its containing function is called.
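A minimal sketch of the two styles is shown below for comparison. The names IsVowel, CountGlobal and CountNested and the loop count are invented for this illustration, and the size of the gap will vary; in some Lua versions it may be small or absent.

```lua
-- Illustrative sketch only; all names and the loop count are assumptions
function IsVowel(strChar)                        -- Global function is defined once
    return strChar:find("^[AEIOU]$") ~= nil
end

function CountGlobal(strText)
    local intCount = 0
    for strChar in strText:gmatch(".") do
        if IsVowel(strChar) then intCount = intCount + 1 end
    end
    return intCount
end

function CountNested(strText)
    local function isVowel(strChar)              -- Redefined on every call of CountNested
        return strChar:find("^[AEIOU]$") ~= nil
    end
    local intCount = 0
    for strChar in strText:gmatch(".") do
        if isVowel(strChar) then intCount = intCount + 1 end
    end
    return intCount
end

local numStart = os.clock()
for i = 1, 100000 do CountGlobal("FAMILY HISTORIAN") end
local numGlobal = os.clock() - numStart

numStart = os.clock()
for i = 1, 100000 do CountNested("FAMILY HISTORIAN") end
local numNested = os.clock() - numStart

print(string.format("Global function: %.3f s   Nested function: %.3f s",numGlobal,numNested))
```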
Progress Bar
The Progress Bar (code snippet) provides useful feedback and a cancel option for long running Plugins. However, if the Global Variable Version is called too frequently, its own code can significantly extend the run time.
Therefore, it may be better to avoid calling the ProgressBar.Step function on every loop step. e.g.
ProgressBar.Start("Loop Progress",9000)
local intSteps = 0
for i = 1, 9000 do
    intSteps = intSteps + 1
    if intSteps == 100 then -- Note that if intSteps % 100 == 0 then is slower
        intSteps = 0
        ProgressBar.Step(100) -- Only update Progress Bar every 100 steps
    end
    if ProgressBar.Stop() then break end -- Check for cancel on every loop step
end
This problem has been mitigated in the Function Prototype Version by only updating the display when necessary instead of every Step.
Do not make the ProgressBar.Stop() function conditional on the intSteps count, otherwise it may make interrupting the loop and other interactions less responsive.
Large Files
Sometimes it is necessary to process the contents of large files line by line. For this it is much faster to use table.insert and table.concat than string concatenation strText = strText..strLine. e.g.
local tblText = {}
for strLine in io.lines(strFile) do -- Read through the file line by line
    strLine = strLine:gsub("abc","xyz")
    table.insert(tblText,strLine) -- Insert the line of text as the next table entry
end
local strText = table.concat(tblText,"\n") -- Concatenate the lines of text separated by newline
SaveStringToFile(strText,strFile) -- See the Save String To File (code snippet)
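The two approaches can be compared without any file access using a sketch like the one below. The function names JoinByConcat and JoinByTable and the line count are assumptions for illustration only. Repeated concatenation copies the whole accumulated string on every step, which is why its cost grows much faster than the table method as the line count rises.

```lua
-- Hypothetical comparison sketch; function names and line count are assumptions
function JoinByConcat(intLines)
    local strText = ""
    for i = 1, intLines do
        strText = strText.."line "..i.."\n"      -- Copies the whole string each time
    end
    return strText
end

function JoinByTable(intLines)
    local tblText = {}
    for i = 1, intLines do
        table.insert(tblText,"line "..i)         -- Only stores the new line
    end
    return table.concat(tblText,"\n").."\n"      -- Joins everything in a single pass
end

local intLines = 10000
local numStart = os.clock()
local strConcat = JoinByConcat(intLines)
local numConcat = os.clock() - numStart

numStart = os.clock()
local strTable = JoinByTable(intLines)
local numTable = os.clock() - numStart

print(string.format("Concatenation: %.3f s   Table: %.3f s",numConcat,numTable))
print("Same result:", strConcat == strTable)
```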
Large Tables
Very large tables of data can arise, say when keeping results of each Individual Record in the database compared with every other Individual Record. For smaller databases up to 10,000 Individuals, this amounts to less than 50,000,000 entries, but quickly escalates for larger databases, and can exhaust available memory.
To avoid this problem the table of results should be sorted and the lowest entries pruned off. e.g.
if intScore >= intMinimum then -- Continue if score is not below lowest retained Results entry
    table.insert(tblResults,{ Score=intScore, ... })
    if #tblResults >= 2000 then -- Prune low scores from Results to avoid exhausting memory
        table.sort( tblResults, function(tblA,tblB) return tblA["Score"] > tblB["Score"] end )
        for i = 1 , #tblResults / 2 do
            table.remove(tblResults) -- Remove the lower 50% of the sorted Results
        end
        intMinimum = tblResults[#tblResults]["Score"]
    end
end
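The fragment above can be wrapped into a self-contained sketch as follows. The function name AddResult, the Name field, the IntLimit variable and the made-up scores are all assumptions for illustration; a real Plugin would store whatever fields it needs in each entry.

```lua
-- Hypothetical self-contained version; AddResult, Name and IntLimit are assumptions
TblResults = {}                                  -- Retained results
IntMinimum = 0                                   -- Lowest score worth keeping
IntLimit   = 2000                                -- Prune when the table reaches this size

function AddResult(intScore,strName)
    if intScore < IntMinimum then return end     -- Ignore scores below the retained range
    table.insert(TblResults,{ Score=intScore, Name=strName })
    if #TblResults >= IntLimit then
        table.sort(TblResults,function(tblA,tblB) return tblA.Score > tblB.Score end)
        for i = 1, math.floor(#TblResults / 2) do
            table.remove(TblResults)             -- Remove the lower 50% of the sorted Results
        end
        IntMinimum = TblResults[#TblResults].Score
    end
end

for i = 1, 10000 do                              -- Feed in made-up scores for demonstration
    AddResult(i,"Entry "..i)
end
print("Entries retained:", #TblResults, "Lowest retained score:", IntMinimum)
```

However many scores are fed in, the table never grows beyond the chosen limit, so memory use stays bounded.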
Data Tables
When testing for alternative data values it is tempting to use if … then … else … structures, but when there are more than a few values it can become inefficient. Consider the following where each data reference tag is tested several times:
ptrIndi = fhNewItemPtr()
ptrIndi:MoveToFirstRecord("INDI")
while ptrIndi:IsNotNull() do
    local ptrData = fhNewItemPtr()
    ptrData:MoveToFirstChildItem(ptrIndi)
    while ptrData:IsNotNull() do
        local strTag = fhGetTag(ptrData)
        if strTag == "NAME" then
            -- Handle names
        elseif strTag == "FAMS" then
            -- Handle spouse
        elseif strTag == "SOUR" then
            -- Handle source
        end
        ptrData:MoveNext()
    end
    ptrIndi:MoveNext()
end
The following data table method is more efficient, as each data reference is only tested once. It becomes even more efficient as the number of alternatives increases. The only condition is that all the functions must support the same parameters.
function HandleNames(ptrData)
    -- Handle names here
end
function HandleSpouse(ptrData)
    -- Handle spouse here
end
function HandleSource(ptrData)
    -- Handle source here
end
function Null()
    -- Handle anything else
end
tblWhat = { -- Translate data tag to function
    NAME = HandleNames;
    FAMS = HandleSpouse;
    SOUR = HandleSource;
}
ptrIndi = fhNewItemPtr()
ptrIndi:MoveToFirstRecord("INDI")
while ptrIndi:IsNotNull() do
    local ptrData = fhNewItemPtr()
    ptrData:MoveToFirstChildItem(ptrIndi)
    while ptrData:IsNotNull() do
        local strTag = fhGetTag(ptrData)
        local action = tblWhat[strTag] or Null
        action(ptrData) -- Call one of the functions above
        ptrData:MoveNext()
    end
    ptrIndi:MoveNext()
end