* Using a named list to add a fact to each indi in the list

Post by paultt »

This is a plea for Help!
Is there a plugin, or has anyone developed one, that will loop through a named list and add an existing fact to each individual in that list?
I am using v7, and have tried to build a plugin without any success at all! I cannot get my brain around the coding of Lua plugins!
Currently my gedcom on the 1820 Settlers to South Africa and their descendants has upwards of 175,000 individuals, of which just over 3000 are original settlers whom I have flagged. Using queries, I can determine the names of their direct descendants, who make up nearly 100,000 of those listed, and add them to a named list, 1Settler Descendants'. I can then use another query to determine how many of them DO NOT have a particular FACT and save that result into another list. The fact is:
1 FACT Yes
2 TYPE 1820 Lineage

Is it possible by a plugin to automate my manual process, and has anyone got a 'skeleton' plugin that I could possibly adapt, and use to get started in writing my own plugins?
Thanks

Post by tatewise »

I assume you have explored FHUG Getting Started Writing Plugins and followed its links to further advice.

Unfortunately, the Plugin API does not support direct access to Named Lists.
However, a Plugin to replicate your entire process does not need Named Lists as it can keep its own table of records.
It would loop through all Individual records and remember those with the record Flag in a Lua table.
It would loop through again, testing whether each Individual IsDescendantOf(...) anyone in the table and is missing the 1820 Lineage fact.
It would then add that fact to those Individual records, and possibly create a Result Set list of them for you to review.
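
As a rough illustration of those steps, here is a minimal sketch in plain Lua that models the two passes on an ordinary table instead of real Family Historian records. The record shape (flagged, parents, hasFact) and the helper names are assumptions made up for the example; in a real plugin the descendant test would be the built-in IsDescendantOf function and the records would be reached through the plugin API.

```lua
-- Pure-Lua model of the two-pass idea; no Family Historian API is used.
-- Hypothetical record shape: records[id] = { flagged = bool, parents = { ids }, hasFact = bool }

-- True if childId descends from ancestorId, found by walking up the parent links.
local function isDescendantOf(records, childId, ancestorId)
  local stack, seen = { childId }, {}
  while #stack > 0 do
    local id = table.remove(stack)
    local rec = records[id]
    if rec and not seen[id] then
      seen[id] = true
      for _, parentId in ipairs(rec.parents) do
        if parentId == ancestorId then return true end
        stack[#stack + 1] = parentId
      end
    end
  end
  return false
end

local function addLineageFacts(records)
  -- Pass 1: remember every flagged settler in a table.
  local settlers = {}
  for id, rec in pairs(records) do
    if rec.flagged then settlers[#settlers + 1] = id end
  end
  -- Pass 2: for each record still missing the fact, test it against the settlers.
  local changed = {}
  for id, rec in pairs(records) do
    if not rec.hasFact then
      for _, settlerId in ipairs(settlers) do
        if isDescendantOf(records, id, settlerId) then
          rec.hasFact = true            -- stands in for adding the 1820 Lineage fact
          changed[#changed + 1] = id    -- stands in for a Result Set row
          break                         -- one settler ancestor is enough
        end
      end
    end
  end
  table.sort(changed)
  return changed
end

local records = {
  [1] = { flagged = true,  parents = {},       hasFact = false }, -- settler
  [2] = { flagged = false, parents = { 1 },    hasFact = false }, -- child of settler
  [3] = { flagged = false, parents = { 2 },    hasFact = true  }, -- already has the fact
  [4] = { flagged = false, parents = {},       hasFact = false }, -- unrelated
  [5] = { flagged = false, parents = { 3, 4 }, hasFact = false }, -- later descendant
}
local changed = addLineageFacts(records)
```

The inner loop stops at the first settler ancestor found, so the cost per person depends on how early a match turns up in the settler table.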

Some of the Sample Plugin Scripts perform similar Individual record looping with tests and changes.

If you are still completely stuck then ask again.
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry

Post by paultt »

Thanks Mike for the tips on the process, which I have built in the attached script, but I have become stuck on how to loop through the built table.
Building the Settler table works and produces the correct number of Settlers (2922), who would potentially be all the progenitors. Now I can loop through the gedcom again, discarding known descendants and others, but how do I check whether the record I am on is a descendant of any one of the settlers in tblSettlers, add the FACT if 'Is Descendant Of' returns true, and then break out of that table loop to move on to the next record in the gedcom?

I have attached my code so far, and my issue is between lines 107 and 125

Guidance or sample/actual code will be most welcome. No rush.
Attachment: Set Settler Lineage Fact.fh_lua (5.68 KiB)

Post by tatewise »

The way to loop through a table is to use a for j, k in ipairs (table) do statement. (There is a similar pairs variant.)
e.g.

for num, dptr in ipairs (tblSettler) do
	local bDec = fhCallBuiltInFunction('IsDescendantOf',gptr,dptr)
	if bDec then	-- 1st is desc of 2nd
		local ptag = fhCreateItem(strTag,gptr)
		local isOK = fhSetValueAsText(ptag,"Yes")
		break	-- break out of the for loop
	end
end
To determine the GEDCOM tag value of your 1820 Lineage attribute for strTag use the API once near the beginning:
local strTag, strError = fhGetFactTag("1820 Lineage","Attribute","INDI",false)

Tips:
Don't use "NOT 1820 Settler Direct Descendant" flag as these Individuals may become descendants later.
Don't use "1820 Descendant" flag because your "1820 Lineage" attribute performs the same function and can be tested for existence in much the same way as you test for flags. i.e. if fhGetItemPtr(gptr, "~."..strTag):IsNull() then ...

Post by paultt »

Brilliant, thanks Mike. I will give it a go!

Post by paultt »

Update. I have got my first plugin to work, thanks for the help Mike, but (there is always a but) I am going to have to rethink my process! With 180,000 individuals in the gedcom, 100,000 already known as descendants of the 2900 settlers, that leaves me with about 77,000 others to check. Running the script in debug mode in the editor, it takes about 1 minute per person to check if they are a descendant and set the appropriate flags and facts in the loop. So, a runtime of 77,000 minutes, which is more than 50 days 24/7! Assuming that debug mode is slower than running directly, and that a direct run is 10 times quicker, my math tells me that it is still approx 5 days to run! (Wishing I had thought of this 15 years ago when the gedcom had only 6,000 indis.)
I will have to keep my 'confirmed' Not a Settler descendant flag, and manually/visually based on my knowledge go through the 77000 and set that flag so the script ignores them, and I can possibly reduce the settler loop from the 2900 down to just the progenitor of the settler family, which could be about 500.
I've got some thinking to do, but I have learnt a lot about writing a plugin, and a lot about who I have in my gedcom!
Over and out for now!
Thanks once again

Post by tatewise »

Paul, can I interrogate your calculations?

To check one person, the largest time penalty is calling fhCallBuiltInFunction('IsDescendantOf',gptr,dptr) against up to 2,900 settlers, but on average that function should only be called 1,450 times because the break should abort the loop.
The time to add a fact and flag is trivial.

So you seem to be claiming that fhCallBuiltInFunction('IsDescendantOf',gptr,dptr) called 1,450 times takes a minute.
I've called that function that many times and it only takes a few seconds in debug mode, i.e. 20 times less than 1 minute.
Nevertheless, repeating that for 77K people will still take a long time.
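
To put rough numbers on that, the arithmetic can be worked through in a few lines of plain Lua. The 60 seconds per person is the figure reported from the debug run, and the 3 seconds is an assumed reading of "20 times less than 1 minute"; neither is a measurement made here.

```lua
-- Back-of-envelope runtime estimate; the per-person timings are assumptions taken from the posts above.
local candidates = 77000          -- individuals still to be checked
local averageCalls = 1450         -- average IsDescendantOf calls per person (half of 2,900)

local function totalDays(secondsPerPerson)
  return candidates * secondsPerPerson / (60 * 60 * 24)
end

local totalCalls = candidates * averageCalls   -- 111,650,000 calls overall
local daysAtOneMinute = totalDays(60)          -- about 53.5 days
local daysAtThreeSeconds = totalDays(3)        -- about 2.7 days

print(string.format("%d calls, %.1f days at 60 s, %.1f days at 3 s",
  totalCalls, daysAtOneMinute, daysAtThreeSeconds))
```

Even at the faster rate the run is measured in days, which is why reducing the number of combinations matters more than speeding up any single call.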

This is the consequence of computing with combinations of large quantities of items. Efficiency becomes a major factor.
It is then a good idea to search for shortcuts or ways to reduce the combinations.
For example, are all your Individual records in the same Relationship Pool or does your Project have multiple Pools?
Individuals in different Pools cannot possibly be related so do not need to be checked for 'IsDescendantOf'.

So perhaps the plugin could be adapted to work on one Relationship Pool at a time.
i.e. Run the plugin as now but test each record using fhCallBuiltInFunction('RelationPool',ptr) and only if that matches the currently chosen Pool number will the record be added to tblSettler in the 1st loop or checked in the 2nd loop.
Hopefully, that dramatically reduces the combinational numbers with a consequential reduced run time, and the plugin can be run on batches of one Pool at a time.
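
A small plain-Lua sketch of why batching by Pool can help: it compares the worst-case number of settler/candidate pairs when everything is checked against everything with the number when each Pool is checked only against itself. The pool sizes are invented for illustration and are not taken from any real Project.

```lua
-- Worst-case pair counts with and without per-pool batching.
-- Hypothetical input: pools[poolNumber] = { settlers = n, candidates = m }
local function comparisons(pools)
  local allSettlers, allCandidates, perPool = 0, 0, 0
  for _, pool in pairs(pools) do
    allSettlers = allSettlers + pool.settlers
    allCandidates = allCandidates + pool.candidates
    perPool = perPool + pool.settlers * pool.candidates
  end
  -- first result: every candidate against every settler; second: only within the same pool
  return allSettlers * allCandidates, perPool
end

local unrestricted, batched = comparisons({
  [1] = { settlers = 4, candidates = 10 },
  [2] = { settlers = 1, candidates = 5 },
})
```

The saving shrinks as one Pool comes to dominate the Project, since that Pool's pairs still all have to be checked.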

Post by Jane »

I wonder if it might be quicker, for each settler, to "walk down" the tree and build a table of descendants: a simple table with the reference numbers in. Repeat for each settler, skipping a branch if someone on it is already in the list; this should result in one table covering every settler. At the end you will have a list of all the descendants to check for your flags, facts etc.

This plugin has the function to get lists of Descendants etc
https://pluginstore.family-historian.co ... ant-counts
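
A minimal sketch of the walk-down idea in plain Lua, which is not the code from the plugin linked above. It assumes a hypothetical children table that maps a record id to a list of child ids; in a real plugin those links would come from the family records via the plugin API.

```lua
-- Walk down from each settler and collect descendants once each.
-- Hypothetical input: children[id] = { child ids }, settlers = { ids }
local function collectDescendants(children, settlers)
  local seen, list = {}, {}
  for _, settlerId in ipairs(settlers) do
    local stack = { settlerId }
    while #stack > 0 do
      local id = table.remove(stack)
      for _, childId in ipairs(children[id] or {}) do
        -- A child already in the list has had its branch walked, so skip it.
        if not seen[childId] then
          seen[childId] = true
          list[#list + 1] = childId
          stack[#stack + 1] = childId
        end
      end
    end
  end
  return list
end

local children = {
  [1] = { 2, 3 },
  [2] = { 4 },
  [3] = { 4 },      -- 4 is reachable twice but is listed once
  [5] = { 3, 6 },   -- second settler shares a branch with the first
}
local descendants = collectDescendants(children, { 1, 5 })
table.sort(descendants)
```

Because each individual is visited at most once, the work grows with the number of descendants, not with settlers multiplied by candidates.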
Jane
My Family History : My Photography "Knowledge is knowing that a tomato is a fruit. Wisdom is not putting it in a fruit salad."

Post by paultt »

Thanks for the ideas, Jane.
I will have a look at the code you suggest and experiment with getting subsets of it working for me.
About 80% of the settlers' tree is in one pool, due to the inter-marriages that took place, with the others scattered across separate pools, so I am looking at that. Also, as my current Settler table has approx 2920 indis, I am cutting that back with a different flag to include only the 'head' of each family, and not the spouses and children who were settlers. This should give me an estimated 1100 to loop through instead of the 2920.
Fortunately I will only need to run a successful script before exporting the data to my website, which is about once every two weeks.
Ta.

Post by paultt »

Update: I have managed to build my plugin to do exactly what I want, based on the logic that Jane supplied in her attachment she posted. In the process, I have become quite familiar with some of the peculiarities of lua (being a previous programmer in Assembler, Cobol, Basic, C and Php). However, I have an anomaly!
My script reads through my 175,000 INDI gedcom [78000kb], and builds a table of all those who have a progenitor flag, approx 1100 INDIs. I then read through that table and, using Jane's (modified) code, check each level of children in the descendancy of that progenitor to see whether the Lineage attribute exists; if it does, check the value and update it if necessary from yes to Yes; if it doesn't exist, create the Lineage attribute and the value. Each changed record is written out to another table, which I display at the end of processing. The Progressbar increments with each progenitor completed.
For development and testing I created a duplicate of my master project using the gedcom task of copy project. Between each iteration of testing, I revert the gedcom to the initial copy, so I know I am working on a 'clean' set with records that should have the Lineage fact but don't.
Anomalies are:
1. Current tests using the Edit->Debug->Go and no breakpoints produces no errors and a perfect result, ending with my fhMessage and display of the result table. [ Duration timed as 33 minutes, which I can live with as in my normal process I would only need to run this prior to exporting the gedcom to TNG.]
2. If I reset the gedcom and select my Plugin from the Tools->Plugins and click Run, the Progress bar zips along to about 60% in approx 20 seconds, and then crashes, shutting down FH!
3. If I reset the gedcom and go back to the Edit-> Debug and Select Debug->Run, I get the same effect as 2 above. Crash and closure of FH.
4. If I reset the gedcom and go back to stage 1 again, it runs through to completion in the 33 minutes.
Questions:
a. Has anyone else seen this effect happen with the change from Debug->Go to Debug->Run? If so, what was the cause?
b. Does the plugin editor and debug have a bug?
c. How can I implement a trace in my plugin that will output the details/state of play at the point or immediately before the point of crash and shutdown?

For interest, my plugin script is attached.
Attachment: Set Settler Lineage Factv24.fh_lua (9.82 KiB)

Post by Jane »

Try adding
iup.SetGlobal("CUSTOMQUITMESSAGE","YES");
After the iuplua require

This is needed when you use iuplua in FH7 and you can get all sorts of unexpected results if it's not there.

If that does not fix it, then what I normally do is to use print() to write to the output on the editor, or append messages to a log file, opening and closing the file each time so I know it's getting saved.
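
A minimal sketch of that kind of trace in plain Lua, using only the standard io and os libraries. The file name plugin_trace.log and the function name trace are assumptions for the example; a real plugin would probably want a full path in a folder it is known to be able to write to.

```lua
-- Append one line per call, closing the file each time so the text is on disk
-- even if the program crashes straight afterwards.
local logPath = "plugin_trace.log"   -- hypothetical file name, relative to the current folder

local function trace(message)
  local file = assert(io.open(logPath, "a"))   -- "a" appends instead of overwriting
  file:write(os.date("%H:%M:%S"), " ", message, "\n")
  file:close()
end

trace("starting progenitor 1")
trace("finished progenitor 1")
print("trace written to " .. logPath)   -- print() shows up in the editor's output pane
```

Calling trace just before and just after each suspect step means the last line in the file points at whatever was running when the crash happened.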

Post by paultt »

That's BRILLIANT, thanks Jane.

Adding iup.SetGlobal("CUSTOMQUITMESSAGE","YES"); has solved the run! Now just to tidy up and comment some of the code so I know what I did when I look at it again....old age memory problem!

Post by Ron Melby »

paul,

in CheckDuplicate line 106 you compare eq nill, and that must be a typo. it can't work correctly.

either == nil

or

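function CheckDuplicate(tbl, ptr)	-- tbl rather than table, to avoid shadowing Lua's table library
	local id = fhGetRecordId(ptr)
	if tbl[id] then		-- already seen this record
		return true
	else
		tbl[id] = id	-- remember it for next time
		return false
	end
end

or
function CheckDuplicate(tbl, ptr)
	local id = fhGetRecordId(ptr)
	if tbl[id] then return true end	-- already seen this record
	tbl[id] = id			-- remember it for next time
	return false
end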

or whatever you like.

I have very fat fingers myself.
-
FH V.6.2.7 Win 10 64 bit

Post by tatewise »

Yes, that mistake exists in the original 'Create and Update Ancestor and Descendant Counts' plugin script.

Paul, your plugin does not need the initial warning message about " ... create/update facts containing counts ... " because it does not update those counts but creates '1820 Lineage' attributes instead. For your personal use, the message is redundant, or at least needs rewording.

Post by Jane »

BTW the original plugin does work: nill is never assigned, so it evaluates to nil and the == comparison matches. So although it is incorrect, it does work and will not cause any problems.
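
A short plain-Lua check of that behaviour, assuming standard Lua with no strict-globals module loaded. The table name seen and the id 42 are made up for the example.

```lua
-- `nill` is not a keyword: it is just an undeclared global variable,
-- and reading an undeclared global in standard Lua gives nil.
local seen = {}

local matchesTypo = (seen[42] == nill)   -- true while nothing has assigned nill
local matchesNil  = (seen[42] == nil)    -- true, the intended comparison

-- If any code ever assigns the global, the typo version silently changes meaning.
nill = 0
local afterAssign = (seen[42] == nill)   -- now false
nill = nil                               -- put the global back as it was
```

So the comparison gives the right answer only for as long as nothing else in the script defines a global called nill.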

Post by Ron Melby »

yes, clearly it does, but is undefined behavior, and may not always work, and brings no light to people trying to understand code. It works by serendipity, and that is a very hard mathematical function to produce a proof for.