* Splitting lumped Generic or Templated sources by plugin

Mark1834
Megastar
Posts: 2458
Joined: 27 Oct 2017 19:33
Family Historian: V7
Location: South Cheshire, UK

Post by Mark1834 »

There has been considerable discussion at various times on how to optimise sources when importing a tree from products such as FTM or RM that make extensive use of “lumped” sources, a format that GEDCOM (and therefore FH) doesn’t handle very well for anything other than the simplest examples. A Python script worked well for advanced users, but not everybody will be comfortable with installing additional software and making the necessary customisations.

The same functionality has now been implemented in a standard easy-to-use FH6/7 plugin.

Citations to a lumped source are regarded as equivalent if all of the following match (where present):
  • Where within Source
  • Text from Source (citation)
  • Attached media records (citation)
  • Notes (local and note records)
  • All citation level fields for a Templated source (FH7)
This is typical of how a lumped citation would be characterised in FTM or RM, with one common citation irrespective of how many records and facts it is copied to. When imported to FH, each individual citation is duplicated, making future revisions much more complicated. This problem is particularly pronounced for Census citations, due to the large number of facts that could be cited. For example, a test database that originated in RM had 58 citations to the 1900 Census, but this became 471 in FH due to duplicated entries. After running the plugin, it was back to the original 58.
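
To illustrate the matching rule above, here is a minimal Python sketch of grouping citations by a "fingerprint". It is not the plugin's code (the plugin is written in Lua against the FH API). The dictionary keys are hypothetical names for the fields in the list, and treating an absent field the same as an empty one, and ignoring the order of media links, are assumptions.

```python
from collections import defaultdict

# Hypothetical names for the fields that must match; not FH data references.
FINGERPRINT_FIELDS = (
    "where_within",
    "text_from_source",
    "media",
    "notes",
    "template_fields",
)


def fingerprint(citation):
    """Build a hashable key from the fields that decide equivalence."""
    parts = []
    for name in FINGERPRINT_FIELDS:
        value = citation.get(name)
        if not value:
            parts.append(None)  # assumption: absent and empty compare equal
        elif isinstance(value, dict):
            parts.append(tuple(sorted(value.items())))
        elif isinstance(value, (list, tuple, set)):
            parts.append(tuple(sorted(value)))  # assumption: order is ignored
        else:
            parts.append(value)
    return tuple(parts)


def group_citations(citations):
    """Map each distinct fingerprint to the citations that share it."""
    groups = defaultdict(list)
    for citation in citations:
        groups[fingerprint(citation)].append(citation)
    return dict(groups)


# Four imported citations that were only two citations in the source program.
citations = [
    {"fact": "Birth of John", "where_within": "1856, John Smith"},
    {"fact": "Baptism of John", "where_within": "1856, John Smith"},
    {"fact": "Birth of Mary", "where_within": "1858, Mary Smith", "media": ["O2", "O1"]},
    {"fact": "Baptism of Mary", "where_within": "1858, Mary Smith", "media": ["O1", "O2"]},
]
groups = group_citations(citations)
print(f"{len(citations)} citations -> {len(groups)} split sources")
```

Each group would become one split source, which is how 471 duplicated citations can collapse back to 58.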

Each new split source inherits its Title, Short Title, Repository, etc. (and FH7 Template and source level data fields if relevant) from the original lumped source. For Generic sources, the “Where within Source” is appended to both the Title and Short Title. For example, if the lumped source is “Baptisms, Chelsea St Luke”, and the individual citation is to “1856, John Smith”, the new source is called “Baptisms, Chelsea St Luke: 1856, John Smith”. For Templated sources, up to three citation fields can be specified to form part of the new Title/Short Title.
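
The naming rule can be sketched in a few lines of Python. This is only an illustration, not the plugin's code: the function names are hypothetical, and for Templated sources the way the chosen fields are joined (comma-separated, empty fields skipped) is an assumption, since the post only says that up to three fields can be used.

```python
def generic_title(lumped_title, where_within):
    """Generic source: append the citation's Where within Source to the title."""
    return f"{lumped_title}: {where_within}"


def templated_title(lumped_title, citation_fields, chosen_fields):
    """Templated source: build the suffix from up to three chosen citation fields.

    Joining with ', ' and skipping empty fields are assumptions.
    """
    if len(chosen_fields) > 3:
        raise ValueError("at most three citation fields can form part of the title")
    values = [citation_fields[name] for name in chosen_fields if citation_fields.get(name)]
    return f"{lumped_title}: {', '.join(values)}" if values else lumped_title


print(generic_title("Baptisms, Chelsea St Luke", "1856, John Smith"))
print(templated_title("1900 Census", {"Name": "John Smith", "Page": "12"}, ["Page", "Name"]))
```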

All of the items listed above are moved from the old lumped citation to the new split source, so they are recorded only once. Other fields, such as Date Entered, Rating, Event Responsible (FH7 only) and citation level templated fields, are kept separate for each citation, but are also moved to the new split source.

The plugin does not delete the original lumped source, but it is left with no remaining citations (except any Rich Text hyperlinks in FH7), so can be deleted manually if no longer required.

I’ve tested it on my original FTM import from 2017 and on a test RM database and not found any problems, so feel free to try it out (initially on a copy of your project).

NOTE - Attachment deleted, see version 2 below
Last edited by Mark1834 on 21 May 2021 12:43, edited 1 time in total.
Mark Draper
tatewise
Megastar
Posts: 28341
Joined: 25 May 2010 11:00
Family Historian: V7
Location: Torbay, Devon, UK

Post by tatewise »

Just as an experiment I've run this plugin on the FH v7 Family Historian Sample Project.

The lumped Source Citations for GRO Birth Index and GRO Marriage Index all split perfectly OK.

The Source Citations for GRO Death Index are two identical Citations with blank fields and the plugin says:
"GRO Death Index is not a lumped source!"
I added a third Citation with a Where Within Source field value.
The plugin then says:
"GRO Death Index has just one unique citation, which will be converted to source level data.
Continue processing?"
When I click OK it produces the error:
"...:280: bad argument #2 to 'MoveToRecordById' (number expected, got nil)"
It seems that a Citation without any field values upsets the plugin (although I accept that is rather unlikely).

May I suggest a few refinements:
  1. Allow multiple Source records to be selected within the plugin or before running the plugin.
    Projects imported from products such as FTM & RM need all Source Citations converted and selecting them one by one is tedious.
    In this case, is a progress bar required, or will users still have to confirm each Source needs splitting one by one?
  2. Add fhInitialise(6,0,0,"save_recommended") before calling main().
    That ensures only FH v6.0.0 or later is used and advises that any unsaved changes are saved.
  3. Add a Check Version Store feature to inform users when there is a later version available.
    See lines 1866-1916 in my Flexible CSV Importer plugin.
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry

Post by Mark1834 »

Thanks Mike - thought you might be the first to take it for a test drive... ;)

A plugin should never crash, no matter how outlandish the input data. I had not considered a citation with no data apart from the name of the lumped source (even though CP used one in the Sample Project!), and it crashes when a blank "fingerprint" is used as a table key. Easy to fix for version 2.
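
For anyone following along, here is a rough Python sketch of one way to guard against a blank fingerprint before it is used as a lookup key. It is not the plugin's actual fix: the field names are hypothetical, and leaving blank citations untouched on the original source is an assumption about the desired behaviour.

```python
# Hypothetical field names used to build the fingerprint.
FIELDS = ("where_within", "text_from_source")


def fingerprint(citation):
    """Return '' when the citation carries no identifying data at all."""
    parts = [(citation.get(name) or "").strip() for name in FIELDS]
    return "\x1f".join(parts) if any(parts) else ""


def plan_split(citations):
    """Group citation indexes by fingerprint, setting blank citations aside.

    Returns (assignments, skipped). Blank citations never become a key, so
    no later lookup can come back empty for them.
    """
    assignments = {}
    skipped = []
    for index, citation in enumerate(citations):
        key = fingerprint(citation)
        if not key:
            skipped.append(index)  # assumption: stays on the lumped source
            continue
        assignments.setdefault(key, []).append(index)
    return assignments, skipped


# Mirrors the test case: two blank citations plus one with a Where Within value.
assignments, skipped = plan_split([{}, {}, {"where_within": "1901"}])
print(len(assignments), "split source(s);", len(skipped), "blank citation(s) set aside")
```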

Deliberate choice to keep version 1 to a single source. It keeps testing simple, and once I've established it works as planned on real user data, I can add the ability to process multiple, or even all, sources automatically. No progress bar at present as it is near-instantaneous even on a large project, but a multiple source version will need a simple one.

Not sure about fhInitialise() - I can't test it on FH5, but at the moment I have no reason to exclude that version. Is there a list of functions introduced at Version 6 that I can check against? FH seems to check for unsaved changes without it, as I get a warning if I try to quit FH immediately after running.

This could become a Store plugin once it has been tested thoroughly on real user projects, so that would be the time to look at an update check. I deliberately started a new thread for this so I can keep updates together, as forum plugins could be replaced by later versions without the user knowing.
Mark Draper

Post by tatewise »

The idea behind fhInitialise() is to save changes before running the plugin, just in case it crashes FH and changes are lost.

I assume you have FH v6 installed so check its FH Plugin Help and see if there is a 'What's New in Version 6' page.
Next time I fire up FH v6 I will do the same.
It seems that those Family Historian Plugin Help pages are NOT online via the KB.
There are main FH Help pages for FH v6 and FH v7:
https://www.family-historian.co.uk/help ... ialog.html
https://www.family-historian.co.uk/help ... ialog.html
They both have links to the Family Historian Plugin Help ms-its:fh_plugins.chm::/introduction.htm but they are broken links!

You need to add the Check Version Store feature ASAP so that users of the prototypes get alerted about the Plugin Store version and users of that first published version get alerted about the second version.
My functions cope with no matching plugin in the Plugin Store and say nothing.
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry

Post by Mark1834 »

Ok, clear. fhInitialise() sounds like good discipline for any non-trivial published plugin. I’ll add the update check once I figure out exactly what it’s doing, so that’s my next assignment... :D

There’s nothing I can see that won’t work in FH5, so I won’t exclude it - whether it will ever be used is another question of course!
Mark Draper

Post by Mark1834 »

I've implemented a basic version checker, so users will be alerted automatically to an eventual store update (but not a new development version on this forum). The plugin now also recommends saving any pending changes to the database, and catches any empty source citations.

Dealing with multiple sources needs a bit more thought, as templated sources will always need to be defined individually due to their individual list of field names. That can come later, depending on user feedback on how they anticipate using the plugin.

Note: Attachment deleted as the plugin is now in the plugin store.
Mark Draper

Post by tatewise »

I suspect the behaviour when all Citation fields are empty is what you intended but does not fit the original description:
"The plugin does not delete the original lumped source, but it is left with no remaining citations (except any Rich Text hyperlinks in FH7), so can be deleted manually if no longer required."
In the case of empty Citations, the original lumped Source is left with the empty Citations, so should not be deleted manually.
Users need to be made aware that in these rare cases the original lumped Source will still have Citations.

Otherwise, it is looking good.

Handling lumped templated Source Citations in bulk does pose more of a challenge.
Currently, that would only apply to Projects imported from RM and needs a user dialogue for each Source.
Bulk generic Source Citations should be straightforward and satisfy most users.
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry

Post by Mark1834 »

It should be clear already from the Citation count in the Records Window, but it’s easy enough to alert to any remaining citations at the end of the process.
Mark Draper
BakerJL75
Famous
Posts: 201
Joined: 14 Dec 2020 11:29
Family Historian: V7

Post by BakerJL75 »

I tried this and it worked great! I don't mind running it on one source at a time. Thanks!
Thanks,
Jackie

Post by Mark1834 »

It's worth noting that the way you have presumably done the splitting, by importing the RM7 Source Templates first, and then splitting the templated sources, is safe but tedious. Even after the plugin is modified to allow multiple generic sources to be split with one pass, it will still be necessary to process templated sources individually.

If you apply the two steps the other way around, and split the imported sources before reapplying the templates, the split uses just the Where Within citation text, as for generic sources. The UDF fields are NOT copied to the new split sources and citations, resulting in data loss.

Although the number of users this could impact is small, it will be worth adding a check so the plugin does not split sources and citations where UDF fields are present.
Mark Draper

Post by Mark1834 »

Latest update attached.
  • Now allows multiple source selections. The user is presented with a Yes/No/Cancel confirmation for each source as it is processed, as I think this is a better option than just processing everything with no further confirmations.
  • It does not split sources with an RM template as a UDF (i.e., an RM import where the Source Templates have not yet been imported).
  • It warns about any other source UDF fields, but allows splitting after confirmation.
  • Has a stricter definition of what is a lumped source that can be split (a source with more than one citation with distinct and non-blank citation data such as text, where within, media, etc.), so it cannot accidentally modify a non-lumped source.
  • Optionally deletes all unused sources at the end of the process (those with no links).
  • I have removed the version check against the Plugin Store, as it caused a noticeable lag in starting the plugin, and on occasions has given an error when the store is unavailable. There are other more general methods for checking that plugins are up to date, so this one has no reason to do a specific check in addition.
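
The checks in the list above can be sketched roughly as follows in Python. This is an illustration only, not the plugin's code: the function and flag names are hypothetical, "lumped" is read here as "at least two distinct non-blank fingerprints", and the order in which the checks are applied is an assumption.

```python
def is_lumped(fingerprints):
    """A source counts as lumped if it has more than one distinct,
    non-blank citation fingerprint (one reading of the stricter definition)."""
    distinct = {fp for fp in fingerprints if fp}
    return len(distinct) > 1


def split_decision(fingerprints, has_rm_template_udf=False, has_other_udf=False):
    """Return 'refuse', 'not lumped', 'warn' or 'split' for one source."""
    if has_rm_template_udf:
        return "refuse"      # RM template still a UDF: import the templates first
    if not is_lumped(fingerprints):
        return "not lumped"  # nothing to split, leave the source untouched
    if has_other_udf:
        return "warn"        # splitting allowed only after user confirmation
    return "split"


print(split_decision(["1856, John Smith", "1856, John Smith", "1858, Mary Smith"]))
print(split_decision(["1856, John Smith", "", ""]))
```
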
As expected, the download numbers are relatively low (not an active topic on the forum at the moment, and most users probably reducing their FH time as lockdowns ease and the weather improves :D), but it can be submitted to the Plugin Store later in the summer once we are confident there are no obvious errors.

Note: Attachment deleted as the plugin is now in the plugin store.
Mark Draper

Post by BakerJL75 »

Just wanted to let you know I've been using this and it's working great. Saving me tons of time! I am, however, only doing one at a time as I like to look the source template over to make sure I like it before the split.
Thanks,
Jackie

Post by Mark1834 »

Thanks Jackie. Reassuring to see it’s doing ok with real user data.
Mark Draper

Post by Mark1834 »

Just a thought - users who want to process sources one at a time (and that makes perfect sense for templated sources ex RM) may want to modify line 17 of the plugin code by changing local tblS = fhPromptUserForRecordSel('SOUR') to local tblS = fhPromptUserForRecordSel('SOUR', 1). This gives a single pane selection window that is probably easier to use than having the wasted second pane of the general selector.
Mark Draper

Post by BakerJL75 »

Thanks, I'll try that later.
Thanks,
Jackie
andyho
Newbie
Posts: 1
Joined: 21 Jun 2021 04:52
Family Historian: V7

Post by andyho »

The plugin v0.3 worked just great. My Humpty-Dumpty source in Family Tree Maker (FTM) had 41 death certificates, comprising 446 source citations. Upon importing this source by GEDCOM into Family Historian (FH), my Humpty-Dumpty source broke into 446 pieces. The plugin put Humpty-Dumpty back together, casting him in a slightly different shape. RESULT: 1 source (in FTM) with 41 death certificates was reshaped into 41 sources (in FH), 1 for each death certificate. Nice work!

Post by Mark1834 »

Welcome to the FHUG, Andy. Glad it worked for you. Humpty Dumpty sources were my biggest bugbear when moving to FH. You may have seen it already, but the background to why FH mashes up source citations from FTM and RM is in this Knowledge Base article. Splitting up sources isn't fundamentally "better" than keeping them lumped, but it is a much better fit with the limitations of GEDCOM, and therefore FH design.
Mark Draper
MFriend
Famous
Posts: 111
Joined: 30 Jan 2021 07:43
Family Historian: V7

Post by MFriend »

Hi Mark:
As a test, I used your plugin to convert a copy of my database from lumped sources (imported from FTM 2019) to split sources.
I converted about 66,000 citations (from about 150 sources) with it.

It does exactly what you want it to do... but it won't work for me. I now use FH 7 as my main program, but then I export a GEDCOM and media and import into FTM 2019 so I can upload updated trees with media (my goal is every few months). The issue for anyone using FTM and/or Ancestry is that you can't upload media attached to a source, only to a citation.

My question for you (and you can completely ignore this if it's difficult to do... I have no idea how difficult it might be) is how difficult would it be to have your plugin (or a version of it) copy the citation information instead of moving the citation/media? Yes, I know the information would be repeated in more than one place in the combined source/citation, but I think it would work better for uploading to Ancestry. (Of course, it may not be worth your effort as this is probably a niche situation.)

Matthew

Post by Mark1834 »

Matthew - my first thought is that it should be fairly easy to adapt the plugin to not delete the original citations. I would guide you on how to make the change (editing just a couple of lines of code) rather than post an alternative version. However, it's messy duplication.

That would not keep new sources and citations up to date, so you would have to duplicate the media in both source and citation every time if you wanted that to happen.

One possibility might be to run with your live data fully in the split form, as that works best in FH, and have a fairly simple plugin that generates a copy project with all the media copied from source to citation for upload. It's very niche, but if you wanted to start dabbling in plugins that could be your objective...
Mark Draper

Post by tatewise »

Matthew, in addition to Mark's comments, there are other considerations that I think are still true but open to correction.
FTM only supports Citations on Facts, so in FH you would need to manually apply that restriction.
It also does not allow Citation local Notes.

The Export Gedcom File plugin set for FTM handles some of the above scenarios (and could do more perhaps).
See https://pluginstore.family-historian.co ... maker-2019 for details.
It converts whole record Citations to synthesised Fact Citations.
All Media links in a Source Record are copied to every Citation of that Source, which solves part of your problem.
Perhaps it could copy Text From Source too, but I'm not sure how to create Where Within Source from a Source Record.
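
As a rough illustration of that media-copying step, here is a minimal Python sketch. It is not the Export Gedcom File plugin's code: the record layout and names are hypothetical, and skipping links a citation already holds is an assumption.

```python
def copy_media_to_citations(source_media, citations):
    """Return copies of the citations, each also carrying the source's media links."""
    result = []
    for citation in citations:
        merged = list(citation.get("media", []))
        for link in source_media:
            if link not in merged:  # assumption: avoid duplicate links
                merged.append(link)
        result.append({**citation, "media": merged})
    return result


# Hypothetical media record ids attached to one split source.
source_media = ["O12", "O13"]
citations = [
    {"fact": "Death", "media": []},
    {"fact": "Burial", "media": ["O13"]},
]
for citation in copy_media_to_citations(source_media, citations):
    print(citation["fact"], citation["media"])
```

Working on copies leaves the live split-source data unchanged, which fits the idea of preparing a separate export for upload.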

So, have you tried the Export Gedcom File plugin set for FTM after you have split your sources?
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry

Post by tatewise »

This discussion of handling Media for FTM continues in Exporting Source Citation Media to FTM/Ancestry (19630).
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry

Post by Mark1834 »

This plugin has now been approved for the CP Plugin Store, and is available here. The release version is functionally equivalent to the latest draft (0.3), but with an added top level menu that gives access to the nicer single record menu if you just want to split one source, and a link to online help.
Mark Draper

Post by MFriend »

Hi Mark:

Mike made some changes to his GEDCOM export plugin that cover what is needed for those who might want to use FTM to upload data/media to Ancestry, so there is no need to worry about making any changes to your plugin.

Exporting Source Citation Media to FTM/Ancestry (19630)

Thank you for the plugin :)

Matthew

Post by BakerJL75 »

Thanks. It’s been a great help.
Thanks,
Jackie
Ruth
Gold
Posts: 23
Joined: 27 Jun 2018 07:57
Family Historian: V7

Post by Ruth »

Thank you. Just tried this and it works well. Just sorry I didn't see this earlier.