* Using Excel VBA to convert from lumped to split sources

Importing from another genealogy program? This is the place to ask. Questions about Exporting should go in the Exporting sub-forum of the General Usage forum.
User avatar
Mark1834
Megastar
Posts: 2528
Joined: 27 Oct 2017 19:33
Family Historian: V7
Location: South Cheshire, UK

Re: Using Excel VBA to convert from lumped to split sources

Post by Mark1834 »

One final refinement - I have split out the source splitting and general format tidying into two separate workbooks, as they perform different functions and the latter is purely optional (a neat and tidy import to FH rather than hundreds of comments in the log files).

The source splitting is now completely generalised, and I have set it up to automatically find and split all sources that contain terms like Baptism Register, Probate Calendar, Census, etc. The new source name is generally a combination of the original lumped source, the citation, and other fields as appropriate to the source type to make it more readable to us mere humans, and hopefully keep it unique (at least in the context of my databases). Examples include
Baptism Register, Chelsea St. Luke, MDX: John Smith, 22 MAR 1817
Census, 1851: Dover, Kent, England (HO107/1632, folio 471 p33)
England and Wales National Probate Calendar: 1925, Jane Smith, and so on.
Adding both names to marriage sources is slightly more complicated, but I might do that later. It also adds an appropriate Source Type automatically.
Duplication is identified on the basis of the new name, so the final GEDCOM contains just one unique copy of each source. All media file links are transferred, but they are now source media rather than citation media.

As well as the conventional GEDCOM file, the output comprises an Excel workbook containing the original and new GEDCOM, plus a list of all the sources in the original file with those split highlighted for audit purposes.

Although it is not yet tested thoroughly, a quick check of a few randomly selected records did indicate that it also works as intended on the FH database GEDCOM, not just the FTM import. So in principle it could also be useful for established databases, not just new imports.

As discussed before, this is a rather niche application, as it relies on a third party software package (MS Excel) that not everybody will have, and the code does need some customisation to how the user names their sources. But it could be useful for demonstration pending the development of a user-friendly generalised FH plug-in, or even a simple stand-alone source splitter. It has at least saved me from a lot of tedious re-keying or copy/pasting as I gradually review my data over the coming weeks :) ....
Mark Draper
User avatar
Mark1834
Megastar
Posts: 2528
Joined: 27 Oct 2017 19:33
Family Historian: V7
Location: South Cheshire, UK

Re: Using Excel VBA to convert from lumped to split sources

Post by Mark1834 »

This is quite an old thread now that has probably run its course, but for completion I include below my final crib sheet for my FTM - FH migration. It is slightly high level rather than detailed "how to" instructions, and parts will be specific to my recording style and use of Excel as a GEDCOM processor, but it may serve as input to a more general FH plug in converter in the future.

1. In Family Tree Maker:

a. Identify new source types that will be split and reallocate individual citations as necessary (e.g. separating GRO certificates from index entries).
b. Identify CENS/DEAT/BURI records with notes, identify any incorrect allocations (e.g. individual note recorded as source note or vice versa), and combine note with description. (Optional step but avoids multiple notes for the same event, which personally I find a little neater).
c. Add individual citation details to source records that will be split, such as party details for Parish Register Marriages where hitherto all data has been stored in the text field. Splitting can be on basis of just the citation or the full text. For simplicity, I split on citation (PRO census reference, GRO index details, etc), but my Parish Register entries had no citation with all the details in the text. The splitting macro adds party names and dates automatically for events related to individuals, but this work around avoids the more complex processing required to add both party names to family events such as marriage.
d. Once data are finalised in FTM, export to GEDCOM with ANSI character set. No further updates should be made to FTM database.

2. In Excel

a. Process source converter to produce interim file of split records. Inspect list of split sources to ensure all required splittings have taken place (i.e. need to ensure source names are consistent between Excel macro and FTM).
b. Process this intermediate file using the format tidier to produce a final GEDCOM file ready for import into FH. (Optional step, but makes for a much simpler and neater import log and also handles a couple of other errors/omissions in the FTM file that FH misses).

3. In Family Historian

a. Start new database by importing final Excel-derived GEDCOM file. Inspect import logs to ensure no errors encountered.
b. Run plug-in to convert single-use note records to in-line event notes.
c. Delete source records with no citations (the original lumped source titles from census, parish register, civil registration and probate entries).
d. Practice with FH before committing this as your live database. In particular, assess the use of Ancestral Sources for new data entry (if you are converting from another program, you may have a lot of data already set up slightly differently - nothing wrong with that, so case by case whether it works for you).
Mark Draper
Post Reply