Import of RM7 Citations

Writing and using plugins for Version 5 and above.
Mark1834
Megastar
Posts: 1030
Joined: 27 Oct 2017 19:33
Family Historian: V7
Location: South Cheshire, UK

Re: Import of RM7 Citations

Post by Mark1834 » 04 May 2021 18:33

Thanks, that makes sense. I’ll send you a PM. I’m getting increasingly confident that we are seeing anomalies in the RM export rather than plugin errors, so I’ll tweak the code to continue processing the file and list potential errors at the end, rather than aborting the process if it sees something it doesn’t understand.
Mark Draper

BakerJL75
Famous
Posts: 162
Joined: 14 Dec 2020 11:29
Family Historian: V7

Re: Import of RM7 Citations

Post by BakerJL75 » 04 May 2021 21:27

Files sent
Thanks,
Jackie

Re: Import of RM7 Citations

Post by Mark1834 » 04 May 2021 22:34

Had a quick look tonight before packing up. My suspicions were correct. All the errors are data anomalies in the RM GEDCOM export, either a source level field being recorded at citation level, or a field being mis-named when cited.

The non-living extract you sent me generated three errors due to source/citation confusion, all for the same source record, and one mis-named field. It may not be coincidence that they all relate to the same individual, Amelia Auvil. They are different errors to the ones you reported on the full database, suggesting that they are not linked to specific records.

The solution is clear. I can't do anything about how RM structures its GEDCOM file, so I will modify the plugin to tabulate data errors for manual checking later, but not abort the program when one occurs.
Mark Draper
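
The "tabulate errors rather than abort" approach described above can be sketched as follows. This is a minimal Python illustration only, not the plugin's actual code (Family Historian plugins are written in Lua); `Anomaly`, `process_item`, and `process_all` are hypothetical names, and the "TAG value" item format is an assumption made purely for the example.

```python
from dataclasses import dataclass


@dataclass
class Anomaly:
    line_no: int   # position of the offending item in the input
    item: str      # the raw item that could not be processed
    reason: str    # why it was rejected


def process_item(item: str) -> str:
    """Hypothetical per-item step: accept 'TAG value' pairs, reject anything else."""
    tag, sep, value = item.partition(" ")
    if not sep or not value:
        raise ValueError("missing value")
    return f"{tag}={value}"


def process_all(items):
    """Process every item, recording failures instead of aborting on the first one."""
    results, anomalies = [], []
    for line_no, item in enumerate(items, start=1):
        try:
            results.append(process_item(item))
        except ValueError as exc:
            # Keep going; the anomaly is listed for manual checking later.
            anomalies.append(Anomaly(line_no, item, str(exc)))
    return results, anomalies
```

The caller gets both the successfully processed items and a list of anomalies that can be reported at the end of the run.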

Re: Import of RM7 Citations

Post by BakerJL75 » 04 May 2021 22:53

Thanks for looking. And that seems like a fine solution. Thanks
Thanks,
Jackie

tatewise
Megastar
Posts: 22765
Joined: 25 May 2010 11:00
Family Historian: V7
Location: Torbay, Devon, UK

Re: Import of RM7 Citations

Post by tatewise » 05 May 2021 09:18

In FH there is the File > Validate... command, and in RM7 there are File > Database Tools... such as Test database integrity, which may repair those GEDCOM data anomalies.

FH also has the Compare/Sync Source Templates feature. Does RM7 have anything similar?

Jackie, have you any recollection of that CreditLine field possibly being a Citation field that has since become a Source field in the Source Template definition?
If Source Citations had been created using CreditLine while it was a Citation field, they might get left behind in the database after it gets changed to be a Source field.
The Test database integrity tool might correct that anomaly.
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry

Re: Import of RM7 Citations

Post by BakerJL75 » 05 May 2021 11:45

I ran all the database tools as soon as this came up. I've also run the database tools in Tom Holden's RMTrix which will sometimes catch things that RM misses.

RM7 automatically syncs the source template with the sources/citations as soon as you change them.

I used SQLite to examine (a copy of) the database. All the source tables look correct. For the specific sources we are having issues with, I also followed the links and they look correct. (Links may be the wrong term, I haven't done any SQL work in 30 years.)

It is possible the CreditLine was once a citation moved to a source, but I don't think it's likely I would do that. I always copy the RM templates and then modify. And I'd have no reason to have changed the CreditLine. And I checked the field in the source table to make sure CreditLine was a source.

All that said, this is an old, well-used database. It originally came from TMG, then Legacy for a while, then RM for a long while. So over the years, anything could have happened.

For me at least, Mark's plan to "modify the plugin to tabulate data errors for manual checking later" is a good solution. Even if I have a larger number of problem templates to correct manually, it will still save me a lot of work in changing my sources.
Thanks,
Jackie

Re: Import of RM7 Citations

Post by tatewise » 05 May 2021 11:53

That raises the question of where the GEDCOM CreditLine citation field gets its:
VALUE citing Pennsylvania Historical and Museum Commission, Harrisburg, Pennsylvania, Pennsylvania County Marriages, 1852-1973; County: Fulton; Year Range: 1947 - 1956; Roll Number: 549832

Does that text VALUE exist anywhere in the SQL?
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
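
One way to answer that kind of question is to search every column of every table in a copy of the database for a distinctive fragment of the text, such as the roll number. The sketch below is a generic Python illustration using the standard `sqlite3` module; `find_text` is a hypothetical helper, it assumes ordinary rowid tables, and nothing in it relies on the actual RM table layout, which is not described in this thread.

```python
import sqlite3


def find_text(db_path: str, needle: str):
    """Return (table, column, rowid) for every cell whose text contains needle."""
    hits = []
    con = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        for table in tables:
            # PRAGMA table_info returns one row per column; index 1 is the name.
            cols = [row[1] for row in con.execute(f'PRAGMA table_info("{table}")')]
            for col in cols:
                # CAST lets the search also cover values stored as BLOBs.
                query = (f'SELECT rowid FROM "{table}" '
                         f'WHERE instr(CAST("{col}" AS TEXT), ?) > 0')
                for (rowid,) in con.execute(query, (needle,)):
                    hits.append((table, col, rowid))
    finally:
        con.close()
    return hits
```

Run it against a copy of the database, never the working file. An empty result would suggest the text is generated at export time rather than stored.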

Re: Import of RM7 Citations

Post by Mark1834 » 07 May 2021 22:31

All sorted now, thanks to Jackie kindly sending me an RM original of a large test file. It does appear that the RM GEDCOM export is flawed. However, the reported errors in a random sample were all either source-level data duplicated in the citation or added fields that were not present in the RM original. So far, we have found no examples of data being lost or corrupted. The large file gave around 400 such reports in 60k items of data being moved from UDF fields.

To recap, the plugin does 4 things consecutively:
  1. Import the user-defined source templates from the RM GEDCOM file that FH discards. The resulting Source Template records are virtually identical to the RM originals.
  2. Reconstruct the built-in source templates as far as possible.
  3. Link RM sources back to their original templates, recreating as much as possible of the detailed structure that was lost when imported into FH.
  4. Move all the granular source data that are hidden in UDF fields back to their proper structure in the templated sources.
When the plugin encounters a data anomaly, the offending data are left as UDF fields and a Note record is created to record the anomaly. This has details of exactly which template, source, fact, field, and individual/family it relates to, with clickable links for easy checking. If the attached Query is used to format the Records Window columns, you can view a spreadsheet-like table of all the anomalies that can be sorted on any column.

Over to the ex-RM users to try it out if you are interested. Test on a copy of your project first. It doesn’t matter if you have updated your project since importing from RM. The only things it uses the GEDCOM for are the template definitions. No old data are imported, and no project data are amended or deleted. It only moves UDF data back to their proper place.
Attachments
RM Source Import Error Records.fhq (1 KiB)
Import RootsMagic 7 Templated Sources (0.6).fh_lua (26.51 KiB)
Mark Draper

Re: Import of RM7 Citations

Post by Mark1834 » 08 May 2021 09:21

As a wet weekend experiment, I have imported the test database into RM8 and exported from there. As far as I can tell, the results were identical. Good news - the plugin works on an RM8 export as well with no changes. Bad news - the data anomalies have not been fixed.
Mark Draper

Re: Import of RM7 Citations

Post by BakerJL75 » 14 May 2021 18:48

I've imported my sources and have been 'playing' with them. As far as I can tell, the imported source definitions are in the Gedcom, but not in the Project database. It would be nice to be able to export the definitions, clean them up with an editor, and then import them back. For example, RootsMagic is happy with <[AccessType]|viewed]>, but FH7 needs it to be <[AccessType]>. Or if a URL is hard-coded with three slashes, like http:///www.ancestry.com, then FH7 needs only two, i.e. http://www.ancestry.com

I'm doubtful it can be done, but I'm always amazed what the folks in the forum know.
Thanks,
Jackie
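
If the template text can be got into an editor or a script, the extra-slash clean-up mentioned above is a one-line substitution. A minimal Python sketch follows; `fix_url_slashes` is a hypothetical helper, and how the template text would be exported and re-imported is not covered here.

```python
import re

# Matches "http:" or "https:" followed by three or more slashes.
_EXTRA_SLASHES = re.compile(r"\b(https?:)/{3,}")


def fix_url_slashes(text: str) -> str:
    """Collapse 'http:///host' to 'http://host'; other text is left unchanged."""
    return _EXTRA_SLASHES.sub(r"\1//", text)
```

Only http and https URLs are touched, so a legitimate three-slash scheme such as file:/// is left alone.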

Re: Import of RM7 Citations

Post by BakerJL75 » 14 May 2021 19:35

I will correct myself. It looks like the Vertical bar construction should work. I'll play with it some more.
Thanks,
Jackie

Re: Import of RM7 Citations

Post by Mark1834 » 15 May 2021 07:49

Jackie, I'm not clear what you mean by "the imported source definitions are in the Gedcom, but not in the Project database". You can view the source template definitions (where features such as footnotes etc are defined) in the Records Window in the same way as any other record type, and edit in the Property Box.

If you don't have a Source Templates tab in the Records Window, it will be because it is not enabled (FH7 hides it by default, even if you are using Templated Sources). On the main FH menu, select Tools > Preferences > Records Window. Set the drop-down option for Source Templates to either Hide if none or Always show.
Mark Draper

Re: Import of RM7 Citations

Post by BakerJL75 » 15 May 2021 11:59

Sorry I wasn't clear. If you go to Tools, Source Template Definitions, the imported templates are not there. They are only in the Records Window. That's not a problem unless you want to export them to edit or use them in a different project. File, Import/Export, Source Template Collection does not 'see' them, so you can't export them.

It's not a big issue, I just wondered if I was missing something simple.
Thanks,
Jackie

ColeValleyGirl
Megastar
Posts: 3236
Joined: 28 Dec 2005 22:02
Family Historian: V7
Location: Cirencester, Gloucestershire

Re: Import of RM7 Citations

Post by ColeValleyGirl » 15 May 2021 12:09

It sounds as if the plugin is creating the source template record but not the corresponding source template definition. And I don't believe there's a way to create a source template definition from a record outside a plugin, so it would have to be done by creating a text file representing the 'collection' within the plugin.

Re: Import of RM7 Citations

Post by BakerJL75 » 15 May 2021 12:26

Thank you. That is a much more succinct explanation. And I rather thought it would take a plugin. For my purposes, it's not worth the additional work to write the plugin. Just wanted to check.
Thanks,
Jackie

Re: Import of RM7 Citations

Post by Mark1834 » 15 May 2021 13:08

I had not considered this extra refinement. If it is a simple mod to create the definition as well as the template, it would make sense to tweak the plugin to do both at the same time. One more new FH7 feature to learn... ;)
Mark Draper

Re: Import of RM7 Citations

Post by tatewise » 15 May 2021 13:33

Mark, in case you did not know, the master Source Template files are held in the ProgramData folder:
C:\ProgramData\Calico Pie\Family Historian\Source Templates\...
The plain text file format looks quite straightforward, so should not pose a major problem for your Plugin.
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry

Re: Import of RM7 Citations

Post by ColeValleyGirl » 15 May 2021 14:06

I haven't looked, Mark, but you will need to check if the file needs to be written as UTF-16 (as, for example, Fact Set files are).

Re: Import of RM7 Citations

Post by BakerJL75 » 15 May 2021 14:06

Mark, that would be great if you have time.
Thanks,
Jackie

Re: Import of RM7 Citations

Post by tatewise » 15 May 2021 14:32

Notepad says Source Template .fhst files are UTF-8 with BOM format.

I wonder why Fact Type .fhf files need to be UTF-16 LE format. That looks like a mistake to me.
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry

Re: Import of RM7 Citations

Post by ColeValleyGirl » 15 May 2021 14:46

Fact sets have been UTF-16 since V6 (and before) -- Research Planner has to read and write them... There's an API in V7 to read fact info but not to save it.

Re: Import of RM7 Citations

Post by tatewise » 15 May 2021 14:54

Digging around, it seems most customisation files use UTF-16 LE format.
Exceptions are Autotext, Source Templates, and Plugins.
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
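
The difference between the two formats is visible in the first few bytes of the file. As a rough Python illustration (not plugin code, which would be Lua), the sketch below identifies a file's byte order mark and writes text in either format with an explicit BOM; `detect_bom` and `encode_with_bom` are hypothetical helpers.

```python
import codecs


def detect_bom(data: bytes) -> str:
    """Name the encoding implied by a leading byte order mark, if any."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return "unknown"


def encode_with_bom(text: str, encoding: str) -> bytes:
    """Encode text as UTF-8 or UTF-16 LE, always prefixed with its BOM."""
    if encoding == "utf-8-sig":
        return codecs.BOM_UTF8 + text.encode("utf-8")
    if encoding == "utf-16-le":
        return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    raise ValueError(f"unsupported encoding: {encoding}")
```

Reading the first bytes of an existing file with `detect_bom` before writing a new one of the same type is a cheap way to match whatever format the program already uses.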

Re: Import of RM7 Citations

Post by ColeValleyGirl » 15 May 2021 15:05

Yes, I've just checked -- Research Planner has to write UTF-16 queries as well. I spent a lot of time looking for a Lua library that would handle UTF-16 as well as UTF-8 -- there was one that was lightning fast but I couldn't get it to build and work reliably... It's still on my todo list! Either that or sorting out something myself that uses the native Windows capability. Note to self: Check how FileSystemObject and FileStream objects handle it.

Re: Import of RM7 Citations

Post by tatewise » 15 May 2021 15:21

In the meantime, I recall that you are using my encoder library technique.
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry

Re: Import of RM7 Citations

Post by ColeValleyGirl » 15 May 2021 15:26

Correct, although I created a severely cut down version of your 'module' to do exactly what I needed.
