* Duplicate Media Records

rwh46
Newbie
Posts: 2
Joined: 16 Feb 2015 21:09
Family Historian: V6
Location: Cayman Islands

Duplicate Media Records

Post by rwh46 »

I have recently imported my family tree into FH6 from RootsMagic 7 via GEDCOM. However, every media item in RootsMagic that is linked to multiple facts now appears as multiple individual media records in FH6. For example, a single image (picture) of a church linked to, say, 12 different baptism facts in RootsMagic has been imported into FH6 as 12 separate media images, each one linked to one of the individual baptism facts.

I have run the "Check for Possible Duplicate Media" plugin and it has identified 5,106 duplicate records in the result set. I have started the merge process in order to test it, and it works just fine, but I am wondering if there is any way to automatically merge and eliminate the 5,106 duplicates without having to do it one line at a time, which is a rather laborious and time-consuming task to say the least.
tatewise
Megastar
Posts: 28341
Joined: 25 May 2010 11:00
Family Historian: V7
Location: Torbay, Devon, UK

Re: Duplicate Media Records

Post by tatewise »

This is a follow-up to a [FHU] E-mail in which I offered to modify the Plugin next week to perform the merges automatically.

In the meantime, please try the direct import of the .rmg file into FH as per the Knowledge Base article Import from Roots Magic. If that advice is now obsolete and .rmgc files also do not import, then the KB can be updated.
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
rwh46
Newbie
Posts: 2
Joined: 16 Feb 2015 21:09
Family Historian: V6
Location: Cayman Islands

Re: Duplicate Media Records

Post by rwh46 »

Mike,

The rmg file extension applies to Roots Magic 1, 2 and 3 only. From RM v4 onwards (now at RM v7) the file extension is rmgc which is not recognized by Family Historian.

Regards

Richard
tatewise
Megastar
Posts: 28341
Joined: 25 May 2010 11:00
Family Historian: V7
Location: Torbay, Devon, UK

Re: Duplicate Media Records

Post by tatewise »

I have updated the Knowledge Base for Roots Magic.

I have modified the Check for Possible Duplicate Media Plugin and attached the slightly renamed Check for Possible Duplicated Media version 1.3.1 dated 24 June 2015.
Just double-left-click on the link to install it into FH.

I advise you to take a File > Backup/Restore > Small Backup before running this Plugin.

It runs as before, but now offers the option to merge Duplicate Media.
If you answer Yes it then:
  • Checks the Duplicate field values (Title, Date, Format, etc) are same as main Record
  • Replaces each link to Duplicate media with a link to main Record
  • Deletes each Duplicate media record (but not any duplicate files)
  • Updates final Result Set to indicate "Records Merged"
Check the merges are OK, and if anything is unsatisfactory use Edit > Undo Plugin Updates before closing FH.
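
The merge steps listed above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Plugin's actual code (FH plugins are written in Lua): `MediaRecord`, `Fact` and `merge_duplicates` are hypothetical names, records are held in a plain dictionary, and only Title, Date and Format are compared.

```python
from dataclasses import dataclass, field


@dataclass
class MediaRecord:
    # Hypothetical stand-in for a Media Record; not the FH data model.
    rec_id: int
    file: str
    title: str = ""
    date: str = ""
    fmt: str = ""


@dataclass
class Fact:
    # Hypothetical stand-in for a fact that links to Media Records by id.
    name: str
    media_ids: list = field(default_factory=list)


def merge_duplicates(records, facts, main_id, duplicate_ids):
    """Merge duplicate records into the main record.

    Returns (duplicate_id, status) rows, loosely mirroring a Result Set.
    """
    main = records[main_id]
    results = []
    for dup_id in duplicate_ids:
        dup = records[dup_id]
        # Step 1: merge only when the compared fields match the main record.
        if (dup.title, dup.date, dup.fmt) != (main.title, main.date, main.fmt):
            results.append((dup_id, "Fields differ, skipped"))
            continue
        # Step 2: repoint every link from the duplicate to the main record,
        # dropping any repeated ids while preserving their order.
        for fact in facts:
            relinked = [main_id if m == dup_id else m for m in fact.media_ids]
            fact.media_ids = list(dict.fromkeys(relinked))
        # Step 3: delete the duplicate record (files on disk are untouched).
        del records[dup_id]
        # Step 4: note the outcome for the final result listing.
        results.append((dup_id, "Records Merged"))
    return results
```

Skipping records whose fields differ is a cautious design choice in this sketch: anything not merged stays visible for manual review.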

Let me know if there are any problems with using the modified Plugin.

Finally, you can then run the original Check for Possible Duplicate Media Plugin and also the Check for Unlinked Media Plugin in case any duplicate files need deleting.
Last edited by tatewise on 11 May 2023 15:20, edited 1 time in total.
Reason: Attachment Check for Possible Duplicated Media.fh_lua deleted ~ ask Mike Tate (tatewise) if you need a copy
ian-8864
Silver
Posts: 5
Joined: 16 Mar 2021 15:34
Family Historian: V7

Re: Duplicate Media Records

Post by ian-8864 »

Hi Mike,

I have had the exact same issue as Richard (rwh46) with duplicate media created when importing my family tree into FH7 via a Gedcom created in RootsMagic 7. I ended up with over 2,800 duplicates.

I came across this post whilst trying to resolve my duplicates issue. I downloaded your Check for Possible Duplicated Media plugin and tried to run it in FH7. However, it would not run. It produced an error message.

As I knew that you had developed the plugin for FH6, I then logged onto an alternative PC that still had my FH6.2 installation on it. I went through the same process to import the Gedcom and then ran your Check for Possible Duplicated Media plugin. This time it successfully found all of the duplicate media records and using the merge option I was able to merge them together via your plugin. I was then able to move the newly created FH6.2 family tree to my new PC where FH7 is installed. When I opened the tree in FH7 it was upgraded as it had been created in FH6.2. Thank you so much - this has saved me so much time and effort.

Do you have any plans to upgrade this Check for Possible Duplicated Media plugin to run in FH7? It would certainly be a great help to me and I presume others to have a FH7 version that will not only find but also have the option to merge duplicate media records.

Kind Regards

Ian
tatewise
Megastar
Posts: 28341
Joined: 25 May 2010 11:00
Family Historian: V7
Location: Torbay, Devon, UK

Re: Duplicate Media Records

Post by tatewise »

Hi Ian,
I'm glad that worked for you.
To be honest, I wish I did not have to spend so much time & effort fixing what FH should do itself.
Would you be prepared to lend your weight to our complaints to Calico Pie about this problem?

It comes about from the format RM exports Media in the GEDCOM. (Other products are the same.)
It uses what we call Local Media Objects (LMO) instead of separate linked Media Records.
It is similar to the difference between local Note fields and separate linked Note Records.
So there can be multiple LMO all linked to the same Media file.

When FH imports that RM GEDCOM it converts LMO to Media Records.
However, it takes no notice of the duplicates and creates multiple identical Media Records and multiple copies of the file.
This has been reported years ago and again during FH V7 beta testing but never fixed.
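
To make the difference concrete, the Python sketch below counts Local Media Objects per file in a GEDCOM fragment. The sample text is a simplified, hand-written illustration rather than a real RootsMagic export, and `count_local_media` is a hypothetical helper. It assumes an LMO is an OBJE line below level 0 with no @...@ pointer, with its FILE line one level beneath.

```python
from collections import Counter

# Two baptisms carry their own inline OBJE (LMO) for the same file;
# the third links to a separate Media Record via the pointer @O1@.
SAMPLE = """\
0 @I1@ INDI
1 BAPM
2 OBJE
3 FILE church.jpg
3 TITL Parish church
0 @I2@ INDI
1 BAPM
2 OBJE
3 FILE church.jpg
3 TITL Parish church
0 @I3@ INDI
1 BAPM
2 OBJE @O1@
0 @O1@ OBJE
1 FILE census.jpg
"""


def count_local_media(gedcom_text):
    """Count inline (local) media objects per FILE value."""
    counts = Counter()
    obje_level = None  # level of the LMO currently being read, if any
    for raw in gedcom_text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        level, tag = int(parts[0]), parts[1]
        value = parts[2] if len(parts) > 2 else ""
        # Leaving the current LMO once the level drops back to it or above.
        if obje_level is not None and level <= obje_level:
            obje_level = None
        if tag == "OBJE" and level > 0 and not value.startswith("@"):
            obje_level = level
        elif tag == "FILE" and obje_level is not None and level == obje_level + 1:
            counts[value] += 1
    return counts
```

Any file whose count is above 1 is a candidate for ending up as several identical Media Records after import.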

Similar symptoms arose recently on the Family-Historian@groups.io forum: Import - multiple copes of media?

Ian, please report your symptoms to Calico Pie via their Support Ticket System (http://www.calico-pie.com/osticket/open.php) and report back what they say.

If they persist in not fixing the bug, then I'll have to update the Plugin for FH V7 but I'd rather not.
Jane
Site Admin
Posts: 8508
Joined: 01 Nov 2002 15:00
Family Historian: V7
Location: Somerset, England

Re: Duplicate Media Records

Post by Jane »

Mike, that is NOT a bug; it is working as designed, even if it's not ideal, and a change for this should be a wish request, not a bug report.

Encouraging users to complain about minor problems will simply overload CP with trivial problems which can be sorted using a plugin easily enough.

If you are not happy to update the Duplicate Media Plugin, I will do it.
Jane
My Family History : My Photography "Knowledge is knowing that a tomato is a fruit. Wisdom is not putting it in a fruit salad."
tatewise
Megastar
Posts: 28341
Joined: 25 May 2010 11:00
Family Historian: V7
Location: Torbay, Devon, UK

Re: Duplicate Media Records

Post by tatewise »

How are users meant to know it is working as designed?
As far as I am aware, CP have never responded to either of my reports, or to anyone else's, saying it is as designed.

IMO creating multiple identical Media Records and multiple identical copies of the same Media file is a bad design.
It happens to any users importing from Legacy and RootsMagic and other products.
It cannot create a good impression to such newcomers who won't yet know about Plugins and if using the 30-Day Trial may decide to go elsewhere. The user on groups-io said: "... your explanation really surprised me! ... trying to sort out the media problem and whether I should choose another program altogether."

My first report was in 2015 so it could have been resolved years ago. The response then was: "Thanks I have passed this one over to the developers." and nothing since. The response in FH V7 beta testing was similar.

Jane, you are welcome to incorporate the additional function of auto-merging Media records into your existing Check for Possible Duplicate Media Plugin which was the basis of my adaptation. Perhaps that should have happened back in 2015!
ian-8864
Silver
Posts: 5
Joined: 16 Mar 2021 15:34
Family Historian: V7

Re: Duplicate Media Records

Post by ian-8864 »

Hi Mike,

Many thanks for your prompt reply.

As requested I have raised a Support Ticket (881977) for this issue. I included a file containing three redacted entries from the Gedcom which were linked to the same media file but created three separate media files when imported into FH7.

I will post here when I get a response.
Regards
Ian

PS I only saw the subsequent post from Jane and your reply after I had raised the ticket.
ian-8864
Silver
Posts: 5
Joined: 16 Mar 2021 15:34
Family Historian: V7

Re: Duplicate Media Records

Post by ian-8864 »

My support ticket has been closed with a status of Active with the following comment:
"Thank you for your email, we have passed it over to our development team for review."

Regards
Ian
tatewise
Megastar
Posts: 28341
Joined: 25 May 2010 11:00
Family Historian: V7
Location: Torbay, Devon, UK

Re: Duplicate Media Records

Post by tatewise »

So it wasn't rejected on the grounds that it is working as designed?
ColeValleyGirl
Megastar
Posts: 5465
Joined: 28 Dec 2005 22:02
Family Historian: V7
Location: Cirencester, Gloucestershire

Re: Duplicate Media Records

Post by ColeValleyGirl »

tatewise wrote (17 Mar 2021 18:00): "So it wasn't rejected on the grounds that it is working as designed?"
Unlikely a first line support person could make that call. The developers however can mark it as such in their system (or more likely attach it to the existing reports that are already marked as working as designed.) All that has been accomplished is a waste of support and development time. (@ian-8864, this isn't aimed at you -- you followed Mike's poor suggestion in good faith)
tatewise
Megastar
Posts: 28341
Joined: 25 May 2010 11:00
Family Historian: V7
Location: Torbay, Devon, UK

Re: Duplicate Media Records

Post by tatewise »

That still does not answer my question: How are users meant to know it is working as designed?

Users are perfectly at liberty to report whatever they consider is working unsatisfactorily.

As I said before I am not aware of any reports that are already marked as working as designed.
Neither of my reports have been marked that way.
ColeValleyGirl
Megastar
Posts: 5465
Joined: 28 Dec 2005 22:02
Family Historian: V7
Location: Cirencester, Gloucestershire

Re: Duplicate Media Records

Post by ColeValleyGirl »

tatewise wrote (17 Mar 2021 18:39): "Users are perfectly at liberty to report whatever they consider is working unsatisfactorily."
Of course they are. In this case, you've already reported it multiple times, and encouraged other users to report it -- it's the multiple reports that (as you know) I believe are ill-advised as a waste of everybody's time including yours.
tatewise wrote (17 Mar 2021 18:39): "Neither of my reports have been marked that way."
They wouldn't be if all we see is the first-line ticket system, which I believe to be the case.
ian-8864
Silver
Posts: 5
Joined: 16 Mar 2021 15:34
Family Historian: V7

Re: Duplicate Media Records

Post by ian-8864 »

I am sorry if I have resurrected a long-standing query when I did my first post last week. However, I do have to agree with Mike that it seems like poor design to me. I believe that the issue has the potential to impact the data integrity of your family tree.

I have sent a follow-up to my original support ticket, intended for the development team to whom the call has been passed. This is the gist of what I have added to the ticket.

In my specific case, some media files (primarily census records for families that inherently have a lot of links to them) have been replicated over 30 times by the FH7 import process! In total the import process created approx. 2,850 duplicate media files from an initial count of about 880.

I cannot understand the rationale behind designing an import system that processes input records to create multiple copies of the same record/file. In my humble view it is simply not good practice.

Surely when processing an input file (gedcom) with links to other files (media), there have to be some checks done. When the import process comes across a record in the input file containing a link to another file (i.e. the media), one of the first steps is to check "have I processed this media file before and therefore already created a copy in the media folder?". If the answer to that question is "yes", then surely the next step is to create a new link to the media file that has already been created, and not to create a duplicate copy of the same media file. It seems to me that this check is not being done, but I am sure that it is much more complicated than I have described.
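
The check described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how FH's import actually works: `import_media_links` is a hypothetical function, the input is assumed to be a list of (fact, file path) pairs already extracted from the gedcom, and paths are assumed to be comparable once slashes and letter case are normalised.

```python
def import_media_links(links):
    """Create one media record per distinct file and link each fact to it.

    links: iterable of (fact_name, file_path) pairs from the input file.
    Returns (records, fact_links), where records maps a normalised path
    to its record id and fact_links lists (fact_name, record_id) pairs.
    """
    records = {}
    fact_links = []
    for fact_name, path in links:
        # Assumption: paths that differ only by slash style or case
        # refer to the same file.
        key = path.replace("\\", "/").lower()
        # "Have I processed this media file before?"
        if key not in records:
            # No: create a new record (the file would be copied here, once).
            records[key] = len(records) + 1
        # Either way, link the fact to the single record for this file.
        fact_links.append((fact_name, records[key]))
    return records, fact_links
```

With a lookup like this, a census image linked to 30 facts would yield one record and 30 links rather than 30 records.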

What happens in the future when the user wants to make changes to / replace a media file and they have not merged duplicates or realised that any duplicates exist? There may be an enhanced copy of a family photo that the user wants to use to replace an existing photo that is linked to several people / facts in the tree. Which copy of the media file they amend / replace can have a bearing. If the user forgets or does not realise they have to amend / replace all copies of the media file then the data effectively loses its integrity and cannot be relied upon. You can end up with different versions of what should be the same media file within the media folder.

Over time as the user works on the family tree in FH7, the potential for more and more discrepancies to appear between records that should be identical grows and the risk of further erosion of data integrity within the family tree increases.

For this reason, I do not think that it is a minor issue for anyone who is populating FH7 via the gedcom import process from a different platform such as Roots Magic 7. All I want to do is to import my trees including the media into FH7 without it creating multiple copies of the media files.

I appreciate that Plugins, specifically the functionality to automatically merge duplicate media files (incorporated by Mike) in the Check for Possible Duplicated Media plugin for FH6, can detect and correct duplication issues with the data. However, in this case, it is fixing the symptom and not the root cause of the problem. This means that you have to be aware of the duplicate media file issue and remember to run the Plugin every time you import a gedcom. Also, there is currently no Plugin which automatically merges duplicates available for FH7.

Kind Regards

Ian
LeslieP
Diamond
Posts: 78
Joined: 03 Jan 2021 16:38
Family Historian: V7

Re: Duplicate Media Records

Post by LeslieP »

RM to FH convert just discovering this challenge. I definitely have multiple media records for a single file, when it should be multiple links to the one file. I guess I'm lucky in that the media itself isn't duplicated since I don't store my media in the FH folder.

This seems like a very strange weakness - the GEDCOM import "merges" multiple links to places and sources just fine, but falls down when it encounters multiple references to the same file name. I can't begin to understand why this is done this way.

Will be monitoring this thread for potential solutions. Thanks!
Leslie P
Houston, TX
from TMG to RootsMagic to FH7
publish to web via TNG
cwhermann
Famous
Posts: 155
Joined: 20 Mar 2021 22:04
Family Historian: V7
Location: New Hampshire, US

Re: Duplicate Media Records

Post by cwhermann »

I am an RM user looking at FH. This issue first came to my attention when I completed a GEDCOM transfer with the option to set up media files inside the project folder and was blown away by the duplication of images in the new media file created by FH. I completed a second transfer to maintain links to my original media file, which worked fine to avoid the duplication of the images themselves, but I still experienced the same duplication of records mentioned in earlier posts.

I use shared/family facts only for marriages and births and try to link images to every citation, so even something like an image of a census with a large family creates a huge maintenance issue as Ian stated. This could be a deal breaker for me on whether I make the switch to FH7.

Quite frankly, I was surprised by Jane's and Helen's responses, it is this same attitude by the development/program team at RM that has a significant number of their RM7 users looking at alternatives to RM8. In addition, portability of GEDCOM files is a huge issue as more and more genealogists try to corroborate research or share branches with other genealogists running different programs. I would hope that at some point, software developers would see the ease of GEDCOM file import/export as an opportunity not a "trivial problem" or "waste of support and development time".

Like Leslie, I will be monitoring this thread for potential solutions, but in the meantime I will be taking a closer look at some other options.
Regards,
Curt
Curtis Hermann
FH 7.0.15
ColeValleyGirl
Megastar
Posts: 5465
Joined: 28 Dec 2005 22:02
Family Historian: V7
Location: Cirencester, Gloucestershire

Re: Duplicate Media Records

Post by ColeValleyGirl »

cwhermann wrote (24 Mar 2021 14:11): "Quite frankly, I was surprised by Jane's and Helen's responses, it is this same attitude by the development/program team at RM that has a significant number of their RM7 users looking at alternatives to RM8"
Except of course, Jane and I do not work for Calico Pie. I believe we both agree that this is poor design; but repeatedly logging as a fault something that is working as designed (no matter how poor that design is) is a waste of effort on everybody's part. (I never said reworking the design was a waste of time, and neither did Jane).

Would somebody please raise a Wish List topic (on which people can then vote), as that is the way to bring desired enhancements to Calico Pie's attention with an indication of how many people want them.
Mark1834
Megastar
Posts: 2458
Joined: 27 Oct 2017 19:33
Family Historian: V7
Location: South Cheshire, UK

Re: Duplicate Media Records

Post by Mark1834 »

I’m not taking sides about which app does it “right”, but there are fairly fundamental design issues that get in the way of seamless exchange of data via GEDCOM. As Mike described before this thread got diverted into (another) squabble about how CP handle bug reports, RM exports its media details in GEDCOM in a different style to the one FH prefers. They are both perfectly valid, but different.

Similarly for sources, RM is built on a traditional database foundation, and steers its users strongly towards what FH calls a “lumped” source style (e.g. the 1940 Census is the source, and a typical project will contain multiple different citations to that source). That’s the natural way to do it for a traditional database. However, GEDCOM doesn’t handle that structure very well, so FH users tend to prefer the “split” style, where every citation to a different part of the census is a completely separate source. It’s not very elegant to the database purists, but fits GEDCOM much better.

A great strength of FH is its extendability via plugins (even though only a small minority of users would have the inclination or skills to write them, anybody can use and benefit from them). That means there’s at least a chance of these problems being fixable. With RM and FTM, you are limited to what the original designers choose to support.
Mark Draper
LeslieP
Diamond
Posts: 78
Joined: 03 Jan 2021 16:38
Family Historian: V7

Re: Duplicate Media Records

Post by LeslieP »

I have received word from FH Support that they are indeed working to improve the gedcom import process so that multiple media records don't get created. That's great news, for new folks!

Doesn't help those who have already done their import and have all the duplicated media. Looks like a plugin that does the merges automatically will indeed be needed.

Writing plugins is beyond my ability at this stage in my FH adventure; I'm still at the "identify problems" stage and have not yet advanced to "fix problems".

I hate to be a pest, particularly when the mere fact that plugins CAN be created and data problems like this CAN be solved is in fact one of the huge, major, amazing, incredible selling features of this software. But an enhancement to the duplicate finder that makes it possible to automatically merge the duplicates would really be appreciated!
tatewise
Megastar
Posts: 28341
Joined: 25 May 2010 11:00
Family Historian: V7
Location: Torbay, Devon, UK

Re: Duplicate Media Records

Post by tatewise »

Yes, that is an interesting point.
There is the File > Validate... command that could possibly perform the correction of duplicate media.

Local Media Objects (LMO) that RM and other products use can get added to a Project via File > Merge/Compare File... and other methods and scenarios.
Then the File > Validate... command performs a similar conversion from LMO to Media Records as the GEDCOM import.
I think that currently it also produces duplicate Media Records and files.
Hopefully, that command will be fixed to avoid duplicates as well as the GEDCOM import process.

It would be reasonable to expect File > Validate... to merge existing duplicate Media Records and files as part of its ‘validation’ regime, perhaps as an option just as the LMO to Media Records conversion is now.
Perhaps you could suggest both the above ideas to Calico Pie since you have established a dialogue.