* Import gedcom with photos from online url

Valkrider
Megastar
Posts: 1563
Joined: 04 Jun 2012 19:03
Family Historian: V7
Location: Lincolnshire
Contact:

Post by Valkrider »

I have been sent a gedcom which has images included using the OBJE tag with a reference to an online URL.

I have found that by default FH does not import these images. Is there a way to force it to? I have found a couple of other programs that do go off and grab the images (My Family, for instance).
tatewise
Megastar
Posts: 28341
Joined: 25 May 2010 11:00
Family Historian: V7
Location: Torbay, Devon, UK
Contact:

Re: Import gedcom with photos from online url

Post by tatewise »

No, unfortunately, that is not possible.
I did start writing a Plugin that uses those OBJE URLs to copy the files, but most online sites use the HTTPS secure socket protocol and the libraries in FH only support the HTTP protocol. I asked CP to incorporate the secure sockets library but they were not able to do so. Even if it were implemented, you would need the username and password of the online account to get past the HTTPS sign-in.
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry

Post by Valkrider »

That's a pity @Mike that other progs can handle it but not FH.

Post by tatewise »

Other progs presumably have implemented the HTTPS protocol internally.
FH could do the same internally as it is a widely implemented protocol.
This is what I reported to CP in Oct 2017 with Log #123878:
Investigation of the Lua Socket library reveals that although it supports http it does NOT support the Secure Socket Layer https protocol, which needs the Lua Sec library from https://github.com/brunoos/luasec/wiki.
It involves an ssl folder, ssl.lua & ssl\https.lua so would be invoked using a loadrequire("ssl") command.
Could you please compile that Lua Sec library and add it to the FH repertoire?
There are plenty of details via a Google Search for LuaSec and LuaSec Windows and LuaSec https, etc.
https is needed because most websites now demand it, and I’m trying to automate downloading media from such as ancestry.com and findmypast.com that require https even when no sign-in is necessary. There are several FHUG users interested in this Plugin development to enhance the migration of family trees into FH.
They replied:
We are still looking as we have not yet been able to find a version of Lua Sec compiled under the correct C compiler. Although the one you found compiled for Windows does seem to work, it may cause problems so we have not yet been able to allocate time to review the code.
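
For reference, if a LuaSec build that suits FH were available, a secure download might look roughly like the sketch below. It assumes the standard LuaSec ssl.https module, whose request function mirrors the Lua Socket http interface. The helper name fetchHttps is made up for illustration, and the pcall guard is there because the library is not part of FH.

```lua
-- Sketch only: LuaSec is NOT bundled with FH, so the require is guarded.
-- fetchHttps is a hypothetical helper name.
local function fetchHttps(strURL)
	local isLoaded, https = pcall(require, "ssl.https")
	if not isLoaded then
		return nil, "LuaSec (ssl.https) is not available"
	end
	-- Simple form: returns body, status code, headers, status line;
	-- on failure it returns nil followed by an error message.
	local body, code = https.request(strURL)
	if body == nil then
		return nil, tostring(code)
	end
	if code ~= 200 then
		return nil, "HTTP status " .. tostring(code)
	end
	return body
end
```

The function returns the file contents on success, or nil plus a reason, so a plugin could report why each download failed.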
Jane
Site Admin
Posts: 8508
Joined: 01 Nov 2002 15:00
Family Historian: V7
Location: Somerset, England
Contact:

Post by Jane »

Downloading from https in FH plugins is possible using luacom, but the challenge is generally authorising the session for sites such as Ancestry which need cookies or sessions set to allow the images to be downloaded.
Jane
My Family History : My Photography "Knowledge is knowing that a tomato is a fruit. Wisdom is not putting it in a fruit salad."

Post by tatewise »

Jane, can you provide some clues about how https is supported with luacom?
By "authorising the session" do you mean sign-in username & password?
In most cases, the FH user is also the online user so can they provide the authorisation needed?
ColeValleyGirl
Megastar
Posts: 5465
Joined: 28 Dec 2005 22:02
Family Historian: V7
Location: Cirencester, Gloucestershire
Contact:

Post by ColeValleyGirl »

Clue: winhttp library

Post by Valkrider »

The gedcom I was sent had links to the website of the supplier of the gedcom, so no login or credentials were required, BUT it was an https:// url. I was able to download the images and then re-associate them, but when other packages can handle it, it is surprising to me that native FH can't, when it can do so much more. I would have thought that this would be a fairly common requirement.

Post by tatewise »

Colin, it is a fairly common requirement. Users regularly ask why media does not import from online family trees.
However, it is unusual that no login credentials are needed. Your supplier must have a completely public website.
Perhaps you could report the problem to Calico Pie and see what they say.

Helen, are you referring to luacom.CreateObject("winhttp.winhttprequest.5.1")
Unless it has changed recently, that does not support https protocols.

Post by ColeValleyGirl »


Post by tatewise »

That looks interesting and presumably changed in 2018. When I get time I'll review where I got to 4 years ago.

Post by Valkrider »

tatewise wrote: 05 Mar 2021 13:25 Colin, it is a fairly common requirement. Users regularly ask why media does not import from online family trees.
However, it is unusual that no login credentials are needed. Your supplier must have a completely public website.
Perhaps you could report the problem to Calico Pie and see what they say.
Mike

It wasn't a supplier, it was a friend who sent me the gedcom, and rather than send a huge email with the images embedded he had them stored on his webserver.

I will raise it with CP.

Post by tatewise »

Colin, you introduced the term 'supplier' but I suspected they were friend or family. Their webserver needed no credentials so the gist of what I said is correct.

Post by Valkrider »

This is CP's reply
We will log this as a request for consideration.

In the meantime you could create a plugin to download those images locally as long as they are not behind a fire/pay wall.

Post by tatewise »

Easier said than done!

If luacom.CreateObject("winhttp.winhttprequest.5.1") is the method of using HTTPS then I don't understand how.
The advice at https://docs.microsoft.com/en-us/window ... in-winhttp only discusses the Microsoft interface and I could not relate that to a possible Lua solution.

The Lua Sec library from https://github.com/brunoos/luasec/wiki is intended to work with the Lua Socket library to support HTTPS, but in Oct 2017 CP could not find a suitably compiled version. However, the website was updated in Oct 2019 and has active postings throughout 2020 and 2021, so the library is being used by some.

I don't have the time or resources to dig any deeper.

Post by ColeValleyGirl »

Mike, this describes the COM interface, albeit using JScript as an example.

Post by tatewise »

Thanks Helen, that is more digestible, so I started experimenting, but did not need that authorisation process!
Amazingly the standard winhttp script below downloaded media from Ancestry without username & password.

Code:

require("luacom")	-- COM bridge used to reach the Windows WinHTTP component
local strTarget = fhGetContextInfo("CI_PROJECT_DATA_FOLDER").."\\Media\\Download"	-- path of the file to write
local strURL = "http://.......URL......"
local http = luacom.CreateObject("winhttp.winhttprequest.5.1")
http:Open("GET",strURL,false)	-- false = synchronous request
http:Send()
http:WaitForResponse(30)	-- timeout in seconds
local state = http.Status	-- 200 is good, 400, 404 & others bad
local head = http:GetAllResponseHeaders()	-- Content-Type = file type
local body = http.ResponseBody
local fileHandle, strError = io.open(strTarget,"wb")
if io.type(fileHandle) == "file" then	-- only write if the file opened
	fileHandle:write(body)
	assert(fileHandle:close())
end
Unfortunately, the URL held in the GEDCOM OBJE.FILE tag does not work without some 'adjustment'.
OBJE.FILE URL minus leading http:
//trees.ancestry.com/rd?f=image&guid=c4a31e4e-069f-431d-a08a-5fa71d73a233&tid=15707378&pid=1
Plugin URL minus leading http:
//mediasvc.ancestry.com/v2/image/namespaces/1093/media/c4a31e4e-069f-431d-a08a-5fa71d73a233?client=TreesUI
i.e. the 36-char guid= code maps directly across, but the magic number 1093 needs user assistance.
The user must sign in, open the Gallery, open any image, then use right-click View Image to reveal that number.
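
The mapping described above can be sketched as a small Lua function. This is only an illustration based on the single pair of URLs shown: mapAncestryUrl is a made-up name, and it assumes that every tree URL carries a guid= parameter and that the ?client=TreesUI suffix never changes, neither of which has been confirmed.

```lua
-- Hypothetical helper: rebuild the media service URL from the GEDCOM URL.
-- strNamespace is the user-supplied number (1093 in the example above).
local function mapAncestryUrl(strFileUrl, strNamespace)
	-- A GUID is 8-4-4-4-12 hex digits separated by hyphens.
	local strHex = "%x%x%x%x"
	local strGuidPattern = "guid=(" .. strHex .. strHex .. "%-" .. strHex .. "%-"
		.. strHex .. "%-" .. strHex .. "%-" .. strHex .. strHex .. strHex .. ")"
	local strGuid = strFileUrl:match(strGuidPattern)
	if not strGuid then
		return nil, "No guid= parameter found"
	end
	return "http://mediasvc.ancestry.com/v2/image/namespaces/" .. strNamespace
		.. "/media/" .. strGuid .. "?client=TreesUI"
end

local strPluginUrl = mapAncestryUrl(
	"http://trees.ancestry.com/rd?f=image&guid=c4a31e4e-069f-431d-a08a-5fa71d73a233&tid=15707378&pid=1",
	"1093")
```

With the example OBJE.FILE URL and 1093, strPluginUrl should match the Plugin URL quoted above.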

There is no filename supplied via winhttp but the OBJE.FILE.TITL and OBJE.FILE.FORM could be used.
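
As a rough sketch of that idea, a file name could be assembled from the title and format values like this. The function name makeFileName and the fallback name "Untitled" are assumptions for illustration, not anything FH or the plugin defines.

```lua
-- Hypothetical helper: build a local file name from OBJE.FILE.TITL and OBJE.FILE.FORM.
local function makeFileName(strTitle, strForm)
	-- Replace characters that Windows does not allow in file names.
	local strName = (strTitle or ""):gsub('[\\/:*?"<>|]', "_")
	-- Trim leading and trailing white space.
	strName = strName:gsub("^%s+", ""):gsub("%s+$", "")
	if strName == "" then strName = "Untitled" end
	-- Use the FORM value, lower-cased and stripped of punctuation, as the extension.
	local strExt = (strForm or ""):lower():gsub("[^%w]", "")
	if strExt == "" then return strName end
	return strName .. "." .. strExt
end
```

A fuller version would probably also need to cope with two Media records that share the same title.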

I wonder if FindMyPast will work as well?

Post by tatewise »

I'm not sure if it is some lucky symbiotic process or what, but I've made a breakthrough and got some downloads working.
I have experimented with Ancestry and FindMyPast with some success as far as Media that I've attached to their trees.
It may need some more investigation regarding any Media attached from their own records.

Try the attached prototype Download Online Media plugin Version 0.7 Date 12 Mar 2021.
It should run in both FH V6 and FH V7. Its user interface is somewhat basic.
It should handle Ancestry or FindMyPast Family Tree GEDCOM downloads or any GEDCOM with suitable Media URLs.
There is no need to be signed in to Ancestry or FindMyPast for downloading to proceed.

The Media URL must exist in Media record FILE or _FILE or _URL level 1 tags after import to FH.
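
To illustrate which lines qualify, here is a minimal sketch that scans raw GEDCOM text for those level 1 tags inside level 0 Media (OBJE) records. It works on plain text rather than through the FH API, and the function name findMediaUrls is hypothetical.

```lua
-- Hypothetical helper: collect Media record URLs from GEDCOM text.
local function findMediaUrls(strGedcom)
	local tblUrls = {}
	local isMedia = false
	for strLine in strGedcom:gmatch("[^\r\n]+") do
		local strLevel, strRest = strLine:match("^%s*(%d+)%s+(.*)$")
		if strLevel == "0" then
			-- A level 0 line such as "0 @O1@ OBJE" starts a Media record.
			isMedia = strRest:match("^@[^@]+@%s+OBJE") ~= nil
		elseif strLevel == "1" and isMedia then
			local strTag, strValue = strRest:match("^(%S+)%s+(%S.*)$")
			if (strTag == "FILE" or strTag == "_FILE" or strTag == "_URL")
			and strValue:match("^https?://") then
				table.insert(tblUrls, strValue)
			end
		end
	end
	return tblUrls
end
```

Note that a FILE tag at level 2 under an OBJE inside an Individual record is deliberately not picked up by this sketch, because it only looks at level 1 tags of Media records.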

The download rate is about one or two Media files per second, so use relatively small Family Tree Projects for testing.
I got it downloading both JPG and DOC files.

If any Media records are exact duplicates of each other then they are merged & purged.

The set up process for Ancestry is a little complicated. You must sign in to your Ancestry account, open the Family Tree, open its Media Gallery, then right-click an image file and choose View Image or Open image in new tab depending on the browser.
The resulting address bar URL must be copied into the user interface and is crucial for the downloads to work.
e.g. http: //mediasvc.ancestry.com/v2/image/namespaces/1234/media/7f9a3d4b- .... -3a50c63cdb11?client=TreesUI
Currently, there is no check that a correct format URL is chosen but it is a 'sticky' setting so only needs entering once.
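
A format check of the kind mentioned could be as simple as the sketch below. The pattern is inferred only from the example URL above, so treat it as an assumption; getNamespace is a made-up name.

```lua
-- Hypothetical check for the pasted Ancestry image URL.
-- Returns the namespace number as a string, or nil if the format is not recognised.
local function getNamespace(strUrl)
	return strUrl:match("^https?://mediasvc%.ancestry%.com/v2/image/namespaces/(%d+)/media/")
end
```

If it returns nil the user interface could ask for the URL again rather than storing a setting that can never work.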

Let me know if you get anything working or any useful feedback.
Last edited by tatewise on 01 Feb 2024 16:26, edited 1 time in total.
Reason: Attachment deleted as a better version is attached later.
NickWalker
Megastar
Posts: 2597
Joined: 02 Jan 2004 17:39
Family Historian: V7
Location: Lancashire, UK
Contact:

Post by NickWalker »

Do you think that Ancestry would consider anyone using this to break their terms and conditions such as users must:
Not circumvent, disable or otherwise interfere with features that prevent or restrict use or copying of any content or enforce limitations on use of the Services or the content therein, including by using any self-developed or third-party developed bots, crawlers, spiders, data miners, scraping, or other automatic access tools.
And on Find My Past:
store pages of the Site on a server or other storage device connected to a network or create an electronic database by (i) systematically downloading and storing all or any of the pages of the Site or (ii) by screen scraping, framing, caching, data extraction or programmatic access whether by robot, spider, or otherwise
It may be sensible to be careful and warn users as I'd hate you or anyone else to get in any trouble for this, as unlikely as that probably is.
Nick Walker
Ancestral Sources Developer

https://fhug.org.uk/kb/kb-article/ancestral-sources/

Post by tatewise »

Yes, some sort of warning may be a good idea, although it is not obvious that the plugin circumvents any restrictions. The Ancestry Media Gallery allows images to be downloaded by the user and the plugin just automates that allowed process. Likewise with FindMyPast, where each full media URL is included in the GEDCOM and the download is to the user's private PC, not a server or networked device.

Post by Valkrider »

@Mike

I just tried your plugin. Unfortunately, it doesn't work on the file that caused me to raise this issue in the first place: it will not go to an open website url to grab the images.

Post by tatewise »

Colin, can you give a bit more feedback on what happened?
Presumably, it prompted saying Website Unknown and tried to use the Media URL but failed.
It should have produced a Result Set giving the reason for failure for each file name and URL.

Post by Valkrider »

Mike

Yes website unknown. No Result Set produced.

Post by tatewise »

I'm mystified. Did it do anything?
Did it produce a progress bar and name each of the Download files... at about one per second?

I cannot understand how it can find Media URL, and neither download Ok nor produce a Result Set.
The logic is to read the URL from a Media record, attempt to download from the URL, if a file is downloaded then success, but if not then create a Result Set entry. After a successful download, it should update the Media record Title, etc. Maybe it is that step that went wrong.
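
In outline, that logic looks something like the sketch below. It is not the plugin's actual code: downloadAll is a made-up name, and the download function is passed in as a parameter so the flow can be tried without any network access.

```lua
-- Sketch of the loop. fnDownload is any function that takes a URL and returns
-- the file contents, or nil plus a reason.
local function downloadAll(tblUrls, fnDownload)
	local tblSaved, tblFailed = {}, {}
	for _, strUrl in ipairs(tblUrls) do
		-- pcall traps run-time errors so that an unexpected error is recorded
		-- rather than stopping the loop.
		local isOk, body, strReason = pcall(fnDownload, strUrl)
		if isOk and body then
			table.insert(tblSaved, { Url = strUrl, Size = #body })
		else
			-- When pcall fails, its second result (body) holds the error message.
			local strWhy = isOk and (strReason or "No data returned") or tostring(body)
			table.insert(tblFailed, { Url = strUrl, Reason = strWhy })
		end
	end
	return tblSaved, tblFailed
end
```

Here tblFailed plays the part of the Result Set, with one entry per URL that did not download.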

Could you possibly post a sample Media record GEDCOM with URL, etc, redacted if necessary?

Post by Valkrider »

@Mike

This is an individual from the gedcom

0 @Emma_1834@ INDI
1 NAME Emma Jane /Ashbee/
1 SEX F
1 _SVG_P 5,2,False
1 OBJE
2 FILE https://xxxxxxxxx.com/Images/Ashbee_Emma_Jane_b1834.jpg
3 FORM jpg
1 FAMS @William_Emma@
1 FAMC @William_Ann@

And this is the screenshot of the only dialog that pops up
[Attachment: Screenshot 2021-03-14 at 07.46.58.png]
There are no other warnings or anything the plugin just stops after displaying this dialog.