* Questionable name capitalisation

jelv
Megastar
Posts: 645
Joined: 03 Feb 2020 22:57
Family Historian: V7
Location: Mere, Wiltshire

Questionable name capitalisation

Post by jelv »

Having (1) been given a GEDCOM with some data I wish to merge into my tree, where the capitalisation is best technically described as "a dog's breakfast", and (2) wanting to set about learning plugin coding, I've created my own plugin to report on questionable name capitalisation.

Taking things slowly, I've not attempted to do any correction via the plugin, it just reports all names where it thinks there is an issue. In addition to identifying surnames in all capitals it addresses limitations of the Clean up Surname Capitalisation plugin.

What it does:
  1. Checks for surnames in all capitals or incorrectly capitalised (e.g. where the shift key was held just too long - SMith). It uses the same logic as the existing plugin for names beginning Mac etc.
  2. Given names are checked (with no special prefixes)
  3. Given and surnames in all lower case are reported
  4. Corrects an issue with the name MACE which the existing plugin wants to change to MacE
  5. Reports empty names (where the text has been removed from an additional name via the All tab without removing the field)
  6. Reports strange name structures where the /.../ defined surname is in the middle of the name (e.g. Ernest /Moore/ Jr where the name suffix should have been used). It doesn't attempt to unscramble these
The report shows how the name has been entered and how it thinks the given and surnames should be capitalised.
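
The capitalisation checks in points 1 to 4 could be sketched roughly as below. This is written in Python purely for illustration (the plugin itself is Lua and its code is not shown in this topic), and the function names, issue labels and the `SURNAME_RULES` list are assumptions, not taken from the plugin.

```python
# Hypothetical prefix rules; the plugin's actual list is not shown in this thread.
SURNAME_RULES = ("Mac", "Mc")

def expected_form(part, rules=()):
    """Return the capitalisation this sketch expects for one name part."""
    for prefix in rules:
        rest = part[len(prefix):]
        if rest and part.lower().startswith(prefix.lower()):
            # Prefix as listed, then capital letter, then lower case.
            return prefix + rest[:1].upper() + rest[1:].lower()
    # Default: initial capital, remainder lower case.
    return part[:1].upper() + part[1:].lower()

def check_part(part, rules=()):
    """Classify one name part; returns an issue label or None."""
    letters = [c for c in part if c.isalpha()]
    if not letters:
        return "empty"
    if len(letters) > 1 and part.isupper():
        return "all capitals"
    if part.islower():
        return "all lower case"
    if part != expected_form(part, rules):
        return "questionable capitalisation"
    return None
```

Passing no rules mimics the given-name check in point 2, so a given name such as McKenzie would be reported. With the Mac rule alone, `expected_form("MACE", SURNAME_RULES)` gives MacE, which is the problem described in point 4.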

The message at the end gives the meaning of the issue codes:
The attachment Issues found message.png is no longer available
I've been reading the topic Surname prefix (SPFX) -- more generally, handling structured names. and wondering how what I've done might handle those issues.

My efforts so far are attached. If anyone would like to try it out I'd like some feedback, including on my coding as I'm just starting to learn Lua.

Sample report:
Issues found message.png
Attachments
Sample outout.png
John Elvin
tatewise
Megastar
Posts: 28488
Joined: 25 May 2010 11:00
Family Historian: V7
Location: Torbay, Devon, UK

Re: Clean up Surname Capitalisation

Post by tatewise »

IMO your point 6. is perfectly OK where users want the suffix to appear in the Focus Window and elsewhere without needing to customise the Name display to include the Name Suffix field. (The Focus Window name cannot be customised.)
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry
johnmorrisoniom
Megastar
Posts: 905
Joined: 18 Dec 2008 07:40
Family Historian: V7
Location: Isle of Man

Re: Clean up Surname Capitalisation

Post by johnmorrisoniom »

Excellent plugin. Found quite a few typos in my data.
Incorrectly reports when a Mc/Mac surname is a Christian name.
Also reports Machin as incorrect.

Re: Clean up Surname Capitalisation

Post by jelv »

tatewise wrote: 30 Oct 2022 10:17 IMO your point 6. is perfectly OK where users want the suffix to appear in the Focus Window and elsewhere without needing to customise the Name display to include the Name Suffix field. (The Focus Window name cannot be customised.)
How safe would it be to assume that, if there is text both sides of the indicated surname, what is before is the given name and what is after is a suffix? Can you think of any reason why you would have a prefix and given name after?

(I do recognise names in the order /surname/ given)
Last edited by jelv on 30 Oct 2022 11:54, edited 2 times in total.
John Elvin

Re: Clean up Surname Capitalisation

Post by jelv »

johnmorrisoniom wrote: 30 Oct 2022 11:04 Excellent plugin, Found quite a few typos in my data.
Incorrectly reports when a Mc/Mac surname is a christian name
Could you post the full name as it appears in the entered name column of the results please (or a made up name that illustrates the issue).
johnmorrisoniom wrote: 30 Oct 2022 11:04 Also reports Machin as incorrect
A good reason my first attempts didn't go anywhere near changing data and just reported only!

As it's just a report, if it's only one or two individuals it's not an issue. But if that's your family name it would be annoying to have loads of false reports. I'm thinking my next adventure in learning Lua programming will be to add an interface with an option to maintain a list of exceptions such as Machin.

Edit: Would the exceptions list be surnames only or are there any given names which would not be first letter capital, remainder lower case?
Last edited by jelv on 30 Oct 2022 11:58, edited 1 time in total.
John Elvin
David2416
Superstar
Posts: 399
Joined: 12 Nov 2017 16:37
Family Historian: V7
Location: Suffolk UK

Re: Clean up Surname Capitalisation

Post by David2416 »

Yes indeed, excellent Plugin. Found quite a few typos and so forth. Very useful as I have cleaned up a dozen or so.
Thank you

Also picked up where I have put notes after the // to distinguish people of the same name, which is not a problem but quite useful.

Re: Clean up Surname Capitalisation

Post by johnmorrisoniom »

Examples of instances where Christian Name can also be a surname
Charlotte McKenzie Cadzow
John McIntyre Oliver
George McKerrow Brash
Bernard Henry McNee Brash
Malcolm McFarlane Connelly
Sydney Annie McKinley Cowell
Elizabeth McIntosh Holland Bell

All of these produce a G (Given) Name error

Mackie
Macara
Mackwell
Maciulaitis
Macan
Machin
Macklin

Produce an Incorrect Surname error

Re: Clean up Surname Capitalisation

Post by jelv »

johnmorrisoniom wrote: 30 Oct 2022 12:17 Examples of instances where Christian Name can also be a surname
Charlotte McKenzie Cadzow
John McIntyre Oliver
George McKerrow Brash
Bernard Henry McNee Brash
Malcolm McFarlane Connelly
Sydney Annie McKinley Cowell
Elizabeth McIntosh Holland Bell

All of these produce a G (Given) Name error
I assume these are shown as
Charlotte McKenzie /Cadzow/
John McIntyre /Oliver/
George McKerrow /Brash/
etc.
I presume these are where children are given the mother's surname as a given name. I should have thought of this as my wife's father and many of his ancestors have used that custom! I'll change the logic to use the exceptions on the given name as well.
johnmorrisoniom wrote: 30 Oct 2022 12:17 Mackie
Macara
Mackwell
Maciulaitis
Macan
Machin
Macklin

Produce an Incorrect Surname error
Definitely need the exceptions list I suggested in my last post and it answers the question I posed in the edit!
John Elvin

Re: Clean up Surname Capitalisation

Post by tatewise »

jelv wrote: 30 Oct 2022 11:40
tatewise wrote: 30 Oct 2022 10:17 IMO your point 6. is perfectly OK where users want the suffix to appear in the Focus Window and elsewhere without needing to customise the Name display to include the Name Suffix field. (The Focus Window name cannot be customised.)
How safe would it be to assume that, if there is text both sides of the indicated surname, what is before is the given name and what is after is a suffix? Can you think of any reason why you would have a prefix and given name after?

(I do recognise names in the order /surname/ given)
I suspect if you applied the rules to given /surname/ given so that the names before and after the Surname are both treated as Given names then you won't be far wrong. Most suffixes, such as Dr, Rev, Junior, Senior, etc, follow the same rules as typical Given names, i.e. start with a capital letter and then lower case.
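
As a rough illustration of that suggestion, the sketch below splits a name at the slashes and applies the same "initial capital, then lower case" test to the text on both sides of the surname. It is Python rather than the plugin's Lua, and `split_name` and `given_style_ok` are hypothetical names used only for this example.

```python
def split_name(full_name):
    """Split 'given /surname/ trailing' into its three pieces."""
    before, sep, rest = full_name.partition("/")
    if not sep:
        # No slashes at all: treat the whole text as given names.
        return full_name.strip(), "", ""
    surname, _, after = rest.partition("/")
    return before.strip(), surname.strip(), after.strip()

def given_style_ok(text):
    """True if every word is an initial capital followed by lower case."""
    return all(w == w[:1].upper() + w[1:].lower() for w in text.split())

before, surname, after = split_name("Ernest /Moore/ Jr")
# Both the leading and the trailing text get the same given-name check.
both_ok = given_style_ok(before) and given_style_ok(after)
```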
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry

Name capitalisation and parts check

Post by jelv »

It's been a long time coming, but I have a new plugin which allows the user to customise the rules used to define how non-standard capitalisation is applied. The other main additional feature of the new plugin is that it compares the name parts with the full name; this is principally for users who have imported their data from a source where the name parts are used and may have made subsequent changes to full names in FH (FH does not update the name pieces when the full name is changed). The name of the plugin has changed to reflect the additional functionality.
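
To give an idea of what comparing the name parts with the full name involves, here is a small Python sketch (not the plugin's Lua code). The keys `given`, `surname` and `suffix` are hypothetical stand-ins for the stored name pieces, and the comparison is deliberately simple.

```python
def name_part_mismatches(full_name, parts):
    """Compare stored name pieces with the pieces implied by the full name.

    `parts` maps the hypothetical keys 'given', 'surname' and 'suffix' to the
    stored values; keys that are absent are not checked.
    Returns {key: (stored, derived)} for every piece that differs.
    """
    before, _, rest = full_name.partition("/")
    surname, _, after = rest.partition("/")
    derived = {
        "given": before.strip(),
        "surname": surname.strip(),
        "suffix": after.strip(),
    }
    return {
        key: (value, derived[key])
        for key, value in parts.items()
        if key in derived and value != derived[key]
    }
```

For example, a full name of `John /Elvin/` with a stored given-name piece of `Jon` would be returned as a mismatch on `given`.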
Name checking - Rules and Exceptions.png
 
There is one significant limitation: currently it may give error messages or fail to find issues if there are names which use non-Latin characters. However, as it only produces a report, it is still safe to use the plugin. I am currently learning how to use UTF-8 so these should be handled correctly in a later version (there are a lot of string functions used within the plugin so this will take some time).

This is the main configuration screen:
The attachment Name checking - configuration.png is no longer available
 
The rules and exceptions define the names which are not initial letter capital, remaining letters lower case. Selecting any of the options opens this dialog:
Name checking - configuration.png
 
To add entries enter the rule/exception in the box at the top right and click Add.

To change an entry select it from the list on the left which will copy it to the box at the top right. Make the required change and click Update.

To remove an entry select it from the list on the left and click Delete.

Given & Surname Rules

The rules define the beginnings of names (in the correct capitalisation) where the rest of the name begins with an upper case letter and the remainder lower case. Examples:
  • Mac    MacLeod, MacDonald etc.
  • O'     O’Gara
  • d'     d’Orsay
Prefix, Given Name, Surname Prefix and Surname Exceptions

The exceptions are specific name parts which do not follow the default capitalisation or are incorrectly capitalised by one of the rules. For example, the Mac rule would expect MacHin or MacE, so Machin and Mace could be added to the Exceptions list to correct this. Note that if you have names where the same spelling can be capitalised differently (e.g. Macdonald or MacDonald), both versions should be added to the Exceptions list.
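
One way the interaction between rules and exceptions could work is sketched below in Python (the plugin is Lua, and this is not its code). The assumption here is that an exception is matched ignoring case and takes priority over any rule; `RULES`, `EXCEPTIONS`, `accepted_forms` and `is_questionable` are illustrative names only.

```python
RULES = ("Mac", "O'", "d'")                                 # examples from the rules list
EXCEPTIONS = {"Machin", "Mace", "Macdonald", "MacDonald"}   # examples from this post

def accepted_forms(part, rules=RULES, exceptions=EXCEPTIONS):
    """Return the set of spellings this sketch would accept for `part`."""
    # Exceptions win: any listed spelling of the same letters is acceptable.
    matches = {e for e in exceptions if e.lower() == part.lower()}
    if matches:
        return matches
    # Otherwise a rule prefix forces a capital on the rest of the name.
    for prefix in rules:
        rest = part[len(prefix):]
        if rest and part.lower().startswith(prefix.lower()):
            return {prefix + rest[:1].upper() + rest[1:].lower()}
    # Default: initial capital, remainder lower case.
    return {part[:1].upper() + part[1:].lower()}

def is_questionable(part, rules=RULES, exceptions=EXCEPTIONS):
    """True if `part` is not one of the accepted spellings."""
    return part not in accepted_forms(part, rules, exceptions)
```

With these lists Machin is accepted and MacHin is reported, while Macdonald and MacDonald are both accepted because both variants are listed.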

The Prefix Exceptions list can be used for abbreviated titles which are irregularly capitalised, for example Staff Sergeant is abbreviated as SSgt.

The Surname Prefixes are a special case. They will act as described above, but can also be used to automatically find surname prefixes which are at the beginning of the surname or the end of the given name. When either or both of these checks are turned on in the options, it will identify all the surname prefixes in the exceptions list, so all the prefixes in the project should be added to the list. Note that if the capitalisation varies, all variants will need to be in the list (e.g. Vincent Van Gogh and Simon “Piet” van der Valk would require both Van and van to be in the list).
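
The two surname prefix checks could be pictured along these lines. This is a Python sketch under the assumption that matching is case-sensitive and on whole words, which is why both Van and van appear in the list; `find_surname_prefix` is a hypothetical name and the plugin's own Lua logic may differ.

```python
SURNAME_PREFIXES = ["Van", "van"]  # both variants, as in the example above

def find_surname_prefix(given, surname, prefixes=SURNAME_PREFIXES):
    """Return (location, prefix) if a listed prefix starts the surname
    or ends the given name, otherwise None."""
    # Try longer prefixes first so a multi-word entry beats a shorter one.
    for prefix in sorted(prefixes, key=len, reverse=True):
        if surname == prefix or surname.startswith(prefix + " "):
            return ("surname", prefix)
        if given == prefix or given.endswith(" " + prefix):
            return ("given", prefix)
    return None
```

For instance, a surname entered as `Van Gogh` is found in the surname, and a given name entered as `Vincent Van` with surname `Gogh` is found in the given name.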

Options

The options dialog allows enabling/disabling some additional checks:
  • Ancestry Synchronisation plugin compatibility. This checks for issues which would cause incorrect results when using Mark Draper's plugin.
  • Given name used must be part of given name.
  • Look for surname prefixes in surname. See notes above.
  • Look for surname prefixes in given name. See notes above.
Further options to enable/disable other checks may be added as a result of user feedback.


Finally, I'd like to acknowledge the testing and advice given to me by Mark Draper as I learnt more about Lua programming and specifically the idiosyncrasies of IUP dialogs.
John Elvin
ColeValleyGirl
Megastar
Posts: 5520
Joined: 28 Dec 2005 22:02
Family Historian: V7
Location: Cirencester, Gloucestershire

Re: Questionable name capitalisation

Post by ColeValleyGirl »

It would be worth starting a new topic, seeking testers for this new plugin.
johnhanson
Diamond
Posts: 80
Joined: 27 Nov 2002 16:50
Family Historian: V7

Re: Questionable name capitalisation

Post by johnhanson »

I just ran the query and got an error

Just realised that I posted it as a reply to the knowledgebase article rather than here

Ran for about three seconds on my database of 41000 names and got the following error message

Attachments
image.png
John Hanson FSG
Researcher, the Halsted Trust

Re: Questionable name capitalisation

Post by tatewise »

I discovered that fault arises under the following (unusual) conditions:
Name (INDI.NAME): Ian Steve /Munro/ Junior
Name Suffix (INDI.NAME.NSFX): Munro/ Junior
i.e. the Name Suffix field holds the trailing section of the Name field, including a slash /.
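
A guard against that condition might look like the Python sketch below, which simply lists any name piece whose value contains a slash before further parsing is attempted. It is an illustration only, not the plugin's Lua code; `pieces_with_slash` is a hypothetical helper, and the dictionary keys are just the GEDCOM tags used as labels.

```python
def pieces_with_slash(pieces):
    """Return the tags of any name pieces whose value contains a '/'."""
    return [tag for tag, value in pieces.items() if "/" in value]

# The (unusual) data described above.
problem = pieces_with_slash({"GIVN": "Ian Steve", "NSFX": "Munro/ Junior"})
```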
Mike Tate ~ researching the Tate and Scott family history ~ tatewise ancestry

Re: Questionable name capitalisation

Post by jelv »

Thanks Mike, I've taken note of this possibility. I'll wait for confirmation (see other topic) that this is his issue before finalising an updated version.
John Elvin

Re: Questionable name capitalisation

Post by jelv »

Could all queries about this plugin be made on the new topic reflecting the changed plugin name please.

Name capitalisation and parts check
John Elvin