Utility ~ Census CSV to GEDCOM Conversion
Description
A short program for converting census data from a CSV file (as can be produced by EXCEL) into a GEDCOM file. The Zip file contains the program itself, and an Adobe PDF document describing what it does and the requirements it has for the order of data, as reproduced below.
Produced by peterjward (13/03/10)
Introduction
You may have England and Wales census information transcribed into a spreadsheet such as Excel. The rows usually represent individuals and the columns data items about them. With most database and spreadsheet programs it should be possible to export this data into a CSV file. This is a text based file format where the data is arranged with each item of text information (from a spreadsheet row) separated from the next by a comma, and the end of each row terminated with a carriage return/linefeed. It is important however that your data does not itself contain any commas anywhere as this can disrupt the file produced – if necessary replace with semicolons(;), colons(:) or periods (.). Neither should it contain double quotes (“) as these are used by this program during processing for delineating strings.
The program (written in Visual Basic) will input this information and produce a GEDCOM file (of the same name as your CSV file) that can be opened and read by any genealogical program (such as Family Historian) with such an ability. This information is concerned with individuals only, and does not connect them into families – this you will have to do yourself. It does recognise married or widowed women as such and adds an asterisk (*) to the end of their surname to indicate that it is not a maiden name. This is easily removed later if you do not want it using a word processor that can deal with text files (such as Wordpad, Notepad or Word). It does not do anything to distinguish married men.
Details
It creates a record for each individual with name and sex. It will calculate an approximate birth year or date from the given age: It recognises ages in days, weeks and months (e.g. 7m, 12d, 5w), weeks being converted to months; anything else is interpreted as years. It will add a place of birth if this is present, using just the county of the census if the year is 1841. A census event is generated with appropriate date, location and address (if known), plus age: The relation to the head of the household is also added as a note to the event (except 1841 of course). The marital status is not recorded in any way except for wives and widows as indicated above. Towns and counties are concatenated for places of birth and census (e.g. “Guildford” and “Surrey” will become “Guildford, Surrey”).
A source reference is generated from the piece/folio/page information so that it represents one page of a census return. This reference is appended to the census event and to the individual record as a whole.
It requires all the input to be formatted as text (even dates and numbers), and in the correct order, which differs somewhat for the earliest (1841) and latest (1911) censuses compared with the rest. You should also ensure that your data is sorted by piece, folio, page and schedule before creating your CSV file as this will ensure that you get an orderly set of source names. It can deal with a mixture of pieces for any one year, but can only manage one year at a time.
A certain amount of error trapping has been done and missing items of data (blank spaces) should only mean corresponding blanks in the resulting GEDCOM file. I have tested this fairly thoroughly in Windows XP and VISTA without any problems but have not tried it with other operating systems. I originally designed it for my own use and I can (and will) give no warranty. However it will not delete your original files, merely produce an additional new one (but will overwrite an existing GEDCOM file of the same name). If it should crash the reason will be that your CSV file has not been read properly – you do not have the correct number of data items for any individual or else there is probably a stray comma (,) or double quote (“) which leads to the program misreading the items of data; you will then almost certainly get an error message. And if you do not have the data items in the right order then the GEDCOM produced will be garbage.
The program should be placed in the same folder as your CSV file. The GEDCOM file produced will appear in the same folder. You need to tell it the year in question and the name of your CSV file (case insensitive, and leave off the .csv file suffix). Navigation round the three ‘controls’ on the form is by mouse or tab key. The year selection is made by using the direction keys, filename entry by typing or copy and paste. When GO is selected it is activated by pressing the space bar or left clicking the mouse. The program gives no sign that it has finished processing – so just check the folder for the new Gedcom file. A fair-sized file (3000 individuals) is processed in less than a second. You may process any number of CSV files at any one time by simply entering the new filenames and years. The program is closed like any window by clicking on the small X box in the top right hand corner.
The data order required is as follows:
1841 Census
Census county | The county of the Census ( NOT where born) |
Census place | The town/village/hamlet |
Address | Often not present |
Piece number | The number immediately after HO107 |
Book number | Larger pieces may have several books |
Folio number | At top of alternate pages |
Page number | Found at the top of each page |
Forename | Obvious – no separate column for middle names/initials |
Surname | Also obvious |
Sex | Simple M or F |
Age | Days/weeks/months recognised and treated appropriately |
County Born | Simple Y or N – no notice taken of I(reland) etc |
1851 to 1901 Census
Census county | The county of the Census ( NOT where born) |
Census place | The town/village/hamlet |
Address | House / street/ etc, |
Piece No. | The number immediately after the RG9, RG10 etc |
Folio No. | The number following the piece no. |
Page No. | The number following the folio no. |
Schedule No. | The number identifying the household |
Forename(s) | Include middle names or initials |
Surname | Will accept double-barrelled names |
Relation | Head / Wife / Son / Servant / Boarder / Lodger etc. |
Status | Married / Widowed / Single (or unmarried) |
Sex | Simple M or F |
Age | Days/weeks/months recognised – anything else means years |
Birthplace | Town etc |
Birth County | May be foreign country |
1911 Census
Census county | The county of the Census ( NOT where born) |
Census place | The town/village/hamlet |
Address | House / street/ etc, |
Piece | Comes immediately after RG14 in census reference |
Registration Dist | Number immediately after RD in Census reference |
Subdistrict | Number immediately after SD in Census reference |
Enumeration District | Number immediately after ED in Census reference |
Schedule | Number immediately after SN in Census reference |
Forename(s) | Include middle names |
Surname | Will accept double-barrelled names |
Relation | Head / Wife / Son / Servant / Boarder / Lodger etc. |
Marital Status | Married / Widowed / Single (or unmarried) |
Sex | Simply M or F |
Age | Days/weeks/months recognised |
Birthplace | Town of birth |
Birth county | County of birth – sometimes foreign country |
Screenshot
Installation Instructions
Click on the link below to download and save the csvtogedcomcensusimport.zip file.
This Zip file contains the program and the above PDF information. Simply extract the two files to a folder.
Rearrange the data into the order required for the census year in question as above. Make sure you have removed or converted commas and double quotes and then save as a CSV (Comma Separated Values) file, and remember the file name. Move this into the folder with the program and run the program by simply double clicking on it (you do not have to have Visual Basic installed on your computer).
The Census CSV to GED.exe file may need right-click > Properties > Unblock to allow it to work correctly.
If you receive an error message saying MSVBVM50.DLL is missing, then download the Visual Basic 5.0 Run-Time Files from http://support.microsoft.com/kb/180071 and install as a Windows Administrator.