Determine the Character Encoding of a File

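To test the encoding of a string (usually the result of reading a file), the following function takes the string as a parameter and returns two values: a Boolean that is true if the string is Unicode, and the number of bits, which is 8 for UTF-8 or 16 for UTF-16LE. For an ANSI string it returns false and the number of bits is nil. The checks are heuristic: they look for a byte order mark (BOM) or for typical byte patterns.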

function TestEncoding(sText)
  local bUnicode, iBits
  local strStringEncoding = fhGetStringEncoding() -- remember the current setting
  fhSetStringEncoding("ANSI") -- handle sText as single bytes for the tests below
  if sText:match("^\xEF\xBB\xBF") -- "ï»¿" = UTF-8 BOM
  or sText:match("[\xC2-\xF4][\x80-\xBF]+") then -- UTF-8 multi-byte encoding pattern
    bUnicode = true
    iBits = 8
  elseif fhCallBuiltInFunction('LeftText',sText,2) == "\xFF\xFE" -- "ÿþ" = UTF-16LE BOM
  or fhCallBuiltInFunction('MidText',sText,2,1) == "\0" then -- UTF-16LE: 2nd byte of the first character is 0
    bUnicode = true
    iBits = 16
  else
    bUnicode = false -- ANSI; iBits stays nil
  end
  fhSetStringEncoding(strStringEncoding) -- restore the original setting
  return bUnicode, iBits
end

Code courtesy of Mike Tate.
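
Because the function works on a string rather than on a file, the file contents have to be read first. The following is a minimal sketch of one way to do that, not part of the original code: ReadFileBytes and DescribeEncoding are hypothetical helper names, and the file path in the usage comment is only an example. The file is opened in binary mode so that the bytes reach TestEncoding unchanged.

```lua
-- Hypothetical helper (not in the original code): read a whole file as raw bytes.
-- Returns the contents, or nil plus an error message if the file cannot be opened.
function ReadFileBytes(sPath)
  local fFile, sError = io.open(sPath, "rb") -- binary mode avoids newline translation
  if not fFile then
    return nil, sError
  end
  local sText = fFile:read("*a") -- read everything
  fFile:close()
  return sText
end

-- Hypothetical helper: turn the two results of TestEncoding into a readable label.
function DescribeEncoding(bUnicode, iBits)
  if not bUnicode then
    return "ANSI"
  elseif iBits == 8 then
    return "UTF-8"
  else
    return "UTF-16LE"
  end
end

-- Usage inside a plugin, where TestEncoding is defined as above:
--   local sText = ReadFileBytes("C:\\Temp\\example.txt") -- example path only
--   if sText then
--     print(DescribeEncoding(TestEncoding(sText)))
--   end
```

Both return values of TestEncoding are passed straight through to DescribeEncoding because the call is the last argument in the list.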