CodePage for ANSEL?

Moderator: MOD_nyhetsgrupper

Phil Wright

CodePage for ANSEL?

Posted by Phil Wright » 23 February 2005, 11:11

Does anyone know the correct code page for handling the ANSEL character
set as defined for use in GEDCOM files? Is it the same as the ANSI
code page 1252 (Latin 1)?

I am writing a utility that loads GEDCOM files, and I can handle the
UTF-8, Unicode and ASCII character sets without problems, but I cannot
find any information about an ANSEL code page for Windows!

Thanks,
Phil.

john

Re: CodePage for ANSEL?

Posted by john » 23 February 2005, 11:51

See http://www.vjet.demon.co.uk/ftree/gedco ... ter_3.html for
some information.

Guest

Re: CodePage for ANSEL?

Posted by Guest » 23 February 2005, 17:35

Does anyone know the correct code page for handling the ANSEL character
set as defined for use in GEDCOM files? Is it the same as the ANSI
code page 1252 (Latin 1)?

No. ANSEL uses combining accents, i.e. â is stored as two bytes, a
circumflex modifier followed by the base letter 'a', whereas 8859-1
(Latin-1) stores each accented letter it covers as a single byte.
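
To make the difference concrete, here is a small Python sketch. It assumes the circumflex modifier is byte 0xE3, which is how I read the ANSEL table, so check that value against the GEDCOM spec before relying on it. It only shows why decoding ANSEL bytes with a Latin-1 style code page gives the wrong text.

```python
# Latin-1 stores a-circumflex as one precomposed byte.
latin1_bytes = "\u00e2".encode("latin-1")

# ANSEL stores it as two bytes: the modifier first, then the base letter.
# ASSUMPTION: 0xE3 is the ANSEL circumflex modifier (verify in the spec).
ansel_bytes = bytes([0xE3, ord("a")])

# Decoding the ANSEL bytes with Windows code page 1252 does not fail,
# it just silently yields two unrelated characters.
misread = ansel_bytes.decode("cp1252")

print(latin1_bytes, ansel_bytes, repr(misread))
```

So no Windows code page will do the job; the modifier bytes have to be recognised and attached to the letter that follows them.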

I am writing a utility that loads GEDCOM files, and I can handle the
UTF-8, Unicode and ASCII character sets without problems, but I cannot
find any information about an ANSEL code page for Windows!

All of the information you need to write your own converter is in the
tables in the GEDCOM spec. You will not find much else to help you on
the web.
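
A table-driven converter along those lines might look roughly like the Python sketch below. The function name ansel_to_unicode and both table names are made up for illustration, and the tables hold only a handful of entries as I remember them from the ANSEL charts; a real converter should copy every entry from the spec. Keeping stacked modifiers in their original order is also an assumption.

```python
import unicodedata

# PARTIAL tables, for illustration only (hypothetical names; values are
# assumptions to be checked against the GEDCOM spec's ANSEL tables).
ANSEL_COMBINING = {
    0xE1: "\u0300",  # grave
    0xE2: "\u0301",  # acute
    0xE3: "\u0302",  # circumflex
    0xE4: "\u0303",  # tilde
    0xE8: "\u0308",  # diaeresis (umlaut)
    0xEA: "\u030A",  # ring above
    0xF0: "\u0327",  # cedilla
}
ANSEL_SPACING = {
    0xA2: "\u00D8",  # O with stroke, upper case
    0xA5: "\u00C6",  # AE ligature, upper case
    0xB2: "\u00F8",  # o with stroke, lower case
    0xB5: "\u00E6",  # ae ligature, lower case
}


def ansel_to_unicode(data: bytes) -> str:
    """Convert ANSEL bytes to a Unicode string (sketch, partial tables)."""
    out = []
    pending = []  # modifiers seen so far, waiting for their base letter
    for byte in data:
        if byte in ANSEL_COMBINING:
            pending.append(ANSEL_COMBINING[byte])
            continue
        if byte < 0x80:
            char = chr(byte)  # plain ASCII passes through unchanged
        else:
            # Bytes missing from the partial table become U+FFFD.
            char = ANSEL_SPACING.get(byte, "\uFFFD")
        out.append(char)
        # Unicode wants combining marks AFTER the base letter.
        out.extend(pending)
        pending.clear()
    out.extend(pending)  # keep any trailing modifiers rather than drop them
    # NFC folds base + mark into a precomposed letter where one exists.
    return unicodedata.normalize("NFC", "".join(out))


print(ansel_to_unicode(b"Bj\xe8orn"))
```

Going through Unicode rather than straight to 8859-1 avoids losing the letters that 8859-1 cannot represent.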

The gedcom-parse library (google for it) has its own ANSEL plugin for
the iconv library, which you may be able to use depending on platform
and license issues. There is also a patch for GNU recode floating
around, but unfortunately it is not up to date with the latest versions
of recode itself. As I recall one or other of these only understands
the subset of ANSEL that can be converted to 8859-1. Maybe someone
else knows of something for Windows.

Good luck.

--Phil.

mickg

Re: CodePage for ANSEL?

Posted by mickg » 23 February 2005, 20:11

For clarification of what ANSEL does (or is) see:

http://www.niso.org/standards/resources ... 1993(R2002).pdf

Return to «soc.genealogy.computing»