Discussion:
[OS X TeX] TeXShop 2.14 does not display saved Chinese characters correctly
Jung-Tsung Shen
2007-12-05 21:00:52 UTC
I am using TeXShop 2.14 on OS X 10.4.11.
A document containing Chinese characters now compiles and displays
correctly with the CJK package (see previous thread). However, when the
file is saved and reopened, the characters turn into random codes
rather than the characters originally typed. This happens regardless of
which encoding (UTF-8, Chinese ...) is used to save.

The previous version of TeXShop does not seem to have this problem. I
remember testing this earlier with XeLaTeX, following the discussions on
this forum, and saved documents containing Chinese characters showed up
fine when reopened.

I suspect this is also true for other languages such as Japanese or
Korean, but I haven't tested.

JT
Herbert Schulz
2007-12-05 21:12:53 UTC
Post by Jung-Tsung Shen
I am using TeXShop (2.14) on OS 10.4.11.
The document containing Chinese characters can be compiled and
displayed correctly using CJK package now (see previous thread).
However, when the file is saved, and reopened, the characters become
random codes (or, codes), rather than the original typed characters.
This is true regardless which method (UTF-8, Chinese ...) is used to
save.
The previous version of TeXShop seems to not have this problem. I
remember I tested this before following the discussions on this
forum using xeLaTeX, and the saved documents containing Chinese
characters show up fine when reopened.
I suspect this is also true for other language such as Japanese or
Korean, but I haven't tested.
JT
Howdy,

Do you have a line near the top of the file

%%!TEX encoding = UTF-8 Unicode

(assuming you are trying to save in utf8)? If you don't have that line,
the file was possibly saved in some other encoding. What default
encoding are you using? (Go to TeXShop->Preferences->Document tab and
check the Encoding drop-down menu.)

You should also have the corresponding command to tell LaTeX how the
file is encoded when it is compiled:

\usepackage[utf8]{inputenc}

if you are using utf8 encoding.

Also, changing it after the mess-up won't work! Sorry! Not only that,
I believe that once it's saved under a particular encoding adding the
line won't change it; you've got to do a Save As... and overwrite the
old file. Maybe I'm wrong about that?

Good Luck,

Herb Schulz
(***@wideopenwest.com)
Maarten Sneep
2007-12-05 21:27:10 UTC
Post by Jung-Tsung Shen
I am using TeXShop (2.14) on OS 10.4.11.
The document containing Chinese characters can be compiled and
displayed correctly using CJK package now (see previous thread).
However, when the file is saved, and reopened, the characters become
random codes (or, codes), rather than the original typed characters.
This is true regardless which method (UTF-8, Chinese ...) is used to
save.
The previous version of TeXShop seems to not have this problem. I
remember I tested this before following the discussions on this
forum using xeLaTeX, and the saved documents containing Chinese
characters show up fine when reopened.
I suspect this is also true for other language such as Japanese or
Korean, but I haven't tested.
I get the same.

- setting the default encoding to UTF-8 does not help.
- including the %!TEX encoding = UTF-8 statement does not make a
difference.
- Opening the file from the file menu and explicitly selecting UTF-8
for its encoding does not help.
- The file displays correctly in BBEdit, and is recognised there as
UTF-8.

I believe we have a bug in TeXShop.

What I see is that non-ASCII characters are displayed as multiple
characters, from the 128-255 value range in whatever encoding TeXShop
finally decides to use.
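
The symptom described above is what one would expect when UTF-8 bytes
are decoded with a single-byte encoding. A minimal Python sketch,
assuming Latin-1 as a stand-in for whatever 8-bit encoding TeXShop
actually ends up using (that choice is an assumption, not something
confirmed in this thread):

```python
# Sketch: UTF-8 bytes misread as an 8-bit encoding.
# Latin-1 is an assumed stand-in for the editor's real fallback encoding.
text = "中文"
raw = text.encode("utf-8")        # three bytes per character, all >= 0x80

garbled = raw.decode("latin-1")   # every byte becomes one character
print(len(text), len(garbled))

# The bytes themselves are untouched, so decoding them as UTF-8
# recovers the original text.
recovered = garbled.encode("latin-1").decode("utf-8")
print(recovered == text)
```

If this is what is happening, the file on disk should still be valid
UTF-8 as long as the garbled view is not saved over it, which would be
consistent with BBEdit showing it correctly.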

I'm on Leopard, so the OS did not introduce this change, afaics.
Richard, under which OS was 2.14 built?

I'll file a bug report.

Maarten
Maarten Sneep
2007-12-05 21:33:24 UTC
Post by Maarten Sneep
I'll file a bug report.
No I won't, I can't log into sourceforge. Herb or Richard, can you try
to reproduce this bug and file a report?

Maarten
Herbert Schulz
2007-12-05 22:06:29 UTC
Post by Maarten Sneep
Post by Jung-Tsung Shen
I am using TeXShop (2.14) on OS 10.4.11.
The document containing Chinese characters can be compiled and
displayed correctly using CJK package now (see previous thread).
However, when the file is saved, and reopened, the characters
become random codes (or, codes), rather than the original typed
characters. This is true regardless which method (UTF-8,
Chinese ...) is used to save.
The previous version of TeXShop seems to not have this problem. I
remember I tested this before following the discussions on this
forum using xeLaTeX, and the saved documents containing Chinese
characters show up fine when reopened.
I suspect this is also true for other language such as Japanese or
Korean, but I haven't tested.
I get the same.
- setting the default encoding to UTF-8 does not help.
- including the %!TEX encoding = UTF-8 statement does not make a
difference.
- Opening the file from the file menu and explicitly selecting UTF-8
for its encoding does not help.
- The file displays correctly in BBEdit, and is recognised there as
UTF-8.
I believe we have a bug in TeXShop.
What I see is that non-ASCII characters are displayed as multiple
characters, from the 128-255 value range in whatever encoding
TeXShop finally decides to use.
I'm on Leopard, so the OS did not introduce this change, afaics.
Richard, under which OS was 2.14 built?
I'll file a bug report.
Maarten
Howdy,

I've run into some similar things. Once the file is compromised there
is a problem getting it back.

Try to make a fresh file starting with a blank document and make sure
it has the

%%!TEX encoding = UTF-8 Unicode

line before saving it the first time. I've seen problems if I try to
change the encoding once it has been saved.

I once took a file, did a Select All... and Copy, then opened a New
document, Pasted and Saved (make sure you have the line above) and it
worked ok.
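
The same "fresh file" workaround could also be done outside the editor.
A rough Python sketch, assuming the text is already correctly decoded in
memory; the function name save_as_fresh_utf8 is made up for
illustration:

```python
from pathlib import Path

# The encoding line quoted in this thread (a single % also works).
DIRECTIVE = "%!TEX encoding = UTF-8 Unicode"

def save_as_fresh_utf8(text: str, target: Path) -> None:
    """Write text to target as UTF-8, with the encoding line first."""
    lines = text.splitlines()
    # Only add the directive if the first line does not already carry one.
    if not lines or "!TEX encoding" not in lines[0]:
        lines.insert(0, DIRECTIVE)
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Writing to a new file name rather than over the old file keeps the
original around in case something goes wrong.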

I think Dick Koch knows about some of this.

Good Luck,

Herb Schulz
(***@wideopenwest.com)
Jung-Tsung Shen
2007-12-05 22:33:23 UTC
Post by Herbert Schulz
Try to make a fresh file starting with a blank document and make sure
it has the
%%!TEX encoding = UTF-8 Unicode
line before saving it the first time. I've seen problems if I try to
change the encoding once it has been saved.
Herbert,

The line (with double %%) does help to preserve the Chinese characters when
the file is reopened. The encoding chosen when saving, however, seems to
be ineffective. I would think the encoding chosen on saving should do the
same trick. Or is it implemented this way so that it works across
platforms, text editors, etc.?

Thanks.

JT
Jung-Tsung Shen
2007-12-05 22:36:34 UTC
Post by Jung-Tsung Shen
Post by Herbert Schulz
Try to make a fresh file starting with a blank document and make sure
it has the
%%!TEX encoding = UTF-8 Unicode
line before saving it the first time. I've seen problems if I try to
change the encoding once it has been saved.
Herbert,
The line (with double %%) does help to preserve the Chinese characters
when reopened. The encoding method on saving, however, seem to be
ineffective. I would think the encoding method on saving should do the same
trick. Or, this is a way of implementation to accomplish cross platforms,
cross text editors, ... etc?
Thanks.
JT
The situation I am facing now is, to get the characters displayed correctly,
I have to include the line (%%!TEX ...) at the beginning of the document.
But when I submit the manuscript to the publisher (APS in question), should
I include this line as well? Will it cause any other side effect on APS'
side? [I know the only way to find out is to ask APS ... but I would
appreciate some comments from the point of view of LaTeX.]

JT
Herbert Schulz
2007-12-05 22:45:09 UTC
Post by Jung-Tsung Shen
The situation I am facing now is, to get the characters displayed
correctly, I have to include the line (%%!TEX ...) at the beginning
of the document. But when I submit the manuscript to the publisher
(APS in question), should I include this line as well? Will it cause
any other side effect on APS' side? [I know the only way to find out
is to ask APS ... but I would appreciate some comments from the
point of view of LaTeX.]
JT
Howdy,

The %% are really inconsequential, a single % should work also. That
line is only understood by TeXShop and is a comment as far as TeX
processing is concerned. Hopefully, the publisher knows that the
document is saved with UTF-8 encoding in case they have to manually
set up the encoding in their editor.

Good Luck,

Herb Schulz
(***@wideopenwest.com)
Maarten Sneep
2007-12-06 20:37:46 UTC
Post by Jung-Tsung Shen
Post by Jung-Tsung Shen
Post by Herbert Schulz
Try to make a fresh file starting with a blank document and make sure
it has the
%%!TEX encoding = UTF-8 Unicode
line before saving it the first time. I've seen problems if I try to
change the encoding once it has been saved.
The line (with double %%) does help to preserve the Chinese
characters when reopened. The encoding method on saving, however,
seem to be ineffective. I would think the encoding method on saving
should do the same trick. Or, this is a way of implementation to
accomplish cross platforms, cross text editors, ... etc?
This is a TeXShop-specific issue. Allan Odgaard (TextMate) claims that
UTF-8 can be recognised from a few bytes of text; see:
http://blog.macromates.com/2005/handling-encodings-utf-8/

Of course many still have legacy documents that are pure 8-bit encoded
files, especially true in the TeX world.
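
The idea can be sketched in a few lines of Python: try a strict UTF-8
decode first and only fall back to an 8-bit encoding when that fails.
This is an illustration of the general technique, not TextMate's or
TeXShop's actual code; the Latin-1 fallback and the function name are
assumptions:

```python
def decode_with_fallback(data: bytes, fallback: str = "latin-1") -> tuple[str, str]:
    """Return (text, encoding_used), trying strict UTF-8 first."""
    try:
        # Strict decoding rejects byte sequences that are not well-formed UTF-8.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumed legacy 8-bit encoding; a real editor would use its default.
        return data.decode(fallback), fallback
```

Pure ASCII files pass the UTF-8 test as well, which is harmless, and a
legacy 8-bit file with accented characters will normally fail it.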

The Cocoa Foundation framework now includes code to add the string
encoding to a plain text file as an extended attribute. This already
works in TextEdit in 10.5, and other editors are sure to follow. Note,
however, that this attribute may not survive networked transportation
(e.g. as a mail attachment).
Post by Jung-Tsung Shen
The situation I am facing now is, to get the characters displayed
correctly, I have to include the line (%%!TEX ...) at the beginning
of the document. But when I submit the manuscript to the publisher
(APS in question), should I include this line as well? Will it cause
any other side effect on APS' side? [I know the only way to find out
is to ask APS ... but I would appreciate some comments from the
point of view of LaTeX.]
Since
%!TEX encoding = UTF-8 Unicode
is just a comment as far as TeX is concerned, there is no harm in
this. It is human readable, so APS editors can use this as well.

By the way, I found out why I had a problem before: I mistyped the
encoding name (I left out the Unicode part). Apparently this causes
TeXShop to fall back to some 8-bit encoding, rather than the default
or whatever it was told to try in the file-open dialog box. I don't
think that is desirable behaviour, but at least I solved the issue.
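
To illustrate the behaviour I would have expected, here is a small
Python sketch that reads the encoding line and reports an unknown name
instead of silently choosing an 8-bit encoding. It is not how TeXShop
is implemented; the lookup table, the 20-line limit and the function
name are assumptions, and only the name "UTF-8 Unicode" comes from this
thread:

```python
import re

# Hypothetical table: only "UTF-8 Unicode" is taken from this thread.
KNOWN_ENCODINGS = {"UTF-8 Unicode": "utf-8"}

# One or more % signs, then "!TEX encoding = <name>".
DIRECTIVE = re.compile(r"^%+\s*!TEX\s+encoding\s*=\s*(.+?)\s*$")

def directive_encoding(source: str, max_lines: int = 20):
    """Return (name, codec) for the first encoding line, or None if absent.

    codec is None when the name is not in KNOWN_ENCODINGS.
    """
    for line in source.splitlines()[:max_lines]:
        match = DIRECTIVE.match(line)
        if match:
            name = match.group(1)
            return name, KNOWN_ENCODINGS.get(name)
    return None
```

A caller that gets a codec of None could then fall back to the default
encoding, or to the one chosen in the file-open dialog.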

Best,

Maarten
Jérome Laurens
2007-12-07 11:52:40 UTC
Post by Maarten Sneep
Since
%!TEX encoding = UTF-8 Unicode
is just a comment as far as TeX is concerned, there is no harm in
this. It is human readable, so APS editors can use this as well.
By the way I found out why I had a problem before. I mistyped the
encoding name (I left out the Unicode part). Apparently this causes
TeXShop to fall back to some 8-bit encoding, rather than the default
or whatever it was told to try in the file-open dialog box. I don't
think it is desirable behaviour, but at least I solved the issue.
Ask for case-insensitive support of the official IANA name: UTF-8
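
What such support could look like, as a Python sketch rather than
anything TeXShop does today: check the editor's own names first, then
accept any name the standard codec registry knows, ignoring case. The
table and the function name are assumptions for illustration:

```python
import codecs

# Hypothetical editor-specific names; only "UTF-8 Unicode" appears in this thread.
EDITOR_NAMES = {"utf-8 unicode": "utf-8"}

def resolve_encoding(name: str):
    """Map a directive value to a Python codec name, ignoring case; None if unknown."""
    key = name.strip().lower()
    if key in EDITOR_NAMES:
        return EDITOR_NAMES[key]
    try:
        # The codec registry accepts IANA-style names such as "utf-8".
        return codecs.lookup(key).name
    except LookupError:
        return None
```

With this, "UTF-8", "utf-8" and "UTF-8 Unicode" would all resolve to the
same codec, and a misspelt name would be reported as unknown.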
