H.merijn brand 2007-04-14 04:34:27
That is based on both encoding and the font you use on the terminal.
By default, Spreadsheet::Read does not change the encoding, which means
that if the fields are encoded in Unicode (utf8), you should take action
in your script to output Unicode.
In summary, if your terminal is capable of dealing with UTF-8 (like a
recent X11R6 xterm with utf8 enabled and font *-iso10646-1), then
binmode STDOUT, ":utf8";
will probably suffice. If your terminal is iso8859-*, which also
supports the e-acute, then you will have to take appropriate actions
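For illustration, here is a minimal sketch of both cases, assuming the only difference between them is which encoding layer goes on the output handle. The helper name set_output_encoding and the variable $term_encoding are made up for this example; they are not part of Spreadsheet::Read.

```perl
use strict;
use warnings;

# Hypothetical helper: attach an encoding layer to an output handle so that
# Perl character strings are converted to that encoding when printed.
sub set_output_encoding {
    my ($fh, $encoding) = @_;
    binmode($fh, ":encoding($encoding)")
        or die "binmode failed: $!";
    return $fh;
}

# Assumption: you know what your terminal expects. Use 'UTF-8' for a
# UTF-8 terminal, or 'iso-8859-1' for a Latin-1 terminal.
my $term_encoding = 'UTF-8';

set_output_encoding(\*STDOUT, $term_encoding);
print "caf\x{e9}\n";    # "cafe" with an e-acute, as a character string
```

The ":encoding(UTF-8)" layer is a stricter relative of ":utf8"; either should work for plain output like this.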
I think that the csv file is OK already. Try opening it in whatever
Unicode-enabled editor (I think both M$Word and M$Excel will do here)
and see how it looks.
Mumia W. 2007-04-14 04:34:39
I have no problems outputting accented characters from
Spreadsheet::Read. Either your Perl or your terminal is not able to deal
with the accented characters.
Try placing "use encoding 'iso-8859-1';" at the top of your program.
Recent versions of Perl (>= 5.8) should be able to handle character
encodings well, but you might have to set up your locale properly, and
you might have to configure your terminal to display those characters.
Al 2007-04-14 09:05:46
thanks guys.. very helpful. thanks also for referring me to that good
Adding this line to my perl program did the trick:
binmode STDOUT, ":utf8";
I had to do the same with the filehandle of the file I write to.
Then, with TextWrangler, if I open that resulting file in UTF-8 mode,
it looks perfect, accent marks and all.
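A rough sketch of the filehandle part, assuming the output is opened with an encoding layer rather than calling binmode afterwards (both should have the same effect). The helper write_utf8_lines and the temp file are only for illustration, and the line is written as-is with no CSV quoting.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write each line to $path, encoded as UTF-8.
sub write_utf8_lines {
    my ($path, @lines) = @_;
    open(my $out, '>:encoding(UTF-8)', $path)
        or die "Cannot open $path: $!";
    print {$out} "$_\n" for @lines;
    close($out) or die "Cannot close $path: $!";
    return scalar @lines;
}

# A throwaway file stands in for the real output file.
my (undef, $path) = tempfile(UNLINK => 1);
write_utf8_lines($path, "caf\x{e9},cr\x{e8}me");
```

Reading the file back with '<:encoding(UTF-8)' should give the same character strings that were written.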
I’m using Perl 5.8.6 on Mac OS X
thanks so much!
Harryfmudd 2007-04-14 13:43:48
Maybe you should look into your Terminal window settings. Use menu
Terminal/Window Settings … and select “Display”.
Al 2007-04-16 03:34:44
Any suggestions for handling Asian characters from the original Excel?
Perl’s binmode setting handles accented characters fine, but once you
go beyond 8-bit characters (code points above 255), it seems that the
Spreadsheet::Read Perl module may have no way of knowing what Excel’s
encoding is.
I’d like to input an excel that has Asian characters, process with
perl, and then write a csv or xml file (utf-8 encoded) with proper
Harryfmudd 2007-04-16 13:13:21
I’m not an expert on non-ASCII character sets, so the following is
somewhat provisional. But the thread has been fallow for about a day and
a half, and I figure if I say something horribly wrong someone will jump
at the opportunity to correct me.
Anyhow, this is what I _think_ the situation is.
I’ve never used Spreadsheet::Read, but the docs look like it’s an
umbrella module, and under the hood it selects the correct module to
read the spreadsheet you gave it. The docs also seem to say that for
Excel it’s Spreadsheet::ParseExcel.
Spreadsheet::ParseExcel apparently will take a filehandle instead of a
spreadsheet name, giving you the opportunity to set the encoding you
want when you open the input file or when you binmode() it. See the docs
I could have sworn I saw documentation somewhere in the Encode-related
modules for a subroutine that would try to guess the encoding of a chunk
of text, but at the moment I can’t find it.
Harryfmudd 2007-04-16 18:25:08
It’s Encode::Guess. Duh.
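A small sketch of how Encode::Guess might be used, assuming you have a chunk of raw bytes and a short list of candidate encodings. The helper name decode_guessed is made up for this example, and the Shift-JIS sample is built in the script itself rather than read from a spreadsheet.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Encode::Guess;

# Hypothetical helper: guess the encoding of $octets and decode them.
# @suspects are tried in addition to the defaults (ascii, utf8, BOMed UTF-16/32).
sub decode_guessed {
    my ($octets, @suspects) = @_;
    my $enc = guess_encoding($octets, @suspects);
    # On success we get an encoding object; on failure, an error string.
    die "Cannot guess encoding: $enc\n" unless ref $enc;
    return $enc->decode($octets);
}

# Three Japanese characters, encoded as Shift-JIS to simulate raw input.
my $original = "\x{65e5}\x{672c}\x{8a9e}";
my $octets   = encode('shiftjis', $original);

my $text = decode_guessed($octets, 'shiftjis');

binmode STDOUT, ':encoding(UTF-8)';
print $text eq $original ? "round trip ok\n" : "mismatch\n";
```

Guessing is heuristic: with several similar suspects (for example both euc-jp and shiftjis) the call can come back with an "a or b" style error string instead of an object, so keeping the suspect list short helps.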