UTF-8


Lance Feagan (111) | asked Sep 10 '09, 1:01 p.m.
Hi, I am beginning to use RTC for source code management and am very impressed so far. I did run into one "hiccup": a file that a Windows user had checked into our Subversion repository, and that I was pulling across to RTC, contained the (R) character in ISO-8859-1 encoding. Eclipse expected the file to be UTF-8 and reported an error when I tried to do the initial load.
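
To illustrate why the load failed, here is a minimal Python sketch (not something taken from RTC or Eclipse; the names `REGISTERED` and `decodes_as_utf8` are made up for this example). In ISO-8859-1 the (R) character is the single byte 0xAE, which on its own is not a well-formed UTF-8 sequence:

```python
# The registered-trademark sign (R) is Unicode code point U+00AE.
REGISTERED = "\u00ae"

latin1_bytes = REGISTERED.encode("iso-8859-1")  # one byte: 0xAE
utf8_bytes = REGISTERED.encode("utf-8")         # two bytes: 0xC2 0xAE


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is a well-formed UTF-8 byte sequence.

    Hypothetical helper for illustration only.
    """
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

A lone 0xAE is a UTF-8 continuation byte with no lead byte before it, so a strict UTF-8 decoder rejects it, while the two-byte form decodes fine.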

Thankfully, I was able to quickly track down the culprit files and fix them using a combination of the find, file, and recode/iconv commands. I then added them into the checkin that I wanted to do, created a baseline, and all was well.
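
For anyone hitting the same thing, the find/file/iconv steps can be roughly approximated in Python. This is only a sketch under assumptions: the helper names `find_non_utf8` and `convert_to_utf8` are hypothetical, it assumes every file that fails UTF-8 decoding really is ISO-8859-1, and it only looks at `*.java` files by default.

```python
from pathlib import Path
from typing import List


def find_non_utf8(root: Path, pattern: str = "*.java") -> List[Path]:
    """Return files under root matching pattern that are not valid UTF-8."""
    bad = []
    for path in sorted(root.rglob(pattern)):
        if not path.is_file():
            continue
        try:
            # Read raw bytes so no platform default encoding gets involved.
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            bad.append(path)
    return bad


def convert_to_utf8(path: Path, source_encoding: str = "iso-8859-1") -> None:
    """Rewrite path in place as UTF-8.

    Assumption: the file is currently encoded in source_encoding.
    """
    text = path.read_bytes().decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))
```

A cautious way to use it would be to print the result of `find_non_utf8(Path("src"))` first (`src` is a placeholder), review the list by hand, and only then call `convert_to_utf8` on each file.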

This incident got me thinking, however, about character encoding issues and managing a project.
1) For starters, if I am focusing solely on Eclipse projects, should I require my developers to all use UTF-8 encoding for Java files?
2) Are there any cases where I would need to allow ISO-8859-1 encoding with Java files?
3) How could I make RTC's SCM disallow ISO-8859-1 and/or only allow UTF-8 encoding on files being checked in?

Thanks in advance for any good ideas on what the best practices are and how to implement them.

One answer



Andrew Hoo (1.0k1) | answered Sep 14 '09, 4:08 p.m.
JAZZ DEVELOPER
I think you're pretty close to finding your own best practices. In my
personal opinion, if you start trying to enforce things too tightly, you
might make your life harder than it needs to be. To answer your
questions:

1) The SCM team here for RTC uses UTF-8 for all of our source. We don't
'require' it strictly but we found that it helped to just give the entire
team an informal common practice to follow. And we haven't had any
problems. So you could do the same with your developers.

2) Are there any cases where you need to allow the other character set? I
don't know; do you use characters that are not supported by UTF-8? (UTF-8
can encode every character in ISO-8859-1, so that seems unlikely.) It
sounds like you converted many of your files and are happy with the
results, so I think whatever works for you sounds great.

3) There are no mechanisms to block or allow checkins of a certain
encoding. However, if a file has been given a specific encoding (from a
previous checkin) and you then munge the file so that it is no longer a
valid stream for that encoding, the checkin will fail.
While this might be an annoying hiccup, as you discovered on the initial
load, I find that my day-to-day use never requires me to think about the
encoding of a file. I would probably have to work pretty hard to go and
change an encoding to accidentally or maliciously disrupt my team. But if
you feel like there should be more control here, feel free to open an
enhancement request.
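
As an illustration only (this is not how RTC implements the check, and `first_invalid_offset` is a hypothetical helper), validating content against a declared encoding could look something like this in Python. Note that the ISO-8859-1 codec accepts every byte value, so a check of this kind can only flag files declared as UTF-8 that contain stray bytes, never the reverse.

```python
from typing import Optional


def first_invalid_offset(data: bytes, declared_encoding: str) -> Optional[int]:
    """Return the offset of the first undecodable byte, or None if data is valid.

    Hypothetical helper: declared_encoding stands in for whatever encoding
    was recorded for the file on a previous checkin.
    """
    try:
        data.decode(declared_encoding)
        return None
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that could not be decoded.
        return err.start
```

Reporting the offset rather than a plain yes/no makes it easier to find the offending character in a large file.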



