Setting the JVM character encoding on the AS400

When running a Java application (on any system), the JVM can be started with different character encodings. Typically it will pick up the system default based on the OS and locale settings. For an excellent introduction to character encodings I can highly recommend Joel Spolsky’s article.

I recently encountered a problem with this on an AS400 where the Java application was trying to write a file out to the IFS file system. The filename in question contained Polish characters, or at least it was supposed to.

You can see the encoding in use by looking at the JVM properties against the job. Use WRKJOB to work with the job and take option 45 to Work with Java virtual machine. There are a couple of places you can then check.

  • Option 2 will show the system environment variables that were used to initialize the JVM
  • Option 7 will show the current Java system properties that are in use by the JVM
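If you would rather check from inside the application, the same values can be printed with a few lines of Java. This is only a minimal sketch: the class name EncodingReport is made up for illustration, and sun.jnu.encoding is a property I am assuming your JVM defines (it may well come back as null on some implementations).

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of any library.
class EncodingReport {

    // Collects the encoding-related settings visible from inside the JVM.
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("file.encoding", System.getProperty("file.encoding"));
        // Assumption: not every JVM defines this property, so the value may be null.
        report.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        report.put("defaultCharset", Charset.defaultCharset().name());
        return report;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Running this in the same job as the application should give values you can compare with what option 7 shows.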

Searching online, I saw many sources claiming that setting the JVM system property file.encoding sets the default character encoding. On the AS400 this can be achieved by adding the following line to the SystemDefault.properties file:

file.encoding=UTF8

Note: SystemDefault.properties can either reside in /QIBM/UserData/Java400 or in the user home directory for the user starting the JVM.

This may work when it comes to writing out the contents of a file but it has no effect on the encoding used to read or write filenames.
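To see why the encoding matters for a filename like mine, you can ask a charset whether it is able to represent the name at all. The sketch below is just an illustration of that idea, not a description of what the JVM does internally; the class name FilenameEncodingCheck and the sample filename are invented for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class FilenameEncodingCheck {

    // True if every character of the text can be represented in the given charset.
    static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // A made-up Polish filename, written with Unicode escapes so the
        // source file compiles the same way regardless of its own encoding.
        String name = "za\u017C\u00F3\u0142\u0107.txt";

        System.out.println("ISO-8859-1: " + canEncode(name, StandardCharsets.ISO_8859_1)); // prints false
        System.out.println("UTF-8:      " + canEncode(name, StandardCharsets.UTF_8));      // prints true
    }
}
```

If the encoding the JVM uses for filenames cannot represent a character, the name cannot be written out correctly, whatever encoding is used for the file contents.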

To properly influence the encoding the JVM is initialized with, you have to set the locale environment variables. In the Linux/Unix world this can be done with the LC_ALL environment variable, which can be set to something like en_US.UTF-8.

The AS400 isn’t a *nix platform, so LC_ALL does not apply, and the JVM is an IBM platform-specific implementation. By looking at the environment variables shown under option 45 in WRKJOB, and with some trial and error, I found that setting the following two environment variables did the trick.

Using ADDENVVAR add the following two environment variables:

QIBM_PASE_CCSID = 1208
PASE_LANG = EN_US

You can try different combinations of CCSID and locale depending on your desired character encoding. I set this up so we would be using UTF-8.
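When trying out different combinations, a quick way to check the result is to create a file with an awkward name and see whether it is listed back unchanged. The following is a rough sketch of such a check, assuming the job can create files in the default temporary directory; the class name FilenameRoundTrip and the sample filename are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

// Hypothetical helper class, not part of any library.
class FilenameRoundTrip {

    // True if a file created under fileName is listed back with exactly that name.
    static boolean survives(String fileName) {
        try {
            Path dir = Files.createTempDirectory("encoding-check");
            try {
                Files.createFile(dir.resolve(fileName));
                return listed(dir, fileName);
            } catch (InvalidPathException e) {
                // The JVM could not convert the name into a valid path at all.
                return false;
            } finally {
                cleanUp(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Looks for an entry in dir whose name equals fileName.
    private static boolean listed(Path dir, String fileName) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (entry.getFileName().toString().equals(fileName)) {
                    return true;
                }
            }
        }
        return false;
    }

    // Deletes the temporary directory and everything created in it.
    private static void cleanUp(Path dir) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                Files.delete(entry);
            }
        }
        Files.delete(dir);
    }

    public static void main(String[] args) {
        // Show what the JVM sees of the two variables (null if they are not set).
        System.out.println("QIBM_PASE_CCSID = " + System.getenv("QIBM_PASE_CCSID"));
        System.out.println("PASE_LANG = " + System.getenv("PASE_LANG"));

        // A made-up Polish filename, written with Unicode escapes.
        String name = "za\u017C\u00F3\u0142\u0107.txt";
        System.out.println("Filename survives round trip: " + survives(name));
    }
}
```

A result of false suggests the combination in use cannot represent the name, so it is worth trying another CCSID or locale.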

A list of locales can be found here. Note that the locale name is case sensitive: en_US is either ISO8859-1 or ISO8859-15, whereas EN_US is UTF-8.

 

Introduction

Hello World!

This is my first post on this blog, or any blog, so I thought I’d do a brief introduction and set out my aims. My name is Ben and I work as a software developer based in the UK. I recently started a small project to write a web app I could use to learn a bit of French. I use Java in my day job, so I was looking to try out some different frameworks that may be of use at work (Spring, Hibernate, that kind of thing).

As a programmer you’re constantly solving small problems, but once you get something working it’s all too easy to move on and forget. Even with version control systems it’s not always easy to find solutions from the past. If the codebase is large you have to remember what to search for, and sometimes you can’t even access the code any more. My aim then is to document these solutions so I have a record to refer back to. I decided to document this as a blog so that others could potentially read and use the same solutions.

I’ve decided not to call this a Java blog specifically because I may end up blogging about other languages or tools I come across.

Thanks

Ben