David Dundua Blog
Java FileWriter and UTF-8
Java is just full of surprises (and that is not a good thing). The last thing a developer wants is to be surprised by a language, especially if the behavior is different across platforms. I was trying to run a piece of code on my Mac today, and noticed a problem with UTF-8 characters. The same piece of code works perfectly on our Test servers running Ubuntu and Sun’s JVM.
So after a little bit of poking around I narrowed it down to a problem with FileWriter. It turns out that when you instantiate a FileWriter object, it uses your platform's default character encoding, not UTF-8. I had something like this:
import java.io.File;
import java.io.FileWriter;

// UTF-8 characters appear as question marks on OSX
FileWriter writer = new FileWriter(new File("your-output-file-name.pdf"));
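To see where the question marks come from, here is a minimal sketch of what happens when text is encoded with a charset that cannot represent one of its characters. It uses US-ASCII as a stand-in for a limited platform default (an assumption for illustration; the actual default on your machine may differ), and the class and method names are hypothetical. The non-ASCII character is written as a Unicode escape so the source file itself does not depend on any encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DefaultCharsetDemo {
    // Encodes the text to bytes with the given charset, then decodes those
    // bytes again with the same charset. Characters the charset cannot
    // represent are replaced during encoding.
    public static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // This is the encoding that new FileWriter(File) picks up.
        System.out.println("Platform default charset: " + Charset.defaultCharset());

        String text = "caf\u00e9"; // "café"

        // US-ASCII has no mapping for \u00e9, so it becomes '?'.
        System.out.println("caf?".equals(roundTrip(text, StandardCharsets.US_ASCII)));

        // UTF-8 can represent every Unicode character, so the text survives.
        System.out.println(text.equals(roundTrip(text, StandardCharsets.UTF_8)));
    }
}
```

Running it on each machine and comparing the first line of output is a quick way to confirm whether the platform defaults really differ.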
So, instead of using FileWriter, one needs to use its parent class, OutputStreamWriter, which lets you specify the encoding explicitly:
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;

FileOutputStream fileStream = new FileOutputStream(new File("your-output-file-name.pdf"));
// Pass the encoding explicitly instead of relying on the platform default
OutputStreamWriter writer = new OutputStreamWriter(fileStream, "UTF-8");
// now UTF-8 characters are appearing properly in the output
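For completeness, here is a self-contained sketch of the same fix that you can run as-is. It is one possible arrangement, not the only one: the class name, the helper methods, and the temp file are made up for illustration. It uses StandardCharsets.UTF_8 rather than the "UTF-8" string (which avoids the checked UnsupportedEncodingException) and try-with-resources so the writer is always closed and flushed.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

final class Utf8WriteDemo {
    // Writes the text as UTF-8, regardless of the platform default charset.
    public static void writeUtf8(File file, String text) throws IOException {
        try (Writer writer = new OutputStreamWriter(
                new FileOutputStream(file), StandardCharsets.UTF_8)) {
            writer.write(text);
        }
    }

    // Reads the whole file back, decoding its bytes as UTF-8.
    public static String readUtf8(File file) throws IOException {
        return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical temp file, used only for this demonstration.
        File file = File.createTempFile("utf8-demo", ".txt");
        file.deleteOnExit();

        String text = "caf\u00e9"; // "café"
        writeUtf8(file, text);

        // The text read back matches what was written.
        System.out.println(text.equals(readUtf8(file))); // prints true
    }
}
```

If you are on Java 11 or later, FileWriter also has constructors that accept a Charset, so new FileWriter(file, StandardCharsets.UTF_8) should work as well.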