How do I save Chinese characters to a file using Java?

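There are several factors at work here:

- Text files have no intrinsic metadata describing their encoding (for all the talk of the angle-bracket tax, XML is popular for a reason).
- The default encoding on Windows is still an 8-bit (or double-byte) "ANSI" character set with a limited range of values; text files written in this format are not portable.
- To tell a Unicode file from an ANSI file, Windows applications rely on a byte order mark (BOM) at the start of the file (this is not strictly true; Raymond Chen explains). In theory, the BOM tells you the endianness (byte order) of the data. UTF-8 has only one byte order, yet Windows applications still rely on the marker bytes to work out automatically that a file is Unicode (though you will notice that Notepad has an encoding option in its open/save dialogs).
- It is wrong to say that Java is broken because it does not write a UTF-8 BOM automatically. On Unix systems, for example, writing a BOM to a script file would be an error, and many Unix systems use UTF-8 as their default encoding. There are times when you do not want one on Windows either, such as when appending data to an existing file: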
```
fos = new FileOutputStream(FileName,Append);
```
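
The points above about byte order marks can be checked directly. The following is a minimal sketch (the class name `BomBytesDemo` and the helper `hex` are illustrative, not part of the original answer) that prints the bytes Java produces when encoding text as UTF-8, and the bytes that U+FEFF becomes in each of the three UTF encodings discussed here:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class BomBytesDemo {
    // Formats bytes as upper-case hex pairs separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two Chinese characters encoded as UTF-8: no marker is added.
        System.out.println(hex("\u4E0A\u6D77".getBytes(StandardCharsets.UTF_8)));
        // prints: E4 B8 8A E6 B5 B7

        // Encoding U+FEFF in each charset yields that charset's BOM bytes.
        Charset[] charsets = {
            StandardCharsets.UTF_8, StandardCharsets.UTF_16LE, StandardCharsets.UTF_16BE
        };
        for (Charset cs : charsets) {
            System.out.println(cs.name() + ": " + hex("\uFEFF".getBytes(cs)));
        }
        // prints: UTF-8: EF BB BF
        //         UTF-16LE: FF FE
        //         UTF-16BE: FE FF
    }
}
```

Because the encoder adds nothing on its own, any BOM has to be written explicitly by the program, which is also why appending to an existing file needs care.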

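Here is a method that reliably appends UTF-8 data to a file:

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.nio.charset.Charset;

    private static void writeUtf8ToFile(File file, boolean append, String data)
        throws IOException {
      // Skip the BOM when appending to a file that already has content.
      boolean skipBOM = append && file.isFile() && (file.length() > 0);
      Closer res = new Closer();
      try {
        OutputStream out = res.using(new FileOutputStream(file, append));
        Writer writer = res.using(new OutputStreamWriter(out, Charset.forName("UTF-8")));
        if (!skipBOM) {
          writer.write('\uFEFF'); // U+FEFF is encoded as the UTF-8 BOM bytes EF BB BF
        }
        writer.write(data);
      } finally {
        res.close();
      }
    }

Usage:

    public static void main(String[] args) throws IOException {
      String chinese = "\u4E0A\u6D77"; // the two characters of "Shanghai"
      boolean append = true;
      writeUtf8ToFile(new File("chinese.txt"), append, chinese);
    }

Note: if the file already exists, you choose to append, and the existing data is not UTF-8 encoded, the only thing this code will create is a mess.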
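
One way to reduce the risk described in the note is to check whether the existing bytes decode as UTF-8 before appending. The sketch below is an assumption about how such a check might look, not part of the original answer; `Utf8Check` and `isValidUtf8` are illustrative names. It uses a `CharsetDecoder` configured to report errors instead of silently replacing bad input.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8Check {
    // Returns true if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Convenience overload: reads the whole file into memory,
    // so it only suits files that comfortably fit in the heap.
    static boolean isValidUtf8(Path path) throws IOException {
        return isValidUtf8(Files.readAllBytes(path));
    }
}
```

This is only a heuristic: plain ASCII text, and occasionally other legacy-encoded text, also passes as valid UTF-8, so a `true` result does not prove the file was written as UTF-8.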

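This is the Closer type used in this code:

    import java.io.Closeable;
    import java.io.IOException;

    public class Closer implements Closeable {
      // Only the most recently registered resource is kept; closing the
      // outermost wrapper stream also closes the streams it wraps.
      private Closeable closeable;

      public <T extends Closeable> T using(T t) {
        closeable = t;
        return t;
      }

      @Override public void close() throws IOException {
        if (closeable != null) {
          closeable.close();
        }
      }
    }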
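
On Java 7 and later, try-with-resources can play the role of a helper like Closer. The following is a minimal sketch of the same append logic under that assumption; `AppendUtf8` and `appendUtf8` are illustrative names, and the use of `java.nio.file.Files` is a substitution for the stream classes in the code above, not the original author's method.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

class AppendUtf8 {
    // Appends data as UTF-8, writing a BOM only when the file is new or empty.
    static void appendUtf8(Path path, String data) throws IOException {
        boolean skipBom = Files.isRegularFile(path) && Files.size(path) > 0;
        // The writer is closed automatically, even if a write fails.
        try (BufferedWriter writer = Files.newBufferedWriter(path, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            if (!skipBom) {
                writer.write('\uFEFF');
            }
            writer.write(data);
        }
    }

    public static void main(String[] args) throws IOException {
        appendUtf8(Paths.get("chinese.txt"), "\u4E0A\u6D77");
    }
}
```

The same caveat applies as before: if the existing file is not UTF-8, appending UTF-8 text to it produces mixed, unreadable content.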

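This code makes a Windows-style best guess at how to read the file, based on the byte order mark:

    import java.io.BufferedInputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.Reader;
    import java.nio.charset.Charset;

    private static final Charset[] UTF_ENCODINGS = { Charset.forName("UTF-8"),
        Charset.forName("UTF-16LE"), Charset.forName("UTF-16BE") };

    // Expects a stream that supports mark/reset, such as BufferedInputStream.
    private static Charset getEncoding(InputStream in) throws IOException {
      charsetLoop: for (Charset encoding : UTF_ENCODINGS) {
        byte[] bom = "\uFEFF".getBytes(encoding);
        in.mark(bom.length);
        for (byte b : bom) {
          if ((0xFF & b) != in.read()) {
            in.reset(); // no match: rewind and try the next encoding
            continue charsetLoop;
          }
        }
        return encoding; // BOM matched and has been consumed
      }
      return Charset.defaultCharset(); // no BOM: fall back to the platform default
    }

    private static String readText(File file) throws IOException {
      Closer res = new Closer();
      try {
        InputStream in = res.using(new FileInputStream(file));
        InputStream bin = res.using(new BufferedInputStream(in));
        Reader reader = res.using(new InputStreamReader(bin, getEncoding(bin)));
        StringBuilder out = new StringBuilder();
        for (int ch = reader.read(); ch != -1; ch = reader.read())
          out.append((char) ch);
        return out.toString();
      } finally {
        res.close();
      }
    }

Usage:

    public static void main(String[] args) throws IOException {
      System.out.println(readText(new File("chinese.txt")));
    }

(System.out uses the default encoding, so whether it prints anything sensible depends on your platform and configuration.)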
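
If the console is known to accept UTF-8, one possible workaround is to wrap `System.out` in a `PrintStream` with an explicit encoding. This is a sketch under that assumption only; `Utf8Console` and `utf8` are illustrative names, and whether the characters actually display correctly still depends on the terminal's own encoding and font.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8Console {
    // Wraps a stream in an auto-flushing PrintStream that encodes text as UTF-8.
    static PrintStream utf8(OutputStream out) throws UnsupportedEncodingException {
        return new PrintStream(out, true, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        PrintStream out = utf8(System.out);
        // Emits the UTF-8 bytes for the two characters, followed by a line separator.
        out.println("\u4E0A\u6D77");
    }
}
```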

Original article: http://outofmemory.cn/zaji/5489782.html
