Number of lines in a file in Java

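This is the fastest version I have found so far, about 6 times faster than readLines. On a 150MB log file it takes 0.35 seconds, versus 2.40 seconds when using readLines(). Just for fun, Linux's wc -l command takes 0.15 seconds.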

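import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public static int countLinesOld(String filename) throws IOException {
    InputStream is = new BufferedInputStream(new FileInputStream(filename));
    try {
        byte[] c = new byte[1024];
        int count = 0;
        int readChars = 0;
        boolean empty = true;
        // read the file in 1024-byte chunks and count newline bytes
        while ((readChars = is.read(c)) != -1) {
            empty = false;
            for (int i = 0; i < readChars; ++i) {
                if (c[i] == '\n') {
                    ++count;
                }
            }
        }
        // a non-empty file without any newline still counts as one line
        return (count == 0 && !empty) ? 1 : count;
    } finally {
        is.close();
    }
}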
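For reference, a readLines-style baseline could look like the sketch below. The text does not say which readLines() was measured, so this assumes java.nio.file.Files.readAllLines as a stand-in; the class name ReadAllLinesBaseline and the helper countLinesOfText are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical baseline: load every line into memory, then take the list size.
// Assumes Files.readAllLines as a stand-in for the unspecified readLines().
class ReadAllLinesBaseline {

    static int countLines(Path file) {
        try {
            List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
            return lines.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Illustrative helper: writes the text to a temporary file and counts its lines.
    static int countLinesOfText(String text) {
        try {
            Path tmp = Files.createTempFile("linecount", ".txt");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                return countLines(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countLinesOfText("a\nb\nc\n")); // prints 3
    }
}
```

Because this approach builds a String for every line and keeps the whole list in memory, it plausibly does much more work than scanning raw bytes for '\n', which fits the timing gap reported above.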

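EDIT, 9 1/2 years later: I have practically no Java experience, but I have nevertheless tried to benchmark this code against the LineNumberReader solution, since it bothered me that nobody had done it. It seems that my solution is faster, especially for large files, although it takes a few runs until the optimizer does a decent job. I have played with the code a bit and produced a new version that is consistently the fastest: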

public static int countLinesNew(String filename) throws IOException {
    InputStream is = new BufferedInputStream(new FileInputStream(filename));
    try {
        byte[] c = new byte[1024];

        int readChars = is.read(c);
        if (readChars == -1) {
            // bail out if nothing to read
            return 0;
        }

        // make it easy for the optimizer to tune this loop
        int count = 0;
        while (readChars == 1024) {
            for (int i = 0; i < 1024;) {
                if (c[i++] == '\n') {
                    ++count;
                }
            }
            readChars = is.read(c);
        }

        // count remaining characters
        while (readChars != -1) {
            for (int i = 0; i < readChars; ++i) {
                if (c[i] == '\n') {
                    ++count;
                }
            }
            readChars = is.read(c);
        }

        // a non-empty file without any newline still counts as one line
        return count == 0 ? 1 : count;
    } finally {
        is.close();
    }
}
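Note that this counts '\n' bytes rather than logical lines, so the result depends on whether the last line ends with a newline. The sketch below applies the same counting rule to an in-memory stream so the edge cases are easy to check; NewlineCountDemo and its methods are illustrative names, not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class NewlineCountDemo {

    // Same rule as countLinesNew: 0 for empty input, otherwise the number of
    // '\n' bytes, reported as 1 when the input has content but no newline.
    static int countLines(InputStream in) throws IOException {
        byte[] buf = new byte[1024];
        int n = in.read(buf);
        if (n == -1) {
            return 0;
        }
        int count = 0;
        while (n != -1) {
            for (int i = 0; i < n; i++) {
                if (buf[i] == '\n') {
                    count++;
                }
            }
            n = in.read(buf);
        }
        return count == 0 ? 1 : count;
    }

    // Illustrative helper so the rule can be tried on string literals.
    static int countLinesOfText(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try {
            return countLines(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countLinesOfText(""));       // 0: empty input
        System.out.println(countLinesOfText("abc"));    // 1: content, no newline
        System.out.println(countLinesOfText("a\nb\n")); // 2: two terminated lines
        System.out.println(countLinesOfText("a\nb"));   // 1: last line is unterminated
    }
}
```

The last case shows the trade-off: an unterminated final line is only counted when it is the sole line, which matches wc -l for files that contain at least one newline.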

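Benchmark results for a 1.3GB text file, y axis in seconds. I performed 100 runs on the same file and measured each run with System.nanoTime(). The results show that countLinesOld has a few outliers and countLinesNew has none; although countLinesNew is only slightly faster, the difference is statistically significant. LineNumberReader is clearly slower.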
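The LineNumberReader variant that was benchmarked is not shown here. A typical version, together with a System.nanoTime() measurement, might look like the sketch below. The class LineNumberReaderSketch, the helpers timeNanos and countLinesOfText, and the in-memory sample input are all assumptions made for illustration; this is not the benchmark setup used above.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class LineNumberReaderSketch {

    // Reads the input line by line and returns the reader's line counter.
    // The try-with-resources block also closes the reader that was passed in.
    static int countLines(Reader source) throws IOException {
        try (LineNumberReader reader = new LineNumberReader(source)) {
            while (reader.readLine() != null) {
                // nothing to do: the reader tracks the line number itself
            }
            return reader.getLineNumber();
        }
    }

    // Illustrative helper so the counter can be tried on in-memory text.
    static int countLinesOfText(String text) {
        try {
            return countLines(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Times a single call with System.nanoTime(); returns elapsed nanoseconds.
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Build a synthetic input of 100,000 short lines (hypothetical test data).
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            sb.append("line ").append(i).append('\n');
        }
        String sample = sb.toString();

        // Repeat the measurement: the first runs are usually slower until the
        // JIT compiler has optimized the loop.
        for (int run = 0; run < 5; run++) {
            long nanos = timeNanos(() -> countLinesOfText(sample));
            System.out.printf("run %d: %.3f ms%n", run, nanos / 1e6);
        }
    }
}
```

Repeating the measurement matters for the reason mentioned earlier: the optimizer needs a few runs before the timings settle, so a single measurement can be misleading.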

Feel free to share; when reposting, please credit the source: 内存溢出

Original URL: http://outofmemory.cn/zaji/5014194.html
