So we have some really inefficient code that splits a PDF into smaller chunks based on a maximum allowed size. In other words, if the maximum size is 10 MB, an 8 MB file is skipped, while a 16 MB file is split based on its page count.

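This is code I inherited, and it feels like there must be a more efficient way to do this, using only one method and fewer object instantiations.

We use the following code to call the methods: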


    List<int> splitPoints = null;
    List<byte[]> documents = null;

    splitPoints = this.GetpdfSplitPoints(currentdocument, maxSize);
    documents = this.Splitpdf(currentdocument, maxSize, splitPoints);


    private List<int> GetpdfSplitPoints(IClaimdocument currentdocument, int maxSize)
    {
        List<int> splitPoints = new List<int>();
        PdfReader reader = null;
        Document document = null;
        int pagesRemaining = currentdocument.Pages;

        while (pagesRemaining > 0)
        {
            reader = new PdfReader(currentdocument.Data);
            document = new Document(reader.GetPageSizeWithRotation(1));

            using (MemoryStream ms = new MemoryStream())
            {
                PdfCopy copy = new PdfCopy(document, ms);
                PdfImportedPage page = null;

                document.Open();

                //Add pages until we run out from the original
                for (int i = 0; i < currentdocument.Pages; i++)
                {
                    int currentPage = currentdocument.Pages - (pagesRemaining - 1);

                    if (pagesRemaining == 0)
                    {
                        //The whole document has been traversed
                        break;
                    }

                    page = copy.GetImportedPage(reader, currentPage);
                    copy.AddPage(page);

                    //If the current collection of pages exceeds the maximum size, we save off the index and start again
                    if (copy.CurrentDocumentSize > maxSize)
                    {
                        if (i == 0)
                        {
                            //One page is greater than the maximum size
                            throw new Exception("one page is greater than the maximum size and cannot be processed");
                        }

                        //We have gone one page too far, save this split index
                        splitPoints.Add(currentdocument.Pages - (pagesRemaining - 1));
                        break;
                    }
                    else
                    {
                        pagesRemaining--;
                    }
                }

                page = null;

                document.Close();
                document.Dispose();
                copy.Close();
                copy.Dispose();
                copy = null;
            }
        }

        if (reader != null)
        {
            reader.Close();
            reader = null;
        }

        document = null;

        return splitPoints;
    }

    private List<byte[]> Splitpdf(IClaimdocument currentdocument, int maxSize, List<int> splitPoints)
    {
        var documents = new List<byte[]>();
        PdfReader reader = null;
        Document document = null;
        MemoryStream fs = null;
        int pagesRemaining = currentdocument.Pages;

        while (pagesRemaining > 0)
        {
            reader = new PdfReader(currentdocument.Data);
            document = new Document(reader.GetPageSizeWithRotation(1));
            fs = new MemoryStream();

            PdfCopy copy = new PdfCopy(document, fs);
            PdfImportedPage page = null;

            document.Open();

            //Add pages until we run out from the original
            for (int i = 0; i <= currentdocument.Pages; i++)
            {
                int currentPage = currentdocument.Pages - (pagesRemaining - 1);

                if (pagesRemaining == 0)
                {
                    //We have traversed all pages
                    //The call to copy.Close() MUST come before using fs.ToArray() because copy.Close() finalizes the document
                    fs.Flush();
                    copy.Close();
                    documents.Add(fs.ToArray());
                    document.Close();
                    fs.Dispose();
                    break;
                }

                page = copy.GetImportedPage(reader, currentPage);
                copy.AddPage(page);
                pagesRemaining--;

                if (splitPoints.Contains(currentPage + 1) == true)
                {
                    //Need to start a new document
                    //The call to copy.Close() MUST come before using fs.ToArray() because copy.Close() finalizes the document
                    fs.Flush();
                    copy.Close();
                    documents.Add(fs.ToArray());
                    document.Close();
                    fs.Dispose();
                    break;
                }
            }

            copy = null;
            page = null;

            fs.Dispose();
        }

        if (reader != null)
        {
            reader.Close();
            reader = null;
        }

        if (document != null)
        {
            document.Close();
            document.Dispose();
            document = null;
        }

        if (fs != null)
        {
            fs.Close();
            fs.Dispose();
            fs = null;
        }

        return documents;
    }




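Basically, this is part of a loop that goes through any number of PDFs, splits them, and stores them in the database. Right now, we have had to change our approach from processing all of them at once (the last run was 97 PDFs of various sizes) to running 5 PDFs through the system every 5 minutes. This is not ideal, and it will not scale well as we roll the tool out to more clients.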

(We are dealing with 50-100 MB PDFs, but they could be larger.)

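Solution

I also inherited this exact code, and it appears to have a major flaw. The GetpdfSplitPoints method checks the running total size of the copied pages against maxSize to decide at which page to split the file.

In the Splitpdf method, when it reaches the page where a split occurs, the MemoryStream at that point is indeed below the maximum allowed size, and one more page would push it over the limit. But after document.Close() is executed, much more gets added to the MemoryStream (in one PDF I worked with, the MemoryStream's length went from 9 MB before document.Close() to 19 MB after it). My understanding is that all the resources required by the copied pages are added on Close.

I'm guessing I will have to rewrite this code completely to ensure that I do not exceed the maximum size while preserving the integrity of the original pages.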
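One possible direction for such a rewrite, offered only as a rough sketch: choose the split points from the size of each chunk after it has been finalized, rather than from the running size while pages are still being added. The code below is not the original author's method and does not call iTextSharp at all. `PdfChunkPlanner`, `PlanChunks`, and the `measureClosedSize` callback are hypothetical names introduced here; the callback is assumed to build pages `first` through `last` into a finished document, close it, and return the resulting byte count (for example, the length of the MemoryStream after Close). The binary search also assumes that a chunk never gets smaller when another page is appended.

```csharp
using System;
using System.Collections.Generic;

public static class PdfChunkPlanner
{
    // Returns inclusive, 1-based page ranges whose finalized size stays within maxSize.
    // measureClosedSize is a hypothetical callback: it must return the size in bytes
    // of a fully closed document that contains pages first..last.
    public static List<(int First, int Last)> PlanChunks(
        int pageCount, long maxSize, Func<int, int, long> measureClosedSize)
    {
        var chunks = new List<(int First, int Last)>();
        int first = 1;

        while (first <= pageCount)
        {
            // A single page that is already too large cannot be split any further.
            if (measureClosedSize(first, first) > maxSize)
            {
                throw new InvalidOperationException(
                    "Page " + first + " alone exceeds the maximum size.");
            }

            // Binary search for the largest 'last' that still fits.
            // Assumption: the finalized size does not shrink as pages are added.
            int lo = first;
            int hi = pageCount;
            while (lo < hi)
            {
                int mid = lo + (hi - lo + 1) / 2;
                if (measureClosedSize(first, mid) <= maxSize)
                {
                    lo = mid;       // pages first..mid fit, try a longer range
                }
                else
                {
                    hi = mid - 1;   // too large, try a shorter range
                }
            }

            chunks.Add((first, lo));
            first = lo + 1;
        }

        return chunks;
    }
}
```

As a simulated check, a callback that charges 4 units of shared overhead per chunk plus 3 units per page, `(first, last) => 4 + 3L * (last - first + 1)`, with 7 pages and a limit of 10 yields the ranges 1-2, 3-4, 5-6, and 7-7. The trade-off is that every probe builds and closes a trial document, so this spends extra copying work to get sizes that reflect what is actually written on Close.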

