C# - Why is the number of concurrent downloads limited?

I am trying to write my own simple web scraper. I want to download the files with specific extensions that are linked from a URL. I wrote the following code:

    private void button1_Click(object sender, RoutedEventArgs e)
    {
        if (bw.IsBusy) return;
        bw.DoWork += new DoWorkEventHandler(bw_DoWork);
        bw.RunWorkerAsync(new string[] { URL.Text, SavePath.Text, Filter.Text });
    }
    //--------------------------------------------------------------------------------------------
    void bw_DoWork(object sender, DoWorkEventArgs e)
    {
        try
        {
            ThreadPool.SetMaxThreads(4, 4);
            string[] strs = e.Argument as string[];
            // Matches <a ... href="..."> ... </a>; group 2 captures the href value.
            Regex reg = new Regex("<a(\\s*[^>]*?){0,1}\\s*href\\s*\\=\\s*\\\"([^>]*?)\\\"\\s*[^>]*>(.*?)</a>", RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.IgnoreCase);
            int i = 0;
            string domainS = strs[0];
            string Extensions = strs[2];
            string OutDir = strs[1];
            var domain = new Uri(domainS);
            string[] Filters = Extensions.Split(new char[] { ';', ',', ' ' }, StringSplitOptions.RemoveEmptyEntries);
            string outPath = System.IO.Path.Combine(OutDir, string.Format("file_{0}.html", i));

            // Fetch the page and collect every link on it.
            WebClient webClient = new WebClient();
            string str = webClient.DownloadString(domainS);
            str = str.Replace("\r\n", " ").Replace('\n', ' ');
            MatchCollection mc = reg.Matches(str);
            int NumOfThreads = mc.Count;

            // Download every link whose target ends with one of the requested extensions.
            Parallel.ForEach(mc.Cast<Match>(), new ParallelOptions { MaxDegreeOfParallelism = 2, }, mat =>
            {
                string val = mat.Groups[2].Value;
                var link = new Uri(domain, val);
                foreach (string ext in Filters)
                    if (val.EndsWith("." + ext))
                    {
                        Download((object)new object[] { OutDir, link });
                        break;
                    }
            });
            throw new Exception("Finished !");
        }
        catch (System.Exception ex)
        {
            ReportException(ex);
        }
        finally
        {
        }
    }
    //--------------------------------------------------------------------------------------------
    private static void Download(object o)
    {
        try
        {
            object[] objs = o as object[];
            Uri link = (Uri)objs[1];
            string outPath = System.IO.Path.Combine((string)objs[0], System.IO.Path.GetFileName(link.ToString()));
            if (!File.Exists(outPath))
            {
                //WebClient webClient = new WebClient();
                //webClient.DownloadFile(link, outPath);
                DownloadFile(link.ToString(), outPath);
            }
        }
        catch (System.Exception ex)
        {
            ReportException(ex);
        }
    }
    //--------------------------------------------------------------------------------------------
    private static bool DownloadFile(string url, string filePath)
    {
        try
        {
            HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(url);
            request.UserAgent = "Web Crawler";
            request.Timeout = 40000;
            WebResponse response = request.GetResponse();
            Stream stream = response.GetResponseStream();
            using (FileStream fs = new FileStream(filePath, FileMode.CreateNew))
            {
                // Copy the response body to disk in 1000-byte chunks.
                const int siz = 1000;
                byte[] bytes = new byte[siz];
                for (; ; )
                {
                    int count = stream.Read(bytes, 0, siz);
                    fs.Write(bytes, 0, count);
                    if (count == 0) break;
                }
                fs.Flush();
                fs.Close();
            }
        }
        catch (System.Exception ex)
        {
            ReportException(ex);
            return false;
        }
        finally
        {
        }
        return true;
    }

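The problem is that although it works with 2 parallel downloads:

    new ParallelOptions { MaxDegreeOfParallelism = 2, }

...it does not work with a higher degree of parallelism, such as:

    new ParallelOptions { MaxDegreeOfParallelism = 5, }

...I get connection timeout exceptions.

At first I thought it was because of WebClient:

    //WebClient webClient = new WebClient();
    //webClient.DownloadFile(link, outPath);

...but when I replaced it with the function DownloadFile, which uses HttpWebRequest, I still got the error.

I have tested this on many web pages and nothing changed. I also confirmed, using the Chrome extension "Download Master", that these web servers allow multiple parallel downloads.
Does anyone know why I get timeouts when trying to download multiple files in parallel?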

Solution

You need to set ServicePointManager.DefaultConnectionLimit. The default number of concurrent connections to the same host is 2. See also the related post on using web.config connectionManagement.
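
As a minimal sketch of that fix, the limit can be raised once at startup, before the first request is created. The class name ConnectionLimitSetup, the method RaiseConnectionLimit and the value 5 are assumptions chosen for illustration and do not come from the original answer. This applies to the .NET Framework networking stack (WebClient, HttpWebRequest) that the question uses.

```csharp
using System;
using System.Net;

static class ConnectionLimitSetup
{
    // Hypothetical helper (not part of the original answer): call it once,
    // before any WebClient or HttpWebRequest is created for the target host.
    public static void RaiseConnectionLimit(int maxConnectionsPerHost)
    {
        // Only ServicePoint objects created after this assignment pick up
        // the new value; already existing ones keep their old limit.
        ServicePointManager.DefaultConnectionLimit = maxConnectionsPerHost;
    }
}
```

In the question's code, a call such as ConnectionLimitSetup.RaiseConnectionLimit(5) would go at the top of bw_DoWork, ahead of webClient.DownloadString, with the value chosen to be at least the MaxDegreeOfParallelism in use. The configuration-file route mentioned in the answer sets the same limit declaratively through the connectionManagement element under system.net.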

Original address: http://outofmemory.cn/langs/1243651.html
