Regular expressions: generating URL permutations in Ruby

# Returns four variants of a URL: with and without a trailing slash,
# with and without "www". Useful when checking whether a URL passed in
# as a parameter already exists, since every permutation must be checked.
def url_permutations(url)
  # split off the query string (if any); limit 2 keeps any later "?" intact
  url, params = url.split("?", 2)
  # without trailing slash, with www
  a = url.gsub(%r{/$}, "").gsub(%r{http(s)?://([^/]+)}) do
    protocol = "http#{$1}"
    domain = $2
    domain = "www.#{domain}" if domain.split(".").length < 3 # www.google.com == 3, google.com == 2
    "#{protocol}://#{domain}"
  end
  # with trailing slash, with www
  b = "#{a}/"
  # without trailing slash, without www
  # (\1 is the group captured by this call, not a stale $1 from the call above)
  c = a.gsub(%r{(https?)://www\.}, '\1://')
  # with trailing slash, without www
  d = "#{c}/"

  # re-attach the query string only when the input had one
  [a, b, c, d].map { |u| params ? "#{u}?#{params}" : u }
end
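
The same four variants can also be built with URI from Ruby's standard library instead of regular expressions. The following is only a minimal sketch, not the original author's method. It assumes an absolute http or https URL with no explicit port or user info, and `url_permutations_via_uri` is a hypothetical name that does not appear in the snippet above.

```ruby
require "uri"

# Hypothetical helper (not part of the original snippet): builds the four
# variants from parsed URL components rather than by pattern matching.
# Assumes an absolute http(s) URL without an explicit port or user info.
def url_permutations_via_uri(url)
  uri = URI.parse(url)
  host = uri.host
  # same rule as the snippet: add "www." only to two-label hosts
  with_www = host.split(".").length < 3 ? "www.#{host}" : host
  without_www = with_www.sub(/\Awww\./, "")
  path = uri.path.chomp("/")                 # path without a trailing slash
  query = uri.query ? "?#{uri.query}" : ""   # keep the query only if present

  [with_www, without_www].flat_map do |h|
    base = "#{uri.scheme}://#{h}#{path}"
    [base, "#{base}/"].map { |u| "#{u}#{query}" }
  end
end

puts url_permutations_via_uri("https://google.com/search/?q=ruby")
# https://www.google.com/search?q=ruby
# https://www.google.com/search/?q=ruby
# https://google.com/search?q=ruby
# https://google.com/search/?q=ruby
```

Working from parsed components keeps the query string separate from the path, so the trailing slash is always added or removed before the `?`.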

# This snippet is from http://outofmemory.cn

Tags: ruby, networking
