fix(route): Fixed an issue where Huxiu subscriptions were being blocked by the WAF #21706

Closed
cnlhl wants to merge 3 commits into DIYgod:master from cnlhl:master
cnlhl commented Apr 13, 2026

Involved Issue / 该 PR 相关 Issue

Close #

Example for the Proposed Route(s) / 路由地址示例

/huxiu/club/1000                                          
/huxiu/article/429087.html                     
/huxiu/brief/301102.html  

New RSS Route Checklist / 新 RSS 路由检查表

  • New Route / 新的路由
  • Anti-bot or rate limit / 反爬/频率限制
    • If yes, does your code reflect this? / 如果有, 是否有对应的措施?
  • Date and time / 日期和时间
    • Parsed / 可以解析
    • Correct time zone / 时区正确
  • New package added / 添加了新的包
  • Puppeteer

Note / 说明

Huxiu has enabled Alibaba Cloud WAF, so standard HTTP requests are intercepted and redirected to a
verification page. As a result, the actual __NUXT_DATA__ payload can no longer be retrieved.

This update changes the fetchItem function to render pages with Puppeteer instead of got, so that requests
pass the WAF verification:

  • Use getPuppeteerPage to retrieve the full rendered page content
  • Block non-essential resources (images, CSS, etc.) to speed up loading
  • Support both __NUXT_DATA__ and __INITIAL_STATE__ data formats
  • Add fallback HTML parsing for the brief page
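
The resource blocking and the two data formats listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names shouldBlock and extractEmbeddedState are hypothetical, the blocked resource types are an assumed selection of Puppeteer's resourceType() values, and the sketch assumes the __INITIAL_STATE__ assignment holds plain JSON, which may not be true of the real page.

```typescript
// Hypothetical helpers for illustration only; not taken from the PR diff.

// Resource types that carry no article data. The strings are values
// returned by Puppeteer's HTTPRequest.resourceType(); the selection is an assumption.
const BLOCKED_RESOURCE_TYPES = new Set(['image', 'stylesheet', 'font', 'media']);

// Meant to be called from a request-interception handler, for example:
//   page.on('request', (req) => (shouldBlock(req.resourceType()) ? req.abort() : req.continue()));
function shouldBlock(resourceType: string): boolean {
    return BLOCKED_RESOURCE_TYPES.has(resourceType);
}

type EmbeddedState = {
    format: '__NUXT_DATA__' | '__INITIAL_STATE__';
    data: unknown;
};

// Try the __NUXT_DATA__ script tag first, then the window.__INITIAL_STATE__ assignment.
// Returns null when neither is present or parseable, so the caller can fall back
// to parsing the rendered HTML directly.
function extractEmbeddedState(html: string): EmbeddedState | null {
    const nuxt = html.match(/<script[^>]*id="__NUXT_DATA__"[^>]*>([\s\S]*?)<\/script>/);
    if (nuxt) {
        try {
            return { format: '__NUXT_DATA__', data: JSON.parse(nuxt[1]) };
        } catch {
            // Not valid JSON; try the other format.
        }
    }

    // Assumption: the assigned value is plain JSON and ends right before </script>.
    const initial = html.match(/window\.__INITIAL_STATE__\s*=\s*(\{[\s\S]*?\})\s*;?\s*<\/script>/);
    if (initial) {
        try {
            return { format: '__INITIAL_STATE__', data: JSON.parse(initial[1]) };
        } catch {
            // Not valid JSON; give up and let the caller use the HTML fallback.
        }
    }

    return null;
}
```

Returning null instead of throwing keeps the HTML fallback for brief pages a normal code path rather than an error handler.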

lihaolin and others added 2 commits April 13, 2026 14:39
The huxiu.com server has enabled Alibaba Cloud WAF which blocks
regular HTTP requests. Changed fetchItem to use puppeteer for
rendering pages to bypass the WAF verification.

- Use getPuppeteerPage instead of got for detail pages
- Block unnecessary resources (images, CSS) to speed up loading
- Add fallback HTML parsing for brief pages
- Support both __NUXT_DATA__ and __INITIAL_STATE__ formats

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
github-actions bot added the "route" and "auto: not ready to review" (Users can't get the RSS feed output according to automated testing results) labels Apr 13, 2026

@github-actions (Contributor)

Successfully generated as following:

http://localhost:1200/huxiu/club/1000 - Failed ❌
HTTPError: Response code 503 (Service Unavailable)

Error Message: Error: [rebrowser-patches] acquireContextId failed (tryAgain = false, tryCount = 1), errorMessage: Protocol error (Runtime.addBinding): Session closed. Most likely the page has been closed.
Route: /huxiu/club/:id
Full Route: /huxiu/club/1000
Node Version: v24.14.1
Git Hash: 2fb0f04a
http://localhost:1200/huxiu/article/429087.html - Failed ❌
HTTPError: Response code 503 (Service Unavailable)

Error Message: NotFoundError:
Route: /huxiu/article/429087.html
Full Route: /huxiu/article/429087.html
Node Version: v24.14.1
Git Hash: 2fb0f04a
http://localhost:1200/huxiu/brief/301102.html - Failed ❌
HTTPError: Response code 503 (Service Unavailable)

Error Message: NotFoundError:
Route: /huxiu/brief/301102.html
Full Route: /huxiu/brief/301102.html
Node Version: v24.14.1
Git Hash: 2fb0f04a

@github-actions (Contributor)

Auto Review

No clear rule violations found in the current diff.

cnlhl (Author) commented Apr 13, 2026

Hi maintainers, I've updated the Huxiu route to use getPuppeteerPage to bypass the Aliyun WAF protection.

The code works on my local machine (screenshot attached).

However, it fails in the CI environment with a 503 response and an "acquireContextId failed" error. I suspect that either the GitHub Actions IP is blocked outright by the Aliyun WAF, or Puppeteer is crashing under the CI container's resource limits.

Could you please review and test it locally? Let me know if there's a better way to handle strict WAF routes in CI.

Use a single Puppeteer page to clear Huxiu's WAF and fetch club brief detail HTML in small in-browser batches, avoiding browserless concurrency spikes while keeping detail parsing local.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
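
The batching idea in the commit message above can be sketched as follows. This is a rough illustration under stated assumptions rather than the PR's implementation: chunk and fetchInBatches are hypothetical names, and fetchOne stands in for whatever fetches one brief's detail HTML (in the PR this presumably runs inside the already-verified Puppeteer page).

```typescript
// Hypothetical helpers for illustration only; not taken from the PR diff.

// Split a list into consecutive groups of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
    if (size < 1) {
        throw new RangeError('size must be >= 1');
    }
    const groups: T[][] = [];
    for (let i = 0; i < items.length; i += size) {
        groups.push(items.slice(i, i + size));
    }
    return groups;
}

// Run `fetchOne` over all items, at most `batchSize` at a time.
// Requests inside one batch run concurrently; batches run one after another,
// which caps the number of in-flight requests at `batchSize`.
async function fetchInBatches<T, R>(
    items: T[],
    batchSize: number,
    fetchOne: (item: T) => Promise<R>
): Promise<R[]> {
    const results: R[] = [];
    for (const batch of chunk(items, batchSize)) {
        const batchResults = await Promise.all(batch.map((item) => fetchOne(item)));
        results.push(...batchResults);
    }
    return results;
}
```

Results come back in input order, so each fetched detail can be matched to its list entry by index. A single failed request rejects the whole call. If partial results are preferable, Promise.allSettled could replace Promise.all.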
github-actions bot removed the "auto: not ready to review" (Users can't get the RSS feed output according to automated testing results) label Apr 15, 2026

@github-actions (Contributor)

Successfully generated as following:

http://localhost:1200/huxiu/club/1000 - Success ✔️
<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" version="2.0">
  <channel>
    <title>虎嗅报童-虎嗅网</title>
    <link>https://www.huxiu.com/club/1000.html</link>
    <atom:link href="http://localhost:1200/huxiu/club/1000" rel="self" type="application/rss+xml"></atom:link>
    <description>获取最新行业资讯,十五分钟尽知天下事。本栏目由虎嗅内容运营团队出品。 - Powered by RSSHub</description>
    <generator>RSSHub</generator>
    <webMaster>contact@rsshub.app (RSSHub)</webMaster>
    <itunes:author>虎嗅网</itunes:author>
    <itunes:category text="News"></itunes:category>
    <itunes:explicit>false</itunes:explicit>
    <language>en</language>
    <image>
      <url>https://img.huxiucdn.com/img/brief/202305/08/172636853912.png</url>
      <title>虎嗅报童-虎嗅网</title>
      <link>https://www.huxiu.com/club/1000.html</link>
    </image>
    <lastBuildDate>Wed, 15 Apr 2026 08:12:33 GMT</lastBuildDate>
    <ttl>5</ttl>
    <item>
      <title>霍尔木兹通航美军续封伊港;王石否认被抓传闻;Lululemon中国产品不含有害物质;央行开展5000亿元逆回购</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292664.html</link>
      <guid isPermaLink="false">huxiu-brief-292664</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>巴防长称美伊谈判即将开启;张勇将为强制买礼道歉;茅台回应王莉被调查;国新办介绍一季度进出口情况</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292660.html</link>
      <guid isPermaLink="false">huxiu-brief-292660</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>油价大涨黄金跌破4650美元;王石回应被抓传言;云南农信社回应招聘质疑;国务院举行政策吹风会</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292656.html</link>
      <guid isPermaLink="false">huxiu-brief-292656</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>伊朗最高领袖发布声明;美宜佳解约606家加盟商;李佳琦直播间回应未婚不退休;美伊谈判在伊斯兰堡开启</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292649.html</link>
      <guid isPermaLink="false">huxiu-brief-292649</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>霍尔木兹海峡再度关闭;阿里云CTO由李飞飞出任;新能安拟与张雪机车合作;两高发布民航安全司法解释</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292644.html</link>
      <guid isPermaLink="false">huxiu-brief-292644</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>伊朗同意停火将与美谈判;成品油价格适度调整;海康威视否认监控漏洞传闻;特朗普与军方召开发布会</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292638.html</link>
      <guid isPermaLink="false">huxiu-brief-292638</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>伊朗回应美停战提议提十条款;苹果折叠屏手机开始试产;易中天获厦大最高荣誉;中国驻以使馆4月7日集中撤离</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292630.html</link>
      <guid isPermaLink="false">huxiu-brief-292630</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>伊朗称袭击迪拜数据中心阿联酋否认;三部门约谈三家平台企业;与辉同行回应优思益全额退款;港股美股今日休市</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292624.html</link>
      <guid isPermaLink="false">huxiu-brief-292624</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>特朗普称霍尔木兹开放才停火;核查跨境电商优思益违规营销;张雪机车禁新手买820RR遭投诉;阿尔忒弥斯2号载人绕月发射</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292618.html</link>
      <guid isPermaLink="false">huxiu-brief-292618</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>特朗普称两三周内结束伊朗战事;中巴提出中东和平五点倡议;税务回应鞠婧祎涉税;医保改革药店购药新规</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292613.html</link>
      <guid isPermaLink="false">huxiu-brief-292613</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>特朗普欲于4月6日前与伊朗达协议;贵州茅台时隔两年再涨价;长安医院回应高价慰问果篮;3月PMI采购经理指数发布</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292607.html</link>
      <guid isPermaLink="false">huxiu-brief-292607</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>美军速决地面战方案曝光;内存条价格大幅下跌;单依纯回应李白版权问题;G7财长商议释放石油储备</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292600.html</link>
      <guid isPermaLink="false">huxiu-brief-292600</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>伊朗回应美停火并提条件;月之暗面拟赴港上市;斯柯达退出中国大众回应;中关村论坛AI主题日开幕</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292593.html</link>
      <guid isPermaLink="false">huxiu-brief-292593</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>伊朗拒绝美停战提五条件;黄天鹅公布检测结果;信达证券所长涉猥亵女下属;WTO第14届部长级会议开幕</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292586.html</link>
      <guid isPermaLink="false">huxiu-brief-292586</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>张雪峰逝世;Token中文名正式定为词元;美团致歉承担全部损失;2026中关村论坛年会开幕</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292581.html</link>
      <guid isPermaLink="false">huxiu-brief-292581</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>伊美将在巴基斯坦会谈;大疆起诉影石;寿司郎门头沟店异物未检出寄生虫;阿里今日将发布新芯片</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292577.html</link>
      <guid isPermaLink="false">huxiu-brief-292577</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>以军称将加大南黎地面攻势;微信上线龙虾官方插件;OpenClaw创始人证实360漏洞;成品油价格上调</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292571.html</link>
      <guid isPermaLink="false">huxiu-brief-292571</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>黄金逼近4500美元白银跌超13%;央行维护金融市场稳定;女子体检收三份不同CT报告;华为发布数据存储新品</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292563.html</link>
      <guid isPermaLink="false">huxiu-brief-292563</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>鲍威尔称美经济不确定降息或减;美股三大指数齐跌油价大涨;美团回应北大毕业生送外卖;华为2026合作伙伴大会开幕</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292558.html</link>
      <guid isPermaLink="false">huxiu-brief-292558</guid>
      <author>虎嗅早报</author>
    </item>
    <item>
      <title>特朗普称美或应退出北约;库克否认退休传闻;黄天鹅回应角黄素质疑;腾讯将公布2025年四季报</title>
      <description></description>
      <link>https://www.huxiu.com/brief/292552.html</link>
      <guid isPermaLink="false">huxiu-brief-292552</guid>
      <author>虎嗅早报</author>
    </item>
  </channel>
</rss>
http://localhost:1200/huxiu/article/429087.html - Failed ❌
HTTPError: Response code 503 (Service Unavailable)

Error Message: NotFoundError:
Route: /huxiu/article/429087.html
Full Route: /huxiu/article/429087.html
Node Version: v24.14.1
Git Hash: 5892fa6c
http://localhost:1200/huxiu/brief/301102.html - Failed ❌
HTTPError: Response code 503 (Service Unavailable)

Error Message: NotFoundError:
Route: /huxiu/brief/301102.html
Full Route: /huxiu/brief/301102.html
Node Version: v24.14.1
Git Hash: 5892fa6c

github-actions bot added the "auto: not ready to review" (Users can't get the RSS feed output according to automated testing results) label Apr 15, 2026

@github-actions (Contributor)

Auto Review

No clear rule violations found in the current diff.
