GG网络技术分享 2025-03-18 16:15 13
I have the following regex: ^http:\/\/(?!www\.)(.*)$
Expected behavior:
http://example.com - Match
http://www.example.com - Does not match
It looks like golang does not support negative lookahead. How can I rewrite this regex to work in golang?
UPDATE
I'm not coding in golang; I'm using Traefik, which accepts a regex (golang flavor) as a config value, so basically I have this:
regex = "^https://(.*)$"
replacement = "https://www.$1"
What I want is to always add www. to the URL, but NOT if the URL already has it; otherwise it would become www.www.*
Answers:
If you're really bent on creating a negative lookahead manually, you will need to exclude every possible run of leading w characters in the regexp:
^https?://(([^w].+|w(|[^w].*)|ww(|[^w].+)|www.+)\.)?example\.com$
This regexp allows any word followed by a dot before example.com, unless that word is just www. It does so by allowing any word that does not start with w or, if it starts with w, is either just that w or a w followed by a non-w character and other stuff. If it starts with ww, it must be either just ww or ww followed by a non-w character and more. If it starts with www, it must be followed by something.
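To see how this hand-written exclusion behaves, here is a minimal Go sketch that runs the pattern above against a few sample URLs. The helper name matchesNonWWW and the sample subdomains (api, w, ww, wwww) are illustrative assumptions, not part of the original answer.

```go
package main

import (
	"fmt"
	"regexp"
)

// manualLookahead is the pattern from the answer, copied unchanged.
// It uses only alternation, so Go's regexp package accepts it.
var manualLookahead = regexp.MustCompile(
	`^https?://(([^w].+|w(|[^w].*)|ww(|[^w].+)|www.+)\.)?example\.com$`)

// matchesNonWWW (hypothetical name) reports whether u is an example.com URL
// whose subdomain part, if present, is not exactly "www".
func matchesNonWWW(u string) bool {
	return manualLookahead.MatchString(u)
}

func main() {
	samples := []string{
		"http://example.com",
		"http://www.example.com",
		"http://api.example.com",
		"http://w.example.com",
		"http://ww.example.com",
		"http://wwww.example.com",
	}
	for _, u := range samples {
		fmt.Printf("%-26s %v\n", u, matchesNonWWW(u))
	}
}
```

Of these samples, only http://www.example.com should be rejected. The pattern is hard to read and is tied to one domain, which is why the simpler approaches in the other answers are usually preferable.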
The clarification makes this much, much easier. The approach is to always (optionally) match www. and then always put it back in the replacement:
Search:
^http://(?:www\.)?(.*)\b$
Replace:
http://www.$1
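The search and replace pair above can be tried directly with Go's regexp package. The following is a minimal sketch; the helper name addWWW is an assumption made for illustration, and it only exercises Go's regexp engine, not Traefik's own configuration handling.

```go
package main

import (
	"fmt"
	"regexp"
)

// optionalWWW is the search pattern from the answer: "www." is matched
// optionally and kept outside the capture group, so $1 never contains it.
var optionalWWW = regexp.MustCompile(`^http://(?:www\.)?(.*)\b$`)

// addWWW (hypothetical helper) rewrites u so that its host starts with "www.".
// URLs that do not match the pattern are returned unchanged.
func addWWW(u string) string {
	return optionalWWW.ReplaceAllString(u, "http://www.$1")
}

func main() {
	for _, u := range []string{
		"http://example.com",
		"http://www.example.com",
		"http://example.com/path",
	} {
		fmt.Printf("%s -> %s\n", u, addWWW(u))
	}
}
```

Because the optional group consumes an existing www. before the capture starts, applying the replacement twice gives the same result as applying it once, which avoids the www.www. problem described in the question.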
Go's regexp package uses the RE2 syntax, which doesn't support lookarounds of any kind.
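A quick way to confirm this is to compile the pattern from the question and inspect the error. This is a minimal sketch; the helper name compiles is an assumption for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles (hypothetical helper) reports whether Go's regexp package
// accepts the given pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// The negative lookahead (?!...) is rejected at compile time,
	// so regexp.Compile returns a non-nil error here.
	_, err := regexp.Compile(`^http://(?!www\.)(.*)$`)
	fmt.Println("lookahead pattern error:", err)

	// A non-capturing group (?:...) is supported.
	fmt.Println("optional group compiles:", compiles(`^http://(?:www\.)?(.*)$`))
}
```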
Since you are dealing with URLs, you can simply parse them and inspect the host part:
package main

import (
	"net/url"
	"strings"
	"testing"
)

// Match reports whether s is an http URL without user info
// whose host does not start with "www.".
func Match(s string) bool {
	u, err := url.Parse(s)
	switch {
	case err != nil:
		return false
	case u.Scheme != "http":
		return false
	case u.User != nil:
		return false
	}
	return !strings.HasPrefix(u.Host, "www.")
}

func TestMatch(t *testing.T) {
	testCases := []struct {
		URL  string
		Want bool
	}{
		{"http://example.com", true},
		{"http://wwwexample.com", true},
		{"http://www.example.com", false},
		{"http://user@example.com", false},
		{"http://user@www.example.com", false},
		{"www.example.com", false},
		{"example.com", false},
	}
	for _, tc := range testCases {
		if m := Match(tc.URL); m != tc.Want {
			t.Errorf("Match(%q) = %v; want %v", tc.URL, m, tc.Want)
		}
	}
}
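The same parsing approach can also perform the rewrite the question ultimately asks for, prefixing www. only when it is missing. This is a minimal sketch that applies only where Go code can be run (not inside a Traefik config value); the helper name ensureWWW and its error handling are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// ensureWWW (hypothetical helper) returns raw with "www." prepended to the
// host unless the host already starts with it. Inputs without a host, such
// as "example.com" with no scheme, are reported as errors.
func ensureWWW(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	if u.Host == "" {
		return "", fmt.Errorf("no host in %q", raw)
	}
	if !strings.HasPrefix(u.Host, "www.") {
		u.Host = "www." + u.Host
	}
	return u.String(), nil
}

func main() {
	for _, raw := range []string{
		"https://example.com/a?b=1",
		"https://www.example.com",
		"example.com",
	} {
		out, err := ensureWWW(raw)
		fmt.Printf("%q -> %q (err: %v)\n", raw, out, err)
	}
}
```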
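How one regular expression drove production CPU usage to 100%
Author: 陈树义 | This article comes from 陈树义's column on the 腾讯云+社区 (Tencent Cloud+ Community)
A few days ago, monitoring for one of our production projects suddenly reported an anomaly. After logging in to the machine and checking resource usage, we found CPU utilization at nearly 100%. Using the thread dump tool that ships with Java, we exported the stack traces for the problem.
We could see that every stack pointed to a method named validateUrl, and this same trace appeared more than 100 times in the dump. After going through the code, we learned that the method's main job is to check whether a URL is valid.
It seemed strange that a regular expression could keep CPU utilization so high. To reproduce and understand the problem, we extracted the key code and wrote a simple unit test.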