
[Efficiency/Algorithms] [Solved] Another regex problem: is there a way to get the kind of data shown in the picture?

Posted 2013-1-22 17:46:59
Last edited by huangke on 2013-1-22 21:21
<DIV id=content class=content mod-cs-content text-content clearfix>
<P><SPAN>2013/01/30</SPAN></P>
<P><SPAN><SPAN>2013/05/30</SPAN><BR></SPAN></P>
<P><SPAN><SPAN><SPAN>2013/12/30</SPAN><BR></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN>2014/01/01</SPAN><BR></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN>2013/06/01</SPAN><BR></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2014/01/01</SPAN></SPAN><BR></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2014/06/01</SPAN><BR></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2014/12/01</SPAN><BR></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2015/01/01</SPAN><BR></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2015/06/01</SPAN></SPAN><BR></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2015/12/01</SPAN></SPAN><BR></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2016/01/01</SPAN><BR></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>1</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>3</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>4</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>5</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>6</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>7</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>8</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>9</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>
<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>10</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P></DIV>
<DIV class=mod-tagbox clearfix>
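The data above is source code taken from a web page. I want to extract some of the values in it, but I have no idea where to start. I know StringRegExp is the tool for this, so it looks like I will have to spend some time learning odd symbols such as ? . ( ).
How do I get the data below? (That is, the values inside <span></span>, although many of the <span> and </span> tags are unmatched.)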

2013/01/30

2013/05/30

2013/12/30

2014/01/01

2013/06/01

2014/01/01

2014/06/01

2014/12/01

2015/01/01

2015/06/01

2015/12/01

2016/01/01

1

2

3

4

5

6

7

8

9

10

Posted 2013-1-22 18:00:32
(?i)SPAN>([\d\/]+)
Posted 2013-1-22 18:44:20
>([\d\/]+)
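A minimal sketch of how a pattern like the ones above might be applied with StringRegExp. The $sSample string is a shortened, hypothetical stand-in for the page source, and the variable names are assumptions chosen for illustration.

```autoit
#include <Array.au3>

; Shortened stand-in for the page source (assumption: the real input has the same shape)
Local $sSample = '<P><SPAN><SPAN>2013/05/30</SPAN><BR></SPAN></P>' & @CRLF & _
        '<P><SPAN><SPAN><SPAN>10</SPAN></SPAN></SPAN></P>'

; Flag 3 returns an array holding the captured group of every match
Local $aValues = StringRegExp($sSample, '>([\d\/]+)', 3)
If @error Then
    ConsoleWrite('No match' & @CRLF)
Else
    ConsoleWrite(_ArrayToString($aValues, @CRLF) & @CRLF)
EndIf
```

The shorter pattern does not name the tag, so it captures digits and slashes that directly follow the closing ">" of any tag, not only SPAN. On this data both patterns should pick up the same values.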
Posted 2013-1-22 20:24:05
Last edited by lpxx on 2013-1-22 20:25

Simpler: just strip the HTML tags.
StringRegExpReplace($str, '<[^>]*>', '', 0)
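A rough sketch of what the tag-stripping approach leaves behind, and one possible way to turn the remainder into a list. The sample string and the second step with '\S+' are assumptions added for illustration and are not part of the reply above.

```autoit
#include <Array.au3>

; Shortened stand-in for the page source (assumption, not the full data)
Local $sSample = '<P><SPAN>2013/01/30</SPAN></P>' & @CRLF & _
        '<P><SPAN><SPAN>2013/05/30</SPAN><BR></SPAN></P>'

; Delete every tag; only the values and the line breaks remain
Local $sText = StringRegExpReplace($sSample, '<[^>]*>', '')

; Optional second step: collect each run of non-whitespace into an array
Local $aValues = StringRegExp($sText, '\S+', 3)
If Not @error Then ConsoleWrite(_ArrayToString($aValues, @CRLF) & @CRLF)
```

The second step only works as intended when the values themselves contain no spaces, which holds for the dates and numbers in this thread.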

OP | Posted 2013-1-22 21:20:35
The expert above is amazing! Powerful! I'm going back to study regex properly.
OP | Posted 2013-1-22 21:54:08
afan posted on 2013-1-22 18:00

    afan's pattern does the opposite: the data is removed, a lot of tags are left behind, and the values are gone.
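Posted 2013-1-22 22:00:56
Last edited by afan on 2013-1-22 22:07
afan's pattern does the opposite: the data is removed, a lot of tags are left behind, and the values are gone.
huangke posted on 2013-1-22 21:54

    Didn't you want to "extract" certain data? Generally, StringRegExp() is used to get the matched data.
Of course, StringRegExpReplace() can also be used to strip away everything else, if you need to use the extracted data as a whole.

Regex can be used in two ways, matching and replacing; pick whichever is efficient and fits the need.
For example, if the values obtained above (2013/01/30, 2013/05/30, ...) need to be used individually (such as displaying them in a list), matching is the better fit; if they need to be used as a whole (such as writing them to a Txt file), replacing is the better fit.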
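OP | Posted 2013-1-22 22:17:53
Didn't you want to "extract" certain data? Generally, StringRegExp() is used to get the matched data.
Of course, you can also use  ...
afan posted on 2013-1-22 22:00

    Oh, I got it wrong... Moderator afan's pattern was right; the mistake was mine.
; $html holds the page source; _ArrayDisplay needs #include <Array.au3>
$array = StringRegExp($html, '(?i)SPAN>([\d\/]+)', 3)
_ArrayDisplay($array)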
Posted 2013-1-22 22:17:56
afan's pattern does the opposite: the data is removed, a lot of tags are left behind, and the values are gone.
huangke posted on 2013-1-22 21:54

    A working example
#include <Array.au3>
Local $Str = _
                '<DIV id=content class=content mod-cs-content text-content clearfix>' & @CRLF & _
                '<P><SPAN>2013/01/30</SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN>2013/05/30</SPAN><BR></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN>2013/12/30</SPAN><BR></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN>2014/01/01</SPAN><BR></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN>2013/06/01</SPAN><BR></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2014/01/01</SPAN></SPAN><BR></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2014/06/01</SPAN><BR></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2014/12/01</SPAN><BR></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2015/01/01</SPAN><BR></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2015/06/01</SPAN></SPAN><BR></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2015/12/01</SPAN></SPAN><BR></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2016/01/01</SPAN><BR></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>1</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>2</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>3</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>4</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>5</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>6</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>7</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>8</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>9</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P>' & @CRLF & _
                '<P><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN><SPAN>10</SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></SPAN></P></DIV>' & @CRLF & _
                '<DIV class=mod-tagbox clearfix>'

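; Matching: flag 3 returns every captured value as an array element
Local $Test = StringRegExp($str, '(?i)SPAN>([\d\/]+)', 3)
If @Error Then Exit
_ArrayDisplay($Test, 'Needed as a list')

; Replacing: strip every tag and keep the remaining text as one string
Local $Test1 = StringRegExpReplace($str, '<.*?>', '')
MsgBox(0, 'Needed as a whole', $Test1)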
OP | Posted 2013-1-23 00:30:11
Haha, excellent! Saving this, and off to study hard!