Example Code: Automatically Filling In Classical Chinese Poetry Blanks in Go

2020-01-28 13:12:16 丽君

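The program currently crawls data from two sites, 百度汉语 (Baidu Hanyu) and 古诗文网 (Gushiwen). If a better data source turns up, you only need to implement the Spider interface and register the new implementation in the MapSpiderManifest() function.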


type Spider interface {
	GetContent(SearchResult) (string, error)
	FindContent(string, string) (SearchResult, error)
}

func MapSpiderManifest() map[string]Spider {
	// Initialize and register every Spider
	spiderMap := make(map[string]Spider)

	// Baidu Hanyu
	baiduSpider := new(BaiduSpider)
	spiderMap["baiduSpider"] = baiduSpider

	// Gushiwen (古诗文网)
	gushiwenSpider := new(GushiwenSpider)
	spiderMap["gushiwenSpider"] = gushiwenSpider
	return spiderMap
}
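As a rough illustration of the extension point described above, the sketch below registers one extra data source. The fields of SearchResult, the StaticSpider type and its in-memory data are all assumptions made for this example, since the article does not show them; treating the two string parameters of FindContent as the hint clauses before and after a blank is likewise a guess.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// SearchResult is a hypothetical stand-in; the article does not show its fields.
type SearchResult struct {
	Title string
	URL   string
}

// Spider mirrors the interface from the article.
type Spider interface {
	GetContent(SearchResult) (string, error)
	FindContent(string, string) (SearchResult, error)
}

// StaticSpider is a made-up data source backed by an in-memory map
// (title -> full text), so the example runs without network access.
type StaticSpider struct {
	poems map[string]string
}

// FindContent returns a poem that contains either hint clause.
// Interpreting the parameters as hint clauses is an assumption.
func (s *StaticSpider) FindContent(pre, post string) (SearchResult, error) {
	for title, text := range s.poems {
		if (pre != "" && strings.Contains(text, pre)) ||
			(post != "" && strings.Contains(text, post)) {
			return SearchResult{Title: title, URL: "memory://" + title}, nil
		}
	}
	return SearchResult{}, errors.New("no poem matches the hints")
}

// GetContent returns the full text for a previously found result.
func (s *StaticSpider) GetContent(r SearchResult) (string, error) {
	text, ok := s.poems[r.Title]
	if !ok {
		return "", fmt.Errorf("unknown poem %q", r.Title)
	}
	return text, nil
}

// MapSpiderManifest registers the spiders by name, as in the article.
// The article's BaiduSpider and GushiwenSpider would be registered here too.
func MapSpiderManifest() map[string]Spider {
	spiderMap := make(map[string]Spider)
	spiderMap["staticSpider"] = &StaticSpider{poems: map[string]string{
		"春夜洛城闻笛": "谁家玉笛暗飞声，散入春风满洛城。此夜曲中闻折柳，何人不起故园情。",
	}}
	return spiderMap
}

func main() {
	for name, spider := range MapSpiderManifest() {
		result, err := spider.FindContent("", "何人不起故园情")
		if err != nil {
			fmt.Println(name, "error:", err)
			continue
		}
		text, _ := spider.GetContent(result)
		fmt.Println(name, result.Title, text)
	}
}
```

Because callers only see the Spider interface, adding a source this way should not require changes anywhere except the manifest.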



2. Finding the poem lines


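Writing out classical poems from memory is an exercise we all did plenty of at school: a sentence is lifted out of a poem, and a few of its clauses are blanked out for the student to fill in. The questions generally fall into the following patterns: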

Blank at the start: _，[_，...]，何人不起故园情。

Blank at the end: 俱往矣，_，[_，...]。

Blank in the middle: 月出于东山之上，_，白露横江，
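The three patterns above can be reduced to a single representation: a sequence of slots, each of which is either a hint clause or a blank. The sketch below assumes that a blank is written as a single underscore and that clauses are separated by Chinese or ASCII commas and full stops. Slot, ParseQuestion and Pattern are names invented for this example, and the bracketed `[_，...]` notation for "possibly more blanks" is not handled.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Slot is a hypothetical representation of one clause in a question.
type Slot struct {
	Text    string // hint text; empty for a blank
	IsBlank bool
}

// clauseSep matches the separators assumed for this example.
var clauseSep = regexp.MustCompile(`[,，。.]`)

// ParseQuestion splits a question on separators and marks "_" clauses as blanks.
func ParseQuestion(q string) []Slot {
	var slots []Slot
	for _, part := range clauseSep.Split(q, -1) {
		part = strings.TrimSpace(part)
		if part == "" {
			continue // a trailing separator leaves an empty piece
		}
		if part == "_" {
			slots = append(slots, Slot{IsBlank: true})
		} else {
			slots = append(slots, Slot{Text: part})
		}
	}
	return slots
}

// Pattern reports where the blanks sit: "start", "end", "middle" or "none".
// A question with blanks at both ends is reported as "start".
func Pattern(slots []Slot) string {
	hasBlank := false
	for _, s := range slots {
		if s.IsBlank {
			hasBlank = true
			break
		}
	}
	switch {
	case !hasBlank:
		return "none"
	case slots[0].IsBlank:
		return "start"
	case slots[len(slots)-1].IsBlank:
		return "end"
	default:
		return "middle"
	}
}

func main() {
	questions := []string{
		"_，何人不起故园情。",
		"俱往矣，_，_。",
		"月出于东山之上,_,白露横江,",
	}
	for _, q := range questions {
		fmt.Println(Pattern(ParseQuestion(q)), q)
	}
}
```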

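Whatever the pattern, if we look at each blank individually, we can only work out its answer when there is a hint clause immediately before or after it. A blank like that can find its answer on its own, so let us call it a self-resolving blank (自主空). A blank with no hint clause on either side has to wait until a neighboring self-resolving blank has been answered before its own answer can be found. A figure makes this clearer: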

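The gray blocks in the figure have a hint clause next to them, so their answers can be looked up in the article text crawled in step 1 and filled into Blank. The lookup algorithm is shown in the following code: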




// Given newFind.PreString (the clause before the blank), fill in BlankString and PostString.
// There is no early exit, so if the hint occurs more than once the last occurrence wins.
func makeWithPreContent(contentsSplit []string, newFind *Find) {
	for l := range contentsSplit {
		if isEqual(contentsSplit[l], newFind.PreString) && l < len(contentsSplit)-1 {
			newFind.BlankString = contentsSplit[l+1]
			if l < len(contentsSplit)-2 {
				newFind.PostString = contentsSplit[l+2]
			}
			newFind.BlankFinish = true
		}
	}
}

// Given newFind.PostString (the clause after the blank), fill in BlankString and PreString.
func makeWithPostContent(contentsSplit []string, newFind *Find) {
	for l := range contentsSplit {
		if isEqual(contentsSplit[l], newFind.PostString) && l > 0 {
			newFind.BlankString = contentsSplit[l-1]
			if l > 1 {
				newFind.PreString = contentsSplit[l-2]
			}
			newFind.BlankFinish = true
		}
	}
}

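// SplitByPunctuation splits s on punctuation marks. It returns the clauses
// and, separately, the punctuation marks in the order they were found.
// Requires the standard-library packages "regexp" and "strings".
func SplitByPunctuation(s string) ([]string, []string) {
	regPunctuation := regexp.MustCompile(`[,，。.?？！!;；：:]`)
	// Record the punctuation marks first, then split the string on them.
	toPun := regPunctuation.FindAllString(s, -1)
	result := regPunctuation.Split(s, -1)

	// Text that ends with punctuation leaves an empty last element; drop it.
	if len(result) > 0 && len(result[len(result)-1]) == 0 {
		result = result[:len(result)-1]
	}

	// Trim surrounding whitespace and strip quotation marks
	// (curly quotes plus the ASCII apostrophe).
	regQuoting := regexp.MustCompile(`[“”‘’']`)
	for i := range result {
		result[i] = strings.TrimSpace(result[i])
		result[i] = regQuoting.ReplaceAllString(result[i], "")
	}
	return result, toPun
}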
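To show how the lookup and the "wait for a neighbor" rule could fit together, here is a minimal self-contained sketch. It is not the article's implementation: FillBlanks, splitClauses and questionClauses are names invented for this example, a blank is assumed to be written as an underscore, and exact string equality stands in for the article's isEqual, whose definition is not shown.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var punct = regexp.MustCompile(`[,，。.?？！!;；：:]`)

// splitClauses splits text on punctuation and drops empty pieces.
func splitClauses(s string) []string {
	var out []string
	for _, p := range punct.Split(s, -1) {
		if p = strings.TrimSpace(p); p != "" {
			out = append(out, p)
		}
	}
	return out
}

// questionClauses is like splitClauses but turns "_" into "" to mark a blank.
func questionClauses(q string) []string {
	cs := splitClauses(q)
	for i, c := range cs {
		if c == "_" {
			cs[i] = ""
		}
	}
	return cs
}

// FillBlanks fills the blanks ("" entries) of question in place, using the
// clauses of the poem. It keeps passing over the question until no blank can
// be filled any more, so answers spread outward from the hint clauses.
// It returns false if some blank is still unresolved.
func FillBlanks(question, poem []string) bool {
	index := make(map[string]int, len(poem))
	for i, c := range poem {
		index[c] = i // last occurrence wins
	}
	for progress := true; progress; {
		progress = false
		for i, c := range question {
			if c != "" {
				continue
			}
			// A known clause on the left: the answer is the clause after it.
			if i > 0 && question[i-1] != "" {
				if p, ok := index[question[i-1]]; ok && p+1 < len(poem) {
					question[i] = poem[p+1]
					progress = true
					continue
				}
			}
			// A known clause on the right: the answer is the clause before it.
			if i+1 < len(question) && question[i+1] != "" {
				if p, ok := index[question[i+1]]; ok && p > 0 {
					question[i] = poem[p-1]
					progress = true
				}
			}
		}
	}
	for _, c := range question {
		if c == "" {
			return false
		}
	}
	return true
}

func main() {
	poem := splitClauses("谁家玉笛暗飞声，散入春风满洛城。此夜曲中闻折柳，何人不起故园情。")
	question := questionClauses("_，_，何人不起故园情。")
	if FillBlanks(question, poem) {
		fmt.Println(strings.Join(question, "，"))
	} else {
		fmt.Println("some blanks could not be resolved")
	}
}
```

In this example the second blank is self-resolving because a hint follows it; the first blank is filled only on the next pass, once its right-hand neighbor is known.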


		
				