HTML Parsing Made Simple: No Third-Party Dependencies, Just Two Methods
2016-07-11
Caiflower
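By the end of this post you will see that parsing HTML can be quite simple.
In my project the backend returns its data as HTML, which was painful to work with. I spent a lot of time going through references, and every approach I found seemed cumbersome. After some experimenting, I noticed that HTML data usually follows a recognizable pattern. So here is an uncommon but very simple and easy-to-follow approach: string slicing.
Without further ado, here is the code.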
// Header file
@interface GKTopic : NSObject
/// Topic ID
@property (nonatomic, copy) NSString *id;
/// Topic title
@property (nonatomic, copy) NSString *title;
/// Author of the topic
@property (nonatomic, copy) NSString *author;
/// Avatar URL
@property (nonatomic, copy) NSString *avatarImageUrl;
+ (NSArray *)topics;
@end
// Implementation file
+ (NSArray *)topics {
    // Load the HTML
    NSString *html = [NSString stringWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"v2ex" ofType:@"html"] encoding:NSUTF8StringEncoding error:nil];
    NSMutableArray *topics = [NSMutableArray array];
    // If the file could not be read, stop here: messaging a nil string returns a zeroed range rather than NSNotFound, so the loop below would never end
    if (html == nil) {
        return topics;
    }
    // Marker where each slice begins
    NSString *matchingBegin = @"cell from_"; // You have to read the HTML source to find this pattern yourself; the same goes for mathcingEnd
    // Marker where each slice ends
    NSString *mathcingEnd = @"</div>";
    NSRange lastRange = NSMakeRange(0, 0);
    // Slice repeatedly until no further begin marker is found
    while ((lastRange = [html rangeOfString:matchingBegin options:0 range:NSMakeRange(lastRange.location, html.length - lastRange.location)]).location != NSNotFound) {
        NSRange endRange = [html rangeOfString:mathcingEnd options:0 range:NSMakeRange(lastRange.location, html.length - lastRange.location)];
        if (endRange.location != NSNotFound) {
            // Take the string between the two markers
            NSString *topicString = [html substringWithRange:NSMakeRange(lastRange.location, endRange.location - lastRange.location)];
            // Extract the fields from the tags
            GKTopic *topic = [self topicWithString:topicString];
            [topics addObject:topic];
            // Continue searching from the end marker
            lastRange = endRange;
        } else {
            break;
        }
    }
    return topics;
}
+ (GKTopic *)topicWithString:(NSString *)string {
    GKTopic *topic = [[GKTopic alloc] init];
    // Find the author of the topic
    topic.author = [string gk_rangeFromeStartString:@"<a href=\"/member/" toEndString:@"\">"];
    // Find the URL of the user's avatar
    topic.avatarImageUrl = [string gk_rangeFromeStartString:@"<img src=\"" toEndString:@"\" class=\"avatar\""];
    // Find the topic ID, e.g. for <a href="/t/291493"> the ID is 291493
    topic.id = [string gk_rangeFromeStartString:@"<a href=\"/t/" toEndString:@"\">"];
    // Find the topic title, which follows the link that contains the ID
    NSString *fromStr = [NSString stringWithFormat:@"t/%@\">", topic.id];
    topic.title = [string gk_rangeFromeStartString:fromStr toEndString:@"</a>"];
    return topic;
}
The NSString category method used above:
- (NSString *)gk_rangeFromeStartString:(NSString *)startString toEndString:(NSString *)endString
{
    // Find the start marker; without it there is nothing to extract
    NSRange range = [self rangeOfString:startString];
    if (range.location == NSNotFound) {
        return nil;
    }
    // Keep everything after the start marker
    NSString *string = [self substringFromIndex:range.location + range.length];
    // Cut at the end marker; if it is missing, the rest of the string is returned
    range = [string rangeOfString:endString];
    if (range.location != NSNotFound) {
        string = [string substringToIndex:range.location];
    }
    return string;
}
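To make the slicing concrete, here is a minimal sketch applied to a small fragment. `GKSliceBetween` and `GKSampleCell` are hypothetical names that are not part of the demo project, and the fragment is made up for illustration, so real V2EX markup may differ. Unlike the category method above, this variant returns nil when either marker is missing.

```objc
#import <Foundation/Foundation.h>

// Hypothetical strict variant of the category method above, written as a
// plain function: returns nil when either marker is missing.
static NSString *GKSliceBetween(NSString *source, NSString *start, NSString *end) {
    if (source == nil || start.length == 0 || end.length == 0) {
        return nil;
    }
    NSRange startRange = [source rangeOfString:start];
    if (startRange.location == NSNotFound) {
        return nil;
    }
    // Search for the end marker only after the start marker
    NSUInteger from = NSMaxRange(startRange);
    NSRange endRange = [source rangeOfString:end
                                     options:0
                                       range:NSMakeRange(from, source.length - from)];
    if (endRange.location == NSNotFound) {
        return nil;
    }
    return [source substringWithRange:NSMakeRange(from, endRange.location - from)];
}

// Made-up fragment for illustration only; it is not real V2EX markup.
static NSString *GKSampleCell(void) {
    return @"cell from_1\"><a href=\"/member/alice\">"
           @"<img src=\"//example.com/a.png\" class=\"avatar\"></a>"
           @"<a href=\"/t/291493\">Hello</a>";
}
```

With this fragment, slicing from `<a href="/member/` to `">` gives `alice`, slicing from `<a href="/t/` to `">` gives `291493`, and slicing from `t/291493">` to `</a>` gives `Hello`. Only the first occurrence of each marker is used, so the start marker has to be specific enough to land on the right tag.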
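Only a few fields are extracted here; you can try the rest yourself. The method above that returns the array can easily be generalized, for example: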
/**
 * @param beginString marker where each slice begins
 * @param endString marker where each slice ends
 * @return array of model objects
 */
+ (NSArray *)topicsWithBeginString:(NSString *)beginString endString:(NSString *)endString;
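The method name may not be entirely conventional, so name it however you like. This is only meant to convey the idea.
That is about it. If anything here is incorrect, corrections are welcome.
Finally, here is the demo: https://github.com/ChrisCaixx/HtmlToObject
If you find it useful, feel free to give it a star. Thanks!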
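One possible shape for the generalized version is sketched below. It is an assumption rather than the demo project's code: `GKFragmentsBetween` is a hypothetical plain function, and it returns the raw fragments instead of model objects, so that mapping each fragment to a model stays with the caller.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the demo project): collects every fragment
// that starts at `begin` and stops just before the next `end`.
static NSArray<NSString *> *GKFragmentsBetween(NSString *html, NSString *begin, NSString *end) {
    NSMutableArray<NSString *> *fragments = [NSMutableArray array];
    // A nil or empty input yields an empty result
    if (html.length == 0 || begin.length == 0 || end.length == 0) {
        return fragments;
    }
    NSUInteger cursor = 0;
    while (cursor < html.length) {
        NSRange rest = NSMakeRange(cursor, html.length - cursor);
        NSRange beginRange = [html rangeOfString:begin options:0 range:rest];
        if (beginRange.location == NSNotFound) {
            break;
        }
        // Look for the end marker only after the begin marker
        NSUInteger afterBegin = NSMaxRange(beginRange);
        NSRange endRange = [html rangeOfString:end
                                       options:0
                                         range:NSMakeRange(afterBegin, html.length - afterBegin)];
        if (endRange.location == NSNotFound) {
            break;
        }
        // Like the loop in +topics, the fragment keeps the begin marker
        NSRange fragment = NSMakeRange(beginRange.location,
                                       endRange.location - beginRange.location);
        [fragments addObject:[html substringWithRange:fragment]];
        // Resume the search after the end marker
        cursor = NSMaxRange(endRange);
    }
    return fragments;
}
```

Each returned fragment can then be passed to a method such as `topicWithString:` to build the model. Keeping the slicing separate from the model mapping means the same function can serve other pages that use different markers.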