
Enhanced SAX Handler: A SAX Handler Even Easier Than DOM

2012-11-03 


Fast and simple are not inherently incompatible. On the contrary, simple things ought to be fast.


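While parsing XML with SAX, I ran into the following problems:

    - A SAX Handler is not as fast as expected, especially on large files.
    - A SAX Handler is error-prone to write: it has to tell different elements apart, and it takes many checks to reach the information you want.
    - There is no uniform way to retrieve the information a SAX Handler has parsed.

These problems reflect three capabilities missing from the current SAX Handler interface: Stoppable, Subscribable, and Reportable.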

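1. Stoppable
By default a SAX Parser parses the whole file. Even once you have all the information you want, parsing does not stop, which is why a SAX Parser feels slow on large files.

Only an exception can stop a SAX Parser from continuing, so the solution is simple: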

a). Define the interface:

public interface Stoppable {

    // Returns true once this handler has collected everything it needs.
    boolean canStop();

}

b). Default implementation:

public abstract class EnhancedHandler extends DefaultHandler implements Stoppable {

    private boolean canStop;

    public boolean canStop() { return canStop; }

    // Subclasses call this method once they have enough information.
    protected void stop() { canStop = true; }

}

c). Throw an exception if and only if every SAX Handler can stop:

public class CompositeEnhancedHandler extends DefaultHandler {

    // One shared instance; ShouldStopParsingException is a RuntimeException subclass (definition not shown here).
    private static final RuntimeException SHOULD_STOP_EXCEPTION = new ShouldStopParsingException();

    private final EnhancedHandler[] handlers;

    public CompositeEnhancedHandler(EnhancedHandler... handlers) {
        this.handlers = handlers;
    }

    // Each callback forwards the event to every handler, then checks whether parsing can stop.
    public void characters(char[] ch, int start, int length) throws SAXException {
        for (EnhancedHandler handler : handlers) { handler.characters(ch, start, length); }
        throwExceptionIfCanStop();
    }

    public void endElement(String uri, String localName, String qName) throws SAXException {
        for (EnhancedHandler handler : handlers) { handler.endElement(uri, localName, qName); }
        throwExceptionIfCanStop();
    }

    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        for (EnhancedHandler handler : handlers) { handler.startElement(uri, localName, qName, attributes); }
        throwExceptionIfCanStop();
    }

    // Returns normally while any handler still needs data; throws only when all of them can stop.
    private void throwExceptionIfCanStop() {
        for (EnhancedHandler handler : handlers) { if (!handler.canStop()) { return; } }
        throw SHOULD_STOP_EXCEPTION;
    }

}

d). Catch the exception around the SAX Parser call:

CompositeEnhancedHandler handler = new CompositeEnhancedHandler(new Handler1(), new Handler2());

try {
    SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
    saxParser.parse(new File("england.xml"), handler);
} catch (ShouldStopParsingException se) {
    // All handlers got enough information, just stop parsing.
}
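The stop-by-exception idea can be tried in isolation. The following is a minimal sketch, not the ESAX code: `StopEarlyDemo`, `StopParsing`, and `FirstTeamHandler` are hypothetical names, the XML is made up, and it assumes the JDK's built-in parser, which lets an unchecked exception thrown from a handler callback propagate out of `parse`.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class StopEarlyDemo {

    // Hypothetical marker exception (stands in for the article's ShouldStopParsingException).
    static class StopParsing extends RuntimeException {
        StopParsing() {
            super(null, null, false, false); // no stack trace needed: it is only a signal
        }
    }

    // Records the name of the first <team> element, then aborts the parse.
    static class FirstTeamHandler extends DefaultHandler {
        int elementsSeen = 0;
        String firstTeam;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            elementsSeen++;
            if ("team".equals(qName)) {
                firstTeam = attributes.getValue("name");
                throw new StopParsing(); // enough information: stop here
            }
        }
    }

    static FirstTeamHandler parse(String xml) throws Exception {
        FirstTeamHandler handler = new FirstTeamHandler();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (StopParsing expected) {
            // The handler asked to stop; the rest of the document is never read.
        }
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<league><team name=\"Arsenal\"/><team name=\"Chelsea\"/><team name=\"Everton\"/></league>";
        FirstTeamHandler h = parse(xml);
        System.out.println(h.firstTeam + " after " + h.elementsSeen + " elements");
    }
}
```

Only two start-element events reach the handler here (`league` and the first `team`), which is the effect the Stoppable interface is after on large files.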


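2. Subscribable
Because there is no way to say that a handler only cares about particular elements, SAX Handlers are hard to write and error-prone. Each one has to check the current element's name, track whether it is inside the element it cares about, and so on, so every Handler repeats the same kind of logic.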
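To make the repetition concrete, here is a minimal sketch of a plain `DefaultHandler` that only wants the text of `<title>` elements. The class name `NaiveTitleHandler` and the element name are hypothetical and chosen for illustration; the point is the name checks and the state flag that every such handler ends up carrying.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example: collects the text of every <title> element.
class NaiveTitleHandler extends DefaultHandler {
    private boolean inTitle;                                 // "am I inside the element I care about?"
    private final StringBuilder text = new StringBuilder();
    final List<String> titles = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("title".equals(qName)) {                         // name check on the way in
            inTitle = true;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inTitle) {                                       // state check on every text event
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName)) {                         // name check on the way out
            titles.add(text.toString());
            inTitle = false;
        }
    }

    // Parses an in-memory document and returns the collected titles.
    static List<String> parse(String xml) throws Exception {
        NaiveTitleHandler handler = new NaiveTitleHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.titles;
    }
}
```

Every additional element of interest adds another branch and another flag to each of the three callbacks.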

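The solution is an extra intermediate layer that asks each SAX Handler which element it is interested in, and then forwards to each SAX Handler only the events for that element. (Having each SAX Handler register its interests with the intermediate layer would also work, but it is more complicated; ESAX uses the former approach.)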

a). Define the interface:

public interface Subscribable {

    // Returns the qualified name of the element this handler wants to receive.
    String subscribe();

}


b). The intermediate layer, CompositeEnhancedHandler:

public class CompositeEnhancedHandler extends DefaultHandler {

    // Maps an element name to the handlers subscribed to it.
    private final AddableMap mapping = new AddableMap();

    // Handlers subscribed to the most recently started element.
    private List<EnhancedHandler> currentHandlers;

    public CompositeEnhancedHandler(EnhancedHandler... handlers) {
        ... ...
        for (EnhancedHandler handler : handlers) { mapping.get(handler.subscribe()).add(handler); }
    }

    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        currentHandlers = mapping.get(qName);
        for (EnhancedHandler handler : currentHandlers) { handler.startElement(uri, localName, qName, attributes); }
        ... ...
    }

    public void characters(char[] ch, int start, int length) throws SAXException {
        for (EnhancedHandler handler : currentHandlers) { handler.characters(ch, start, length); }
        ... ...
    }

    public void endElement(String uri, String localName, String qName) throws SAXException {
        for (EnhancedHandler handler : currentHandlers) { handler.endElement(uri, localName, qName); }
        ... ...
    }

    // A map whose get() never returns null: a missing key yields a new, empty, stored list.
    private static class AddableMap {

        private Map<String, List<EnhancedHandler>> container = new HashMap<String, List<EnhancedHandler>>();

        public List<EnhancedHandler> get(String qname) {
            if (!container.containsKey(qname)) { container.put(qname, new ArrayList<EnhancedHandler>()); }
            return container.get(qname);
        }

    }

}
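The routing idea can be exercised end to end with a small self-contained sketch. This is not the ESAX implementation: `SubscriptionDemo`, `TextCollector`, and `Dispatcher` are hypothetical names, and the dispatcher here keeps a stack of open elements (an assumption on my part, so that text and end events go to the subscribers of the innermost open element). Note that the subscriber itself contains no element-name checks.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SubscriptionDemo {

    // Hypothetical subscriber: names one element and collects its text content.
    static class TextCollector extends DefaultHandler {
        private final String element;
        private final StringBuilder current = new StringBuilder();
        final List<String> values = new ArrayList<>();

        TextCollector(String element) { this.element = element; }

        String subscribe() { return element; }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            current.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            current.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            values.add(current.toString());
        }
    }

    // Hypothetical intermediate layer: forwards events only to the subscribers of the element.
    static class Dispatcher extends DefaultHandler {
        private final Map<String, List<TextCollector>> subscribers = new HashMap<>();
        private final Deque<String> openElements = new ArrayDeque<>();

        Dispatcher(TextCollector... handlers) {
            for (TextCollector handler : handlers) {
                subscribers.computeIfAbsent(handler.subscribe(), k -> new ArrayList<>()).add(handler);
            }
        }

        private List<TextCollector> subscribersOf(String qName) {
            return subscribers.getOrDefault(qName, Collections.emptyList());
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            openElements.push(qName);
            for (TextCollector h : subscribersOf(qName)) { h.startElement(uri, localName, qName, attributes); }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (openElements.isEmpty()) { return; }
            for (TextCollector h : subscribersOf(openElements.peek())) { h.characters(ch, start, length); }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            openElements.pop();
            for (TextCollector h : subscribersOf(qName)) { h.endElement(uri, localName, qName); }
        }
    }

    static void parse(String xml, DefaultHandler handler) throws Exception {
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
    }

    public static void main(String[] args) throws Exception {
        TextCollector home = new TextCollector("home");
        TextCollector score = new TextCollector("score");
        parse("<match><home>Arsenal</home><away>Chelsea</away><score>2-1</score></match>",
                new Dispatcher(home, score));
        System.out.println(home.values + " " + score.values);
    }
}
```

In this sketch a subscriber does not see text that belongs to nested child elements; whether that is desirable depends on the document being parsed.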


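3. Reportable
DOM offers convenient methods for extracting specific information, but SAX Handlers lack this ability: the information of interest stays hidden inside each Handler.

ESAX solves this with the "Collecting Parameter" pattern.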

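a). Define the interface:

public interface Reportable {

    // Each handler writes its findings into the shared result map passed in by the caller.
    void report(Map resultSet);

}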

b). Default support:

public abstract class EnhancedHandler extends DefaultHandler implements Reportable, Stoppable, Subscribable {

    ... ...

}

public class CompositeEnhancedHandler extends DefaultHandler implements Reportable {

    // Passes the same map to every handler, so all results end up in one place.
    public void report(Map resultSet) {
        for (EnhancedHandler handler : handlers) { handler.report(resultSet); }
    }

}
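As a rough illustration of the Collecting Parameter pattern, the sketch below lets two handlers write into one map that the caller owns. The names `ReportDemo`, `CountingHandler`, and `Composite`, the `count:` key prefix, and the sample XML are all assumptions made for this example; the typed `Map<String, Object>` is likewise a choice of the sketch, not something the article specifies.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class ReportDemo {

    interface Reportable {
        void report(Map<String, Object> resultSet);
    }

    // Hypothetical handler: counts elements with one name and reports the count.
    static class CountingHandler extends DefaultHandler implements Reportable {
        private final String element;
        private int count;

        CountingHandler(String element) { this.element = element; }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if (element.equals(qName)) { count++; }
        }

        @Override
        public void report(Map<String, Object> resultSet) {
            resultSet.put("count:" + element, count); // write into the caller's map
        }
    }

    // Forwards start-element events and the report call to every wrapped handler.
    static class Composite extends DefaultHandler implements Reportable {
        private final CountingHandler[] handlers;

        Composite(CountingHandler... handlers) { this.handlers = handlers; }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            for (CountingHandler h : handlers) { h.startElement(uri, localName, qName, attributes); }
        }

        @Override
        public void report(Map<String, Object> resultSet) {
            for (CountingHandler h : handlers) { h.report(resultSet); }
        }
    }

    static Map<String, Object> parse(String xml) throws Exception {
        Composite composite = new Composite(new CountingHandler("team"), new CountingHandler("match"));
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), composite);
        Map<String, Object> results = new LinkedHashMap<>();
        composite.report(results); // one call gathers everything the handlers found
        return results;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("<cup><team/><team/><match/></cup>"));
    }
}
```

The caller never needs to know which handler produced which entry; it only reads the map after parsing.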



In the end, ESAX gives the original SAX Handler the three capabilities it lacked: it can be stopped, it can subscribe, and it can report. The result is faster than a plain SAX Handler, and simpler and easier to program against than the DOM interface.

A simple example:

http://jade-stone-suite.googlecode.com/svn/trunk/JS.ESax/test/jade/stone/esax/sample/FACupHandler.java

Test cases:

http://jade-stone-suite.googlecode.com/svn/trunk/JS.ESax/test/jade/stone/esax/test/CompositeEnhancedHandlerTest.java

The final default implementations:

http://jade-stone-suite.googlecode.com/svn/trunk/JS.ESax/src/jade/stone/esax/support/EnhancedHandler.java

http://jade-stone-suite.googlecode.com/svn/trunk/JS.ESax/src/jade/stone/esax/support/CompositeEnhancedHandler.java

Project home page:

http://jade-stone-suite.googlecode.com/svn/trunk/JS.ESax/
