Parsing XML with REXML
1 Tree Parsing (DOM-like)
We need to require the rexml/document library and include the REXML module:
require 'rexml/document'
include REXML
input = File.new("books.xml")
doc = Document.new(input)
root = doc.root
puts root.attributes["shelf"] # Recent Acquisitions
doc.elements.each("library/section") { |e| puts e.attributes["name"] }
# Output:
# Ruby
# Space
doc.elements.each("*/section/book") { |e| puts e.attributes["isbn"] }
# Output:
# 0672328844
# 0321445619
# 0684835509
# 074325631X
sec2 = root.elements[2]
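author = sec2.elements[1].elements["author"].text # Robert Zubrin
Note that an element's XML attributes and their values are represented as a hash, so we can extract the value we need with attributes[]. Child elements can be retrieved either with a path-like string or with an integer. Integer indexes are 1-based, not 0-based.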
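The attribute hash and the 1-based integer indexing can be tried without a file on disk. The following is a minimal sketch: the inline XML is a hypothetical, cut-down stand-in for books.xml (whose real contents are not shown here), and the variable names are illustrative only.

```ruby
require 'rexml/document'

# Hypothetical sample data; a reduced stand-in for books.xml.
xml = <<~XML
  <library shelf="Recent Acquisitions">
    <section name="Ruby">
      <book isbn="0672328844"><author>Hal Fulton</author></book>
    </section>
    <section name="Space">
      <book isbn="0684835509"><author>Robert Zubrin</author></book>
    </section>
  </library>
XML

doc  = REXML::Document.new(xml)
root = doc.root

# Attributes are looked up by name, hash-style.
shelf = root.attributes["shelf"]

# Integer indexes are 1-based, so 2 selects the second <section>.
second_section = root.elements[2]
section_name   = second_section.attributes["name"]

# A path-like string selects the first matching descendant.
author = second_section.elements["book/author"].text

puts shelf, section_name, author
```

Passing a String to Document.new instead of a File keeps the example self-contained. The lookups are the same as in the file-based code above.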
2 Stream Parsing (SAX-like)
The trick here is to define a listener class whose methods are called back as the parser works through the document:
require 'rexml/document'
require 'rexml/streamlistener'
include REXML
class MyListener
  include REXML::StreamListener
  # Called for each opening tag with the tag name and its attributes.
  def tag_start(*args)
    puts "tag_start: #{args.map {|x| x.inspect}.join(', ')}"
  end
  # Called for each run of character data.
  def text(data)
    return if data =~ /^\s*$/ # whitespace only
    abbrev = data[0..40] + (data.length > 40 ? "..." : "")
    puts " text : #{abbrev.inspect}"
  end
end
list = MyListener.new
source = File.new "books.xml"
Document.parse_stream(source, list)
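To see the order in which the callbacks fire, it can help to record the events rather than print them. The sketch below assumes a hypothetical CollectingListener class and a small inline XML string instead of books.xml. It also adds a tag_end callback, which REXML::StreamListener provides but the listener above does not override.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener that stores each callback as an array entry.
class CollectingListener
  include REXML::StreamListener

  attr_reader :events

  def initialize
    @events = []
  end

  # Opening tag: name plus its attributes.
  def tag_start(name, attrs)
    @events << [:start, name, attrs]
  end

  # Closing tag.
  def tag_end(name)
    @events << [:end, name]
  end

  # Character data; whitespace-only runs are skipped.
  def text(data)
    return if data =~ /\A\s*\z/
    @events << [:text, data]
  end
end

# Hypothetical sample data, not the real books.xml.
xml = '<section name="Space"><book isbn="0684835509">' \
      '<author>Robert Zubrin</author></book></section>'

listener = CollectingListener.new
REXML::Document.parse_stream(xml, listener)

started = listener.events.select { |e| e[0] == :start }.map { |e| e[1] }
texts   = listener.events.select { |e| e[0] == :text }.map { |e| e[1] }

listener.events.each { |event| p event }
```

Unlike tree parsing, nothing is kept in memory unless the listener stores it. This is why the stream style suits large documents where only a few values are needed.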