用lisp来让计算机学会写作
大部分的代码、思路参考了《Ansi Common Lisp》P138~P141。
问题:给一篇英文文本,如何让计算机依据此文本而生成随机但可读的文本。如:
|Venture|
The National Venture Capital Association estimates that wealth associated with a deal a big spending by regulations that will spend one another's main reason these projects .
这是计算机学习了Paul Graham的一些文章后生成的随机文本。它根据Venture这个单词向两边延伸成一个句子。令人惊喜的是,文本常常是可读的。
算法:记录每个单词后面出现的单词以及出现的次数,如I leave在原文中出现了5次,I want出现了3次,除此之外,其它地方没有出现过I,所以在生成随机文章的时候,当遇到I,有5/8的概率选择leave为下一个单词。假如选择了leave的话,则看看leave后面出现过哪些单词,重复以上过程。
现用lisp来解决问题。
lisp里的符号类型,即symbol,可以很好记录各种字符串还有标点符号,所以采用它来记录。采用内附的hashtable来建立列表:
(defparameter *words* (make-hash-table :size 10000))
那如何建立列表呢?
(defparameter *words* (make-hash-table :size 10000))(defconstant maxword 100)(defparameter nwords 0)(defconstant debug nil)(let ((prev '|.|)) (defun see (sym) (incf nwords) (let ((pair (assoc sym (gethash prev *words*)))) (if pair (incf (cdr pair)) (push (cons sym 1) (gethash prev *words*)))) (setf prev sym)))(defun check-punc (c);char to symbol (case c (#\. '|.|) (#\, '|,|) (#\; '|;|) (#\? '|?|) (#\: '|:|) (#\! '|!|)))(defun read-text (pathname) (with-open-file (str pathname :direction :input) (let ((buf (make-string maxword)) (pos 0)) (do ((c (read-char str nil 'eof) (read-char str nil 'eof))) ((eql c 'eof))(if (or (alpha-char-p c)(eql c #\')) (progn (setf (char buf pos) c) (incf pos)) (progn (unless (zerop pos)(see (intern (subseq buf 0 pos)))(setf pos 0)) (let ((punc (check-punc c)))(if punc (see punc)))))))))(defun print-ht (ht) (maphash #'(lambda (k v)(format t "~A ~A~%" k v)) ht))(defparameter *r-words* (make-hash-table :size 10000))(defun push-words (w1 w2 n) (push (cons w2 n) (gethash w1 *r-words*)))(defun get-reversed-words ();a cat -> cat a (maphash #'(lambda (k lst) (dolist (pair lst) (push-words (car pair) k (cdr pair)))) *words*))(defun print-a-word (word ht) (maphash #'(lambda (k lst) (if (eql k word) (format t "~A ~A~%" k lst))) ht))(if debug (print-a-word '|leave| *r-words*))(defun punc-p (sym);symbol to char,nil when fails. (check-punc (char (symbol-name sym) 0)))(defun random-word (word ht) (let* ((choices (gethash word ht)) (x (random (reduce #'+ choices :key #'cdr)))) (dolist (pair choices) (decf x (cdr pair)) (if (minusp x) (return (car pair))))))(defun gen-former (word str) (let ((last (random-word word *r-words*))) (if (not (punc-p last))(progn (gen-former last str) (format str "~A " last)))))(defun gen-latter (word str) (let ((next (random-word word *words*))) (format str "~A " next) (if (not (punc-p next)) (gen-latter next str))));(gen-latter '|leave| t)(defun get-a-word (ht);get a random word (let ((x (random nwords))) (maphash #'(lambda (k v) (dolist (pair v) (decf x (cdr pair)) (if (minusp x) (return-from get-a-word (car pair))))) ht)));(get-a-word *words*)(defun gen-sentence (word str) (gen-former word str) (format str "~A " word) (gen-latter word str))(defun test () (setf nwords 0) (read-text "essay.txt") (get-reversed-words) (let ((word (get-a-word *words*))) (print word) (gen-sentence word t)))(test)