11 changes: 11 additions & 0 deletions NEWS
@@ -1,5 +1,16 @@
 Release 0.3 (not yet released)

+* A new protocol enables clients to find dictionary entries that are similar to
+a given string or corrections for a given misspelled word.
+
+The following new functions provide increasingly abstract functionality for
+enumerating similar words and corrections: SPELL:MAP-SIMILAR,
+SPELL:MAP-CORRECTIONS and SPELL:CORRECTIONS.
+
+For convenience, the function SPELL:ENGLISH-CORRECTIONS automatically uses
+the English dictionary and considers the appropriate case variants of the
+supplied string.
+
 * Documentation is now available in the documentation directory.

 * The new function MAP-ENTRIES calls a supplied function for each entry in a
61 changes: 45 additions & 16 deletions README.org
@@ -2,36 +2,40 @@

 * Introduction

-SPELL is a spellchecking library for Common Lisp.
+SPELL is a spellchecking library for Common Lisp. It is made
+available under the BSD license.

-It is made available under the BSD license.
-
-Loading the ~spell~ system may initially take up to between, say, 3
-and 60 seconds (depending on the machine and CL implementation) as
-an English dictionary is loaded and compiled into the resulting FASL
-file.
+Loading (possibly after compiling) the ~spell~ system may initially
+take between something like 3 and 60 seconds, depending on the
+machine and CL implementation, as an English dictionary is loaded,
+optimized and compiled into a FASL file. Subsequent load operations
+(without compiling) should finish within well below one second.

 For loading the ~spell~ system, use

 #+begin_src lisp
 (ql:quickload "spell")
 #+end_src

-Currently the only exported functions are ~spell:english-lookup~
-that accepts a string, and ~spell:english-check-paragraph~ that
-checks a whole paragraph of text and returns a list of conses. Each
-cons represents a single word in the paragraph which has failed
-dictionary lookup, with the ~car~ and ~cdr~ being offsets in the
-original string outlining the word.
+This document gives only a very brief overview and highlights some
+features. Proper documentation can be found in the
+file:documentation directory.
+
+* Looking up Words
+
+The exported functions for looking up words are
+~spell:english-lookup~ which accepts a string, and
+~spell:english-check-paragraph~ which checks a whole paragraph of
+text and returns a list of conses:

 #+begin_src lisp :exports both
 (spell:english-lookup "horse")
 #+end_src

 #+RESULTS:
 #+begin_example
-(#<EXPLICIT-BASE-VERB #1="horse" person:ANY number:ANY tense:NIL negative:NIL contraction:NIL strength:WEAK infinitive:SELF {1001089AA3}>
-#<EXPLICIT-BASE-NOUN #1# number:SINGULAR case:NIL gender:NIL {1001089A73}>)
+(#<EXPLICIT-BASE-VERB "horse" person:ANY number:ANY tense:NIL negative:NIL contraction:NIL strength:WEAK infinitive:SELF {1001089AA3}>
+#<EXPLICIT-BASE-NOUN "horse" number:SINGULAR case:NIL gender:NIL {1001089A73}>)
 #+end_example

 #+begin_src lisp :exports both
@@ -44,6 +48,31 @@
 ((22 . 25) (47 . 50) (51 . 56))
 #+end_example

+Each cons represents a single word in the paragraph which has failed
+dictionary lookup, with the ~car~ and ~cdr~ being offsets in the
+original string outlining the word.
+
+* Obtaining Corrections
+
+The SPELL library exports a few functions that obtain similar words
+for a given word or corrections for a misspelled word. The most
+convenient function is ~spell:english-corrections~ which returns a
+list of corrections for a (possibly) misspelled English word;
+
+#+begin_src lisp :exports both :results value verbatim
+(spell:english-corrections "lifp" :threshold 1)
+#+end_src
+
+#+RESULTS:
+#+begin_example
+("lift" "life" "lip" "lisp" "limp")
+NIL
+#+end_example
+
+For more ways to use this function as well as the information about
+the lower-level functions for similar words and corrections, see the
+main documentation.
+
 * Backward Compatibility Notice

 The SPELL library provides the ~spell/simple~ ASDF system. The
@@ -61,5 +90,5 @@
 #+end_src

 # Local Variables:
-# eval: (load-library 'ob-lisp)
+# eval: (load-library "ob-lisp")
 # End:
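
The README text in this diff describes the result of `spell:english-check-paragraph` as a list of conses holding offsets into the original string. A minimal sketch of turning such a list back into the words it outlines might look like the following. The helper `misspelled-words`, the sample string, and the sample offsets are all hypothetical and not part of SPELL; the end offset is treated as exclusive, which matches the `(subseq string word-start word-end)` call visible in the `code/english.lisp` hunk.

```lisp
;; Hypothetical helper, not part of SPELL: map (START . END) conses back
;; to the substrings they outline.  END is treated as exclusive.
(defun misspelled-words (string offsets)
  (loop for (start . end) in offsets
        collect (subseq string start end)))

;; Hand-written sample data standing in for the result of a real call
;; to spell:english-check-paragraph.
(misspelled-words "The quick brwn fox" '((10 . 14)))
;; => ("brwn")
```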
5 changes: 4 additions & 1 deletion code/compact-trie.lisp
@@ -274,7 +274,7 @@
   ;; Now that children and compact entries have been handled, if NODE
   ;; is also a leaf, call the next method which is the one specialized
   ;; to `leaf-mixin' to handle the non-compact entries.
-  (when (typep node 'leaf-mixin)
+  (when (leafp node)
     (call-next-method)))

 (defmethod compact-node-slots append ((node raw-interior-mixin) (depth integer))
@@ -323,6 +323,9 @@

 (defclass compact-interior-node (compact-interior-mixin compact-node) ())

+(defmethod leafp ((node compact-interior-node))
+  nil)
+
 (defmethod node-lookup ((function function)
                         (string string)
                         (suffix (eql 0))
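
The first hunk of this file replaces a `typep` test with a call to the `leafp` generic function, and the second makes `compact-interior-node` answer `nil`. A minimal sketch of that predicate-protocol pattern follows, using simplified stand-in classes. Every `demo-` name is hypothetical, and how SPELL defines the true case of `leafp` is not shown in this diff, so the method on `demo-leaf-mixin` is an assumption.

```lisp
(defgeneric demo-leafp (node)
  (:documentation "Return true if NODE stores entries itself."))

;; Hypothetical stand-ins for the trie node classes.
(defclass demo-node () ())
(defclass demo-leaf-mixin () ())
(defclass demo-leaf-node (demo-leaf-mixin demo-node) ())
(defclass demo-interior-node (demo-node) ())

;; Assumed: classes that mix in the leaf behavior answer true.
(defmethod demo-leafp ((node demo-leaf-mixin))
  t)

;; Mirrors the shape of the methods added in this diff.
(defmethod demo-leafp ((node demo-interior-node))
  nil)

(list (demo-leafp (make-instance 'demo-leaf-node))
      (demo-leafp (make-instance 'demo-interior-node)))
;; => (T NIL)
```

Compared with a `typep` check against a mixin, a predicate lets each class decide its own answer; whether that is the motivation for this change is not stated in the diff.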
58 changes: 38 additions & 20 deletions code/english.lisp
@@ -32,27 +32,37 @@

 (defparameter *english-dictionary* #.(load-english-dictionary))

+;;; Query functions
+
+(declaim (inline map-english-case-variants))
+(defun map-english-case-variants (function word)
+  (let ((function (a:ensure-function function)))
+    (funcall function word)
+    (when (plusp (length word))
+      (let* ((initial (aref word 0))
+             (downcased (char-downcase initial)))
+        (unless (char= initial downcased)
+          ;; We change, for example, "Anti-Semitic" at the beginning
+          ;; of a sentence to "anti-Semitic" which is in the
+          ;; dictionary.
+          (let ((decapitalized (copy-seq word)))
+            (setf (aref decapitalized 0) downcased)
+            (funcall function decapitalized))
+          ;; We change, for example, "PARAMETER" (which is typical for
+          ;; some commenting styles) to "parameter" which is in the
+          ;; dictionary.
+          (when (every #'upper-case-p word)
+            (funcall function (string-downcase word))))))))
+(declaim (notinline map-english-case-variants))
+
 (defun english-lookup (word)
-  (when (and word (string/= word ""))
-    (let ((dictionary *english-dictionary*))
-      (flet ((try (variant)
-               (a:when-let ((result (lookup variant dictionary)))
-                 (return-from english-lookup result))))
-        (try word)
-        (let* ((initial (aref word 0))
-               (downcased (char-downcase initial)))
-          (unless (char= initial downcased)
-            ;; We change, for example, "Anti-Semitic" at the beginning
-            ;; of a sentence to "anti-Semitic" which is in the
-            ;; dictionary.
-            (let ((decapitalized (copy-seq word)))
-              (setf (aref decapitalized 0) downcased)
-              (try decapitalized))
-            ;; We change, for example, "PARAMETER" (which is typical
-            ;; for some commenting styles) to "parameter" which is in
-            ;; the dictionary.
-            (when (every #'upper-case-p word)
-              (try (string-downcase word)))))))))
+  (let ((dictionary *english-dictionary*))
+    (flet ((try (variant)
+             (a:when-let ((result (lookup variant dictionary)))
+               (return-from english-lookup result))))
+      (declare (dynamic-extent #'try))
+      (locally (declare (inline map-english-case-variants))
+        (map-english-case-variants #'try word)))))

 (declaim (inline english-text-char-p find-start find-end))
 (defun english-text-char-p (character)
@@ -85,3 +95,11 @@
         do (setf position word-end)
         unless (english-lookup (subseq string word-start word-end))
           collect (cons word-start word-end)))
+
+(defun english-corrections (string &key (threshold 2)
+                                        (variants 'map-english-case-variants)
+                                        (group-by :spelling)
+                                        (count nil))
+  (corrections string *english-dictionary* threshold :variants variants
+                                                      :group-by group-by
+                                                      :count count))
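
To see which spellings the new `map-english-case-variants` hands to its callback, the same logic can be sketched as a standalone function that collects the variants into a list. `collect-case-variants` is a hypothetical name used only for illustration; it follows the control flow shown in the hunk above but does not depend on SPELL or Alexandria.

```lisp
;; Hypothetical, self-contained restatement of the case-variant logic:
;; returns the variants in the order the callback would receive them.
(defun collect-case-variants (word)
  (let ((variants '()))
    (flet ((note (variant)
             (push variant variants)))
      ;; The word itself is always tried first.
      (note word)
      (when (plusp (length word))
        (let* ((initial (char word 0))
               (downcased (char-downcase initial)))
          (unless (char= initial downcased)
            ;; "Anti-Semitic" -> "anti-Semitic"
            (let ((decapitalized (copy-seq word)))
              (setf (char decapitalized 0) downcased)
              (note decapitalized))
            ;; "PARAMETER" -> "parameter"
            (when (every #'upper-case-p word)
              (note (string-downcase word)))))))
    (nreverse variants)))

(collect-case-variants "Anti-Semitic")
;; => ("Anti-Semitic" "anti-Semitic")
(collect-case-variants "PARAMETER")
;; => ("PARAMETER" "pARAMETER" "parameter")
(collect-case-variants "horse")
;; => ("horse")
```

Note that an all-uppercase word yields three variants, including the decapitalized form with only the first character downcased.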
6 changes: 5 additions & 1 deletion code/package.lisp
@@ -65,9 +65,13 @@
    #:entry-count
    #:map-entries
    #:lookup
+   #:map-similar
+   #:map-corrections
+   #:corrections
    #:insert
    #:load-dictionary)

   (:export
    #:english-lookup
-   #:english-check-paragraph))
+   #:english-check-paragraph
+   #:english-corrections))
16 changes: 16 additions & 0 deletions code/protocol.lisp
@@ -8,18 +8,34 @@

(defgeneric lookup (string dictionary))

(defgeneric map-similar (function string dictionary threshold &key group-by))

(defgeneric map-corrections (function string dictionary threshold
&key variants group-by))

(defgeneric corrections (string dictionary threshold
&key variants group-by count))

(defgeneric insert (object string dictionary))

(defgeneric load-dictionary (source &key into))

;;; Trie node protocols

(defgeneric leafp (node))

(defgeneric interiorp (node))

;;; Lookup protocol

(defgeneric node-lookup (function string suffix node))

(defgeneric map-node-entries (function node characters))

;;; Similar protocol

(defgeneric node-map-similar (function string suffix node threshold characters))

;;; Insert protocol

(defgeneric node-insert (object string suffix node))
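
The diff does not say what the `threshold` parameter of these generic functions measures. One plausible reading, and only an assumption here, is a maximum edit distance between the query string and a dictionary entry. The sketch below computes the classic Levenshtein distance with two rolling rows; `edit-distance` is a hypothetical helper and is not claimed to be how SPELL's trie-based search works.

```lisp
;; Hypothetical helper: Levenshtein distance between strings A and B,
;; computed row by row with two reusable arrays.
(defun edit-distance (a b)
  (let* ((n (length b))
         (previous (make-array (1+ n)))
         (current (make-array (1+ n))))
    ;; Distance from the empty prefix of A to each prefix of B.
    (dotimes (j (1+ n))
      (setf (aref previous j) j))
    (dotimes (i (length a))
      (setf (aref current 0) (1+ i))
      (dotimes (j n)
        (setf (aref current (1+ j))
              (min (1+ (aref previous (1+ j)))   ; delete from A
                   (1+ (aref current j))         ; insert into A
                   (+ (aref previous j)          ; substitute or match
                      (if (char= (char a i) (char b j)) 0 1)))))
      (rotatef previous current))
    (aref previous n)))

;; The corrections listed in the README example are all within one
;; edit of "lifp"; "horse" is added here as a non-matching control.
(remove-if-not (lambda (word) (<= (edit-distance "lifp" word) 1))
               '("lift" "life" "lip" "lisp" "limp" "horse"))
;; => ("lift" "life" "lip" "lisp" "limp")
```

The README result for `(spell:english-corrections "lifp" :threshold 1)` is consistent with this reading, but that is an observation about one example rather than a statement of the library's definition.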
3 changes: 3 additions & 0 deletions code/raw-trie.lisp
@@ -151,6 +151,9 @@

 (defclass raw-interior-node (raw-interior-mixin interior-node raw-node) ())

+(defmethod leafp ((node raw-interior-node))
+  nil)
+
 #-minimal-raw-trie
 (defmethod node-lookup
     ((function function) (string string) (suffix (eql 0)) (node raw-interior-node))