PIKT

Samples: Find Spelling Errors

spell_check.pl is a simple Perl script that finds possible spelling errors in a text document, for example an HTML file.  It uses the find_words script to extract the words from the specified file.  Each word is looked up first in the standard ispell dictionary and, if not found there, in the specified local dictionary, in this case PIKTDictionary.obj.  Any word found in neither dictionary is a possible spelling error and is output, one word per line.  If a reported word is not a genuine spelling mistake, add it to the local dictionary file (PIKTDictionary.obj).
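The second lookup stage described above can be tried in isolation, without ispell or find_words installed.  The following is only a minimal sketch under stated assumptions: the helper name unknown_words and the sample word lists are made up for illustration, and the candidate list stands in for whatever ispell -l would have rejected.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of spell_check.pl): return the candidate
# words that are absent from the local dictionary.  Both arguments are
# array references; matching is case-insensitive, as in spell_check.pl.
sub unknown_words {
    my ($candidates, $local_words) = @_;

    # Build the internal dictionary from the local word list.
    my %dict;
    foreach my $entry (@$local_words) {
        my $word = lc $entry;           # convert to lower case
        $word =~ s/\s//g;               # remove spaces
        next if $word eq '';            # bypass empty entries
        $dict{$word}++;
    }

    # Keep only the candidates that the local dictionary does not know.
    my @unknown;
    foreach my $candidate (@$candidates) {
        my $word = lc $candidate;
        next if $word eq '';            # bypass empty lines
        next if $dict{$word};           # skip if found in local dictionary
        push @unknown, $word;
    }
    return @unknown;
}

# Made-up sample data: words the first stage rejected, and a local dictionary.
my @rejected = ('PIKT', 'sysadmin', 'recieve');
my @local    = ('pikt', 'sysadmin');
print "$_\n" foreach unknown_words(\@rejected, \@local);
```

With this sample data only recieve is printed, because the other two candidates appear in the local word list.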

spell_check.pl is called by the ReportPIKTSpellingErrors Pikt script.

#!/usr/bin/perl

# spell_check.pl:  spell check a text file; find all words in a text
#                  file, look up each in the specified dictionary
#                  file (with one lower case word per line), and output
#                  those not found (one word per line)
#
#                  Usage:  spell_check.pl -d <dictionary> -f <file>

while (@ARGV) {
        if ($ARGV[0] eq "-d") {
                shift;
                $dictionary = $ARGV[0];
                shift;
                next;
        }
        if ($ARGV[0] eq "-f") {
                shift;
                $file = $ARGV[0];
                shift;
                next;
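        }
        # any other argument would otherwise leave @ARGV unchanged and loop forever
        die "Usage:  spell_check.pl -d <dictionary> -f <file>\n";
}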

open(DICTIONARY, "<", $dictionary)
        or die "spell_check.pl: cannot open dictionary: $!\n";
while(<DICTIONARY>) {
        chomp;
        next if /^$/;           # bypass empty lines
        $word = $_;
        $word =~ s/\s//g;       # remove spaces
        $word =~ tr/A-Z/a-z/;   # convert to lower case
#       print "#$word#\n";
        $dict{$word}++;         # add word to internal dictionary
}
close(DICTIONARY);

#foreach $key (keys %dict) {
#       print "$key\n";
#}

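# find the words, then let ispell list those missing from its dictionary
open(WORDS, "/usr/local/bin/find_words < $file | /usr/bin/ispell -l |")
        or die "spell_check.pl: cannot start find_words/ispell pipeline: $!\n";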
while(<WORDS>) {
        chomp;
        next if /^$/;           # bypass empty lines
        $word = $_;
        $word =~ tr/A-Z/a-z/;   # convert to lower case
        next if $dict{$word};   # skip if found in internal dictionary
        print "$word\n";        # print if not found
}
close(WORDS);
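
The find_words script that feeds the pipeline above is not shown on this page.  As a rough illustration of what such a word finder might do for HTML input, here is a hypothetical stand-in; the function name extract_words and its tag and entity handling are assumptions, not the actual find_words logic.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for find_words: return the alphabetic words in a
# chunk of HTML or plain text.  The real find_words may behave differently.
sub extract_words {
    my ($text) = @_;
    $text =~ s/<[^>]*>/ /g;             # replace HTML tags with a space
    $text =~ s/&#?\w+;/ /g;             # replace character entities such as &amp;
    # In list context, a /g match with no capturing groups returns every match.
    return $text =~ /[A-Za-z]+(?:'[A-Za-z]+)*/g;
}

# Made-up sample input.
my $sample = "<p>PIKT&reg; finds <b>speling</b> errors, doesn't it?</p>";
print "$_\n" foreach extract_words($sample);
```

Printing one word per line matches the input format that ispell -l expects when reading from a pipe.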

[For more examples, see Samples.]

Page best viewed at 1024x768.   Page last updated 2005-06-22.
This site is PIKT® powered.
PIKT® is a registered trademark of the University of Chicago.
Copyright © 1998-2005 Robert Osterlund.  All rights reserved.
