NAME
Net::Dict - client API for accessing dictionary servers (RFC 2229)
SYNOPSIS
use Net::Dict;
$dict = Net::Dict->new('dict.server.host');
$h = $dict->define("word");
foreach $i (@{$h}) {
($db, $def) = @{$i};
. . .
}
DESCRIPTION
Net::Dict
is a perl class for looking up words and their definitions on network dictionary servers. Net::Dict
provides a simple DICT client API for the network protocol described in RFC2229. Quoting from that RFC:
The Dictionary Server Protocol (DICT) is a TCP transaction based query/response protocol that allows a client to access dictionary definitions from a set of natural language dictionary databases.
An instance of Net::Dict represents a connection to a single DICT server. For example, to connect to the dictionary server at dict.org
, you would write:
$dict = Net::Dict->new('dict.org');
A DICT server can provide any number of dictionaries, which are referred to as databases. Each database has a name and a title. The name is a short identifier, typically just one word, used to refer to that database. The title is a brief one-line description of the database. For example, at the time of writing, the dict.org
server has 11 databases, including a version of Webster's dictionary from 1913. The name of the database is web1913, and the title is Webster's Revised Unabridged Dictionary (1913).
To look up definitions for a word, you use the define
method:
$dref = $dict->define('banana');
This returns a reference to a list; each entry in the list is a reference to a two item list:
[ $dbname, $definition ]
The first entry is a database name as introduced above. The second entry is the text of a definition from the specified dictionary.
MATCHING WORDS
In addition the looking up word definitions, you can lookup a list of words which match a given pattern, using the match() method. Each DICT server typically supports a number of strategies which can be used to match words against a pattern. For example, using prefix strategy with a pattern "anti" would find all words in databases which start with "anti":
@mref = $dict->match('anti', 'prefix');
foreach my $match (@{ $mref }) {
($db, $word) = @{ $match };
}
Similarly the suffix strategy is used to search for words which end in a given pattern. The strategies() method is used to request a list of supported strategies - see "METHODS" for more details.
SELECTING DATABASES
By default Net::Dict will look in all databases on the DICT server. This is specified with a special database name of *
. You can specify the database(s) to search explicitly, as additional arguments to the define and match methods:
$dref = $dict->define('banana', 'wn', 'web1913');
Rather than specify the databases to use every time, you can change the default from '*' using the setDicts
method:
$dict->setDicts('wn', 'web1913');
Any subsequent calls to define or match will refer to these databases, unless over-ridden with additional arguments to the method. You can find out what databases are available on a server using the dbs
method:
%dbhash = $dict->dbs();
Each entry in the returned hash has the name of a database as the key, and the corresponding title as the value.
There is another special database name - !
- which says that all databases should be searched, but as soon as a definition is found, no further databases should be searched.
CONSTRUCTOR
$dict = Net::Dict->new (HOST [,OPTIONS]);
This is the constructor for a new Net::Dict object. HOST
is the name of the remote host on which a Dict server is running. This is required, and must be an explicit host name.
The constructor makes a connection to the remote DICT server, and sends the CLIENT command, to identify the client to the server.
Note: previous versions let you give an empty string for the hostname, resulting in selection of default hosts. This behaviour is no longer supported.
OPTIONS
are passed in a hash like fashion, using key and value pairs. Possible options are:
- Port
-
The port number to connect to on the remote machine for the Dict connection (a default port number is 2628, according to RFC2229).
- Client
-
The string to send as the CLIENT identifier. If not set, then a default identifier for Net::Dict is sent.
- Timeout
-
Sets the timeout for the connection, in seconds. Defaults to 120.
- Debug
-
The debug level - a non-zero value will resulting in debugging information being generated, particularly when errors occur. Can be changed later using the
debug
method, which is inherited from Net::Cmd. More on the debug method can be found in Net::Cmd.
Making everything explicit, here's how you might call the constructor in your client:
$dict = Net::Dict->new($HOST,
Port => 2628,
Client => "myclient v$VERSION",
Timeout => 120,
Debug => 0);
This will return undef
if we failed to make the connection. It will die
if bad arguments are passed: no hostname, unknown argument, etc.
METHODS
Unless otherwise stated all methods return either a true or false value, with true meaning that the operation was a success. When a method states that it returns a value, failure will be returned as undef or an empty list.
define ( $word [, @dbs] )
returns a reference to an array, whose members are lists, consisting of two elements: the dictionary name and the definition. If no dictionaries are specified, those set by setDicts() are used.
match ( $pattern, $strategy [, @dbs] )
Looks for words which match $pattern according to the specified matching $strategy. Returns a reference to an array, each entry of which is a reference to a two-element array: database name, matching word.
dbs
Returns a hash with information on the databases available on the DICT server. The keys are the short names, or identifiers, of the databases; the value is title of the database:
%dbhash = $dict->dbs();
print "Available dictionaries:\n";
while (($db, $title) = each %dbhash) {
print "$db : $title\n";
}
This is the SHOW DATABASES
command from RFC 2229.
dbInfo ( $dbname )
Returns a string, containing description of the dictionary $dbname.
setDicts ( @dicts )
Specify the dictionaries that will be searched during the successive define() or match() calls. Defaults to '*'. No existence checks are performed by this interface, so you'd better make sure the dictionaries you specify are on the server (e.g. by calling dbs()).
strategies
returns an array, containing an ID of a matching strategy as a key and a verbose description as a value.
This method was previously called strats(); that name for the method is also currently supported, for backwards compatibility.
auth ( $USER, $PASSPHRASE )
Attempt to authenticate the specified user, using the scheme described on page 18 of RFC 2229. The user should be known to the server, and $PASSPHRASE is a shared secret known only to the server and the user.
For example, if you were using dictd from dict.org, your configuration file might include the following:
database private {
data "/usr/local/dictd/db/private.dict.dz"
index "/usr/local/dictd/db/private.index"
access { user connor }
}
user connor "there can be only one"
To be able to access this database, you'd write something like the following:
$dict = Net::Dict->new('dict.foobar.com');
$dict->auth('connor', 'there can be only one');
A subsequent call to the databases
method would reveal the private
database now accessible. Not all servers support the AUTH extension; you can check this with the has_capability() method, described below.
serverInfo
Returns a string, containing the information about the server, provided by the server:
print "Server Info:\n";
print $dict->serverInfo(), "\n";
This is the SHOW SERVER
command from RFC 2229.
dbTitle ( $DBNAME )
Returns the title string for the specified database. This is the same string returned by the dbs()
method for all databases.
capabilities
Returns a list of the capabilities supported by the DICT server, as described on pages 7 and 8 of RFC 2229.
has_capability ( $cap_name )
Returns true (non-zero) if the DICT server supports the specified capability; false (zero) otherwise. Eg
if ($dict->has_capability('auth')) {
$dict->auth('genie', 'open sesame');
}
status
Send the STATUS command to the DICT server, which will return some server-specific timing or debugging information. This may be useful when debugging or tuning a DICT server, but probably won't be of interest to most users.
KNOWN BUGS AND LIMITATIONS
Need to add methods for getting lists of databases and strategies in the order they're returned by the remote server. Suggested by Aleksey Cheusov.
The following DICT commands are not currently supported:
OPTION MIME
No support for firewalls at the moment.
Site-wide configuration isn't supported. Previous documentation suggested that it was.
Currently no way to specify that results of define and match should be in HTML. This was also previously a config option for the constructor, but it didn't do anything.
EXAMPLES
The distribution includes two example DICT clients: dict is a basic command-line client, and tkdict is a GUI-based client, created using Perl/Tk.
The examples directory of the Net-Dict distribution includes two basic examples. simple.pl
illustrates basic use of the module, and portuguese.pl
demos use of an English to Portuguese dictionary. Thanks to Jose Joao Dias de Almeida for the examples.
SEE ALSO
RFC 2229 - the internet document which defines the DICT protocol.
Net::Cmd - a module which provides methods for a network command class, such as Net::FTP, Net::SMTP, as well as Net::Dict. Part of the libnet distribution, available from CPAN.
Digest::MD5 - you'll need this module if you want to use the auth method.
dict.org - the home page for the DICT effort; has links to other resources, including other libraries and clients, and dictd
, the reference DICT server.
REPOSITORY
https://github.com/neilbowers/Net-Dict
AUTHOR
The first version of Net::Dict was written by Dmitry Rubinstein <dimrub@wisdom.weizmann.ac.il>, using Net::FTP and Net::SMTP as a pattern and a model for imitation.
The module was extended, and is now maintained, by Neil Bowers <neil@bowers.com>
COPYRIGHT
Copyright (C) 2002-2014 Neil Bowers. All rights reserved.
Copyright (C) 2001 Canon Research Centre Europe, Ltd.
Copyright (c) 1998 Dmitry Rubinstein. All rights reserved.
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.