NAME
ishmael - Convert ebook documents to plain text
SYNOPSIS
ishmael [options] file
DESCRIPTION
ishmael is a Perl program that reads given ebook documents and converts their contents to formatted plain text, which should be suitable for piping into other programs for further processing. It accomplishes this by converting the ebook contents to HTML and then formatting that HTML using the dump feature found in many text web browsers, like lynx(1).
ishmael currently supports the following ebook formats:
ishmael is also capable of dumping some of the metadata of an ebook via the -m and -j options.
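As a minimal sketch of the piping use case described above, the following shell function counts the lines of an ebook's text that mention a given word. The function name count_mentions is a hypothetical helper and not part of ishmael; it relies only on the documented behavior that ishmael writes formatted plain text to stdout.

```shell
#!/bin/sh
# count_mentions FILE WORD
# Hypothetical helper (not part of ishmael): prints the number of output
# lines that contain WORD, matched case-insensitively as a fixed string.
count_mentions() {
    # ishmael writes formatted plain text to stdout, so it can feed grep.
    ishmael "$1" | grep -ciF -- "$2"
}
```

For example, `count_mentions book.epub whale` would print the number of matching lines, where book.epub stands in for any ebook file you have.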
OPTIONS
- -d|--dumper=dumper
-
Specify the program to use for formatting ebook text. The following are valid options, as long as they're installed on your system:
- elinks
- links
- lynx
- w3m
- queequeg
queequeg(1) is a script distributed with ishmael that acts as a fallback dumper if no other dumper is installed on your system. If this program was installed normally, queequeg(1) should always be available to ishmael.
By default, ishmael will either use the dumper specified by the ISHMAEL_DUMPER environment variable if set, or the first one it finds installed on your system otherwise.
- -f|--format=format
-
Instead of trying to determine the given ebook format via a series of heuristics, manually specify the format. The following are valid options (case does not matter):
- -o|--output=file
-
Instead of writing output to stdout, write output to file.
- -w|--width=width
-
Specify the line width of the output. Defaults to 80.
- -H|--html
-
Dump the HTML-ified contents of the ebook instead of the formatted plain text.
- -i|--identify
-
Instead of dumping the text contents of an ebook, try to identify its format.
- -j|--meta-json
-
Dump the ebook's metadata in JSON form.
- -m|--metadata
-
Dump the ebook's metadata.
- -h|--help
-
Print help message and exit.
- -v|--version
-
Print version and copyright info, then exit.
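The options above can be combined in scripts. As a rough sketch, the following shell function converts several ebooks to text files using only the documented --width and --output options. The function name convert_all, the 72-column width, and the .txt naming scheme are illustrative assumptions.

```shell
#!/bin/sh
# convert_all FILE...
# Hypothetical helper (not part of ishmael): converts each ebook to a
# 72-column text file next to it, e.g. book.epub -> book.txt.
convert_all() {
    for book in "$@"; do
        # Strip the last extension and append .txt.
        out="${book%.*}.txt"
        # --width sets the line width; --output writes to a file
        # instead of stdout.
        if ! ishmael --width=72 --output="$out" "$book"; then
            echo "convert_all: failed to convert $book" >&2
        fi
    done
}
```

A call such as `convert_all *.epub` would then process every EPUB in the current directory, reporting any failures on stderr and continuing with the next file.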
ENVIRONMENT
- ISHMAEL_DUMPER
-
Name of dumper program to use by default.
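The default dumper lookup described under -d can be sketched in shell as follows. This is an illustration of the documented order (ISHMAEL_DUMPER first, then the first installed dumper), not ishmael's actual code. The function name pick_dumper is hypothetical, and the search order among installed dumpers is an assumption taken from the order of the list under -d.

```shell
#!/bin/sh
# pick_dumper
# Illustrative sketch, not ishmael's implementation: print the dumper
# that would be used by default.
pick_dumper() {
    # Prefer the environment variable when it is set and non-empty.
    if [ -n "${ISHMAEL_DUMPER:-}" ]; then
        printf '%s\n' "$ISHMAEL_DUMPER"
        return 0
    fi
    # Otherwise take the first dumper found on PATH.
    # Assumption: the search order follows the list under -d.
    for dumper in elinks links lynx w3m queequeg; do
        if command -v "$dumper" >/dev/null 2>&1; then
            printf '%s\n' "$dumper"
            return 0
        fi
    done
    echo "pick_dumper: no dumper found" >&2
    return 1
}
```

Setting the variable in your shell profile, for instance `export ISHMAEL_DUMPER=w3m`, makes that dumper the default without passing -d on every invocation.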
RESTRICTIONS
PDF processing is inefficient and the output is ugly.
AUTHOR
Written by Samuel Young, <samyoung12788@gmail.com>.
This project's source can be found on its Codeberg Page. Comments and pull requests are welcome!
HISTORY
This is the fifth iteration of this program, and hopefully the last :-).
This program originally went by the name of ebread. The first iteration was written in C and only supported EPUBs; it was quite buggy. The second iteration was written as a learning exercise for Perl. It too only supported EPUBs, and it was where I got the idea to delegate the text formatting task to another program. The third iteration was again in C, but this time supported a bunch of other ebook formats. It wasn't nearly as buggy as the first, but the code was quite sloppy and had gotten to the point where I couldn't extend it much. The fourth iteration was written in Raku and only supported EPUBs. For this iteration, I renamed the project to ishmael because I got bored of the old name. This iteration supports multiple ebook formats, but is written in Perl, so it should (hopefully) be less buggy and more maintainable.
COPYRIGHT
Copyright (C) 2025 Samuel Young
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.