Changes for version 0.20

  • Actually include test data (duh)
  • Fixed PDF reading
  • Added support for using pdftotext to read PDF files
  • Added a method to get the detected filetype to the public API
  • No longer re-reads the file on each call to ->text()

Modules

A module to read pure text from a variety of formats