NAME
MediaWiki::CleanupHTML - clean up MediaWiki-generated HTML by removing MediaWiki embellishments.
VERSION
version 0.0.6
SYNOPSIS
    use MediaWiki::CleanupHTML;
    open my $fh, '<:encoding(UTF-8)', $filename
        or die "Cannot open '$filename' - $!";
    my $cleaner = MediaWiki::CleanupHTML->new({ fh => $fh });
    open my $out_fh, '>:encoding(UTF-8)', $processed_filename
        or die "Cannot open '$processed_filename' for output - $!";
    $cleaner->print_into_fh($out_fh);
    $cleaner->destroy_resources();
DESCRIPTION
The HTML rendered on MediaWiki pages is full of MediaWiki-specific embellishments such as edit sections. This module attempts to clean it up and return more straightforward HTML. Note that the HTML returned by the MediaWiki APIs may not always be available (for instance, if the wiki is down), so this module should be considered a fallback.
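To make "embellishments such as edit sections" concrete, here is a minimal sketch of that kind of cleanup written directly against HTML::TreeBuilder::XPath. It is an illustration only and is not taken from this module's implementation: the sample markup and the mw-editsection class name are assumptions based on what recent MediaWiki versions emit, and the module itself may match different or additional elements.

```perl
use strict;
use warnings;

use HTML::TreeBuilder::XPath;

# Assumed sample of MediaWiki output: a heading followed by an "[edit]" link.
my $html = <<'EOF';
<h2><span class="mw-headline" id="History">History</span><span class="mw-editsection">[<a href="/w/index.php?title=Foo&amp;action=edit&amp;section=1">edit</a>]</span></h2>
<p>Some text.</p>
EOF

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_content($html);

# Remove every edit-section span from the tree.
foreach my $node ( $tree->findnodes('//span[@class="mw-editsection"]') )
{
    $node->delete;
}

my $cleaned = $tree->as_HTML;

# HTML::TreeBuilder trees must be deleted explicitly to free their memory.
$tree->delete;

print $cleaned, "\n";
```

The heading text survives while the "[edit]" link is gone. The module wraps this sort of processing behind the interface described in the next section.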
SUBROUTINES/METHODS
MediaWiki::CleanupHTML->new({fh => $fh})
The constructor. Accepts a hash reference whose fh key holds the filehandle from which to read the XHTML.
$cleaner->print_into_fh($fh)
Prints the cleaned-up HTML to the given filehandle. The filehandle should be able to process UTF-8 output (for example, one opened with the :encoding(UTF-8) layer).
$cleaner->destroy_resources()
Destroys the allocated resources (the HTML::TreeBuilder tree, etc.). Must be called before the object is destroyed.
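Because destroy_resources() must always be called, it can help to guard the output step so that cleanup still runs if printing fails. The following is a minimal sketch under that assumption; cleanup_file() is a hypothetical helper and is not part of this module. It uses only the three methods documented above.

```perl
use strict;
use warnings;

use MediaWiki::CleanupHTML;

# Hypothetical helper (not part of the module): clean one saved page and
# make sure destroy_resources() runs even when writing the output fails.
sub cleanup_file
{
    my ( $in_path, $out_path ) = @_;

    open my $in_fh, '<:encoding(UTF-8)', $in_path
        or die "Cannot open '$in_path' - $!";
    open my $out_fh, '>:encoding(UTF-8)', $out_path
        or die "Cannot open '$out_path' for output - $!";

    my $cleaner = MediaWiki::CleanupHTML->new( { fh => $in_fh } );

    # Trap any exception so that the cleanup below always happens.
    my $ok  = eval { $cleaner->print_into_fh($out_fh); 1; };
    my $err = $@;

    $cleaner->destroy_resources();

    # Re-throw the original error only after the resources are released.
    die $err if !$ok;

    close($in_fh);
    close($out_fh)
        or die "Cannot close '$out_path' - $!";

    return;
}
```

A call would look like cleanup_file('page.html', 'page-clean.html'), where both file names are placeholders.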
AUTHOR
Shlomi Fish, http://www.shlomifish.org/
ACKNOWLEDGEMENTS
The developers of HTML::TreeBuilder::XPath, HTML::TreeBuilder and related modules for their helpful code.
LICENSE AND COPYRIGHT
Copyright 2012 Shlomi Fish.
This program is distributed under the MIT / Expat License: http://www.opensource.org/licenses/mit-license.php
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
SUPPORT
Websites
The following websites have more information about this module, and may be of help to you. As always, in addition to those websites please use your favorite search engine to discover more resources.
MetaCPAN
A modern, open-source CPAN search engine, useful to view POD in HTML format.
RT: CPAN's Bug Tracker
The RT (Request Tracker) website is the default bug/issue tracking system for CPAN.
https://rt.cpan.org/Public/Dist/Display.html?Name=MediaWiki-CleanupHTML
CPANTS
CPANTS is a website that analyzes the Kwalitee (code metrics) of a distribution.
CPAN Testers
The CPAN Testers is a network of smoke testers who run automated tests on uploaded CPAN distributions.
CPAN Testers Matrix
The CPAN Testers Matrix is a website that provides a visual overview of the test results for a distribution on various Perls/platforms.
CPAN Testers Dependencies
The CPAN Testers Dependencies is a website that shows a chart of the test results of all dependencies for a distribution.
Bugs / Feature Requests
Please report any bugs or feature requests by email to bug-mediawiki-cleanuphtml at rt.cpan.org, or through the web interface at https://rt.cpan.org/Public/Bug/Report.html?Queue=MediaWiki-CleanupHTML. You will be automatically notified of any progress on the request by the system.
Source Code
The code is open to the world, and available for you to hack on. Please feel free to browse it and play with it, or whatever. If you want to contribute patches, please send me a diff or prod me to pull from your repository :)
https://github.com/shlomif/perl-mediawiki-cleanuphtml
git clone git://github.com/shlomif/perl-mediawiki-cleanuphtml.git
AUTHOR
Shlomi Fish <shlomif@cpan.org>
BUGS
Please report any bugs or feature requests on the bugtracker website https://github.com/shlomif/perl-mediawiki-cleanuphtml/issues
When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.
COPYRIGHT AND LICENSE
This software is Copyright (c) 2012 by Shlomi Fish.
This is free software, licensed under:
The MIT (X11) License