NAME
Apache::Hadoop::WebHDFS - interface to Hadoop's WebHDFS API that supports GSSAPI (secure) access.
VERSION
Version 0.02
SYNOPSIS
Hadoop's WebHDFS API is a REST interface to HDFS. This module provides a Perl interface to the API, allowing one to both read and write files on HDFS. Because Apache::Hadoop::WebHDFS supports GSSAPI, it can be used with both unsecure and secure Hadoop clusters.
Apache::Hadoop::WebHDFS is a subclass of WWW::Mechanize, so one can call WWW::Mechanize methods if needed. WWW::Mechanize is in turn a subclass of LWP::UserAgent, so LWP::UserAgent methods are also available from an Apache::Hadoop::WebHDFS object.
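To make the REST interface mentioned above concrete, here is a minimal sketch of how a WebHDFS request URL is laid out (http://HOST:PORT/webhdfs/v1/PATH?op=OPERATION). The helper name webhdfs_url and its argument names are assumptions made for illustration; it is not part of Apache::Hadoop::WebHDFS and does not show how the module builds its requests internally.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Apache::Hadoop::WebHDFS): builds the
# REST URL for one WebHDFS operation on an HDFS path.
sub webhdfs_url {
    my (%args) = @_;
    my $host = $args{namenode}     // 'localhost';   # same defaults the module documents
    my $port = $args{namenodeport} // 50070;
    my $path = $args{path}         // '/';
    my $op   = uc $args{op};                         # WebHDFS operation, e.g. LISTSTATUS

    # Percent-encode everything except unreserved characters and '/'.
    # Assumes an ASCII/byte-string path.
    $path =~ s{([^A-Za-z0-9\-._~/])}{sprintf('%%%02X', ord($1))}ge;

    return sprintf('http://%s:%d/webhdfs/v1%s?op=%s', $host, $port, $path, $op);
}

print webhdfs_url(path => '/tmp', op => 'liststatus'), "\n";
print webhdfs_url(
    namenode => 'mynamenode.example.com',
    path     => '/user/my file',
    op       => 'getfilestatus',
), "\n";
```

The module hides this URL construction behind its methods, so a sketch like this is mainly useful for understanding what travels over the wire or for testing a cluster with another HTTP client.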
METHODS
new() - creates a new WebHDFS object. Takes an anonymous hash with namenode and namenodeport as keys. If not specified, these default to localhost and 50070.
getdelegationtoken() - gets a delegation token from the namenode.
renewdelegationtoken() - renews a delegation token from the namenode.
canceldelegationtoken() - informs the namenode to invalidate the delegation token as it's no longer needed.
Open() - opens and reads a file on HDFS.
create() - creates and writes to a file on HDFS.
rename() - renames a file on HDFS.
getfilestatus() - returns a JSON structure containing the status of a file or directory.
liststatus() - returns a JSON structure describing the contents of a directory.
mkdirs() - creates a directory on HDFS.
getfilechecksum() - gets the HDFS checksum of a file.
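Since getfilestatus() and liststatus() return JSON, a short sketch of decoding a directory listing may help. The sample below is made-up data shaped like a WebHDFS LISTSTATUS response (a FileStatuses object holding a FileStatus array); it is not output from a real cluster, and the helper name summarize_liststatus is hypothetical.

```perl
use strict;
use warnings;
use JSON::PP;   # core module since Perl 5.14

# Illustrative sample shaped like a WebHDFS LISTSTATUS response.
# The entries are invented for this example.
my $json = <<'END_JSON';
{"FileStatuses":{"FileStatus":[
  {"pathSuffix":"logs","type":"DIRECTORY","length":0,"owner":"hdfs","permission":"755"},
  {"pathSuffix":"data.csv","type":"FILE","length":2048,"owner":"alice","permission":"644"}
]}}
END_JSON

# Hypothetical helper: returns one summary line per directory entry.
sub summarize_liststatus {
    my ($content) = @_;
    my $decoded = decode_json($content);
    my $entries = $decoded->{FileStatuses}{FileStatus} || [];

    my @lines;
    for my $entry (@$entries) {
        push @lines, sprintf('%-9s %-6s %8d %s',
            $entry->{type}, $entry->{owner}, $entry->{length}, $entry->{pathSuffix});
    }
    return @lines;
}

print "$_\n" for summarize_liststatus($json);
```

With a live cluster, the string passed to the helper would presumably be whatever content() returns after a successful liststatus() call, as in the example under EXAMPLES.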
GSSAPI Debugging
To see GSSAPI calls during the request, enable LWP::Debug by adding 'use LWP::Debug qw(+);' to your script.
REQUIREMENTS
Carp is used for various warnings and errors.
WWW::Mechanize is needed as this is a subclass.
LWP::Debug is required for debugging GSSAPI connections.
LWP::Authen::Negotiate is the magic sauce for working with secure Hadoop clusters.
parent is included with Perl 5.10.1 and newer, or can be found on CPAN for older versions of Perl.
EXAMPLES
List an HDFS directory on a secure Hadoop cluster:
#!/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Authen::Krb5::Effortless;   # <-- to get TGT from kerberos
use Apache::Hadoop::WebHDFS;

my $username = getlogin();
my $krb5     = Authen::Krb5::Effortless->new();
$krb5->fetch_TGT_PW('s3kr3+', $username);   # password is inline for illustration only

my $hdfsclient = Apache::Hadoop::WebHDFS->new({
    namenode     => "mynamenode.example.com",
    namenodeport => "50070",
});
$hdfsclient->liststatus("/tmp");
# Dump the response body only if the request succeeded.
print Dumper $hdfsclient->content() if $hdfsclient->success();
AUTHOR
Adam Faris, <apache-hadoop-webhdfs at mekanix.org>
BUGS
Please use GitHub to report bugs and feature requests:
https://github.com/opsmekanix/Apache-Hadoop-WebHDFS/issues
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Apache::Hadoop::WebHDFS
You can also look for information at:
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
ACKNOWLEDGEMENTS
I would like to acknowledge Andy Lester plus the numerous people who have worked on WWW::Mechanize, Achim Grolms and team for providing LWP::Authen::Negotiate, and the contributors to LWP. Thanks for providing awesome modules.
LICENSE AND COPYRIGHT
Copyright 2013 Adam Faris.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.