NAME

Apache::Hadoop::WebHDFS - interface to Hadoop's WebHDFS API that supports GSSAPI (secure) access.

VERSION

Version 0.01

SYNOPSIS

Hadoop's WebHDFS API is a REST interface to HDFS. This module provides a Perl interface to that API, allowing one to both read and write files on HDFS. Because Apache::Hadoop::WebHDFS supports GSSAPI, it can be used with both unsecured and secure Hadoop clusters.
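To make the REST layout concrete, here is a minimal sketch of how a WebHDFS request URL is put together. The helper name webhdfs_url is hypothetical and is not part of this module; the /webhdfs/v1 prefix and the op query parameter follow Hadoop's WebHDFS REST API, and the host name and port are placeholders. The encoding step assumes ASCII parameter values.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Apache::Hadoop::WebHDFS): builds the
# URL for a WebHDFS REST call against a namenode.
sub webhdfs_url {
    my ($host, $port, $path, $op, %params) = @_;

    # HDFS paths are absolute; make sure there is exactly one leading slash.
    $path =~ s{^/*}{/};

    # Percent-encode anything outside the unreserved set in query values.
    # Assumes ASCII input; this is only an illustration.
    my $encode = sub {
        my ($value) = @_;
        $value =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
        return $value;
    };

    # The operation comes first, followed by any extra parameters.
    my $query = join '&', "op=$op",
        map { $_ . '=' . $encode->($params{$_}) } sort keys %params;

    return "http://$host:$port/webhdfs/v1$path?$query";
}

print webhdfs_url('mynamenode.example.com', 50070, '/tmp', 'LISTSTATUS'), "\n";
print webhdfs_url('mynamenode.example.com', 50070, '/tmp/old', 'RENAME',
                  destination => '/tmp/new'), "\n";
```

The module builds and sends such requests itself, so a script normally never needs to construct these URLs by hand; the sketch only shows what travels over the wire.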

Apache::Hadoop::WebHDFS is a subclass of WWW::Mechanize, so WWW::Mechanize methods can be called on it if needed. WWW::Mechanize is in turn a subclass of LWP::UserAgent, so LWP methods are also available from an Apache::Hadoop::WebHDFS object.

METHODS

new() - creates a new WebHDFS object. Takes an anonymous hash with the namenode and namenode port as keys. If not specified, these default to localhost and 50070.

getdelegationtoken() - gets a delegation token from the namenode.

canceldelegationtoken() - tells the namenode to invalidate the delegation token because it is no longer needed.

Open() - opens and reads a file on HDFS.

create() - creates and writes to a file on HDFS.

rename() - renames a file on HDFS.

getfilestatus() - returns a JSON structure containing the status of a file or directory.

liststatus() - returns a JSON structure listing the contents of a directory.

mkdirs() - creates a directory on HDFS.
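Because getfilestatus() and liststatus() return JSON, the result usually needs decoding before a script can use it. The sketch below decodes a hand-written sample, shaped like a WebHDFS LISTSTATUS response, with the core JSON::PP module. The sample data and the helper name list_entries are illustrative assumptions; in a real script the JSON text would come from the client object rather than a here-document.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hand-written sample in the shape of a WebHDFS LISTSTATUS response.
my $sample_json = <<'JSON';
{"FileStatuses":{"FileStatus":[
  {"pathSuffix":"logs","type":"DIRECTORY","length":0,"owner":"hdfs","permission":"755"},
  {"pathSuffix":"data.csv","type":"FILE","length":2048,"owner":"alice","permission":"644"}
]}}
JSON

# Hypothetical helper: turn the JSON text into a list of hash references,
# one per directory entry. Returns an empty list if the key is missing.
sub list_entries {
    my ($json_text) = @_;
    my $decoded = decode_json($json_text);
    return @{ $decoded->{FileStatuses}{FileStatus} || [] };
}

# Print one line per entry: type, size in bytes, and name.
for my $entry (list_entries($sample_json)) {
    printf "%-9s %6d  %s\n", $entry->{type}, $entry->{length}, $entry->{pathSuffix};
}
```

A getfilestatus() result can be handled the same way, except that WebHDFS wraps a single entry under a top-level FileStatus key instead of a list.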

GSSAPI Debugging

To see GSSAPI calls during the request, enable LWP::Debug by adding 'use LWP::Debug qw(+);' to your script.

REQUIREMENTS

Carp - used for various warnings and errors.

WWW::Mechanize - needed because this module is a subclass of it.

LWP::Debug - required for debugging GSSAPI connections.

LWP::Authen::Negotiate - the magic sauce for working with secure Hadoop clusters.

parent - included with Perl 5.10.1 and newer, or found on CPAN for older versions of Perl.
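As a quick sanity check before installing, the following sketch reports which of the modules named in this section can be loaded by the current perl. The helper name module_available is hypothetical; the script only attempts to load each module and does not check versions.

```perl
use strict;
use warnings;

# Modules named in the REQUIREMENTS section.
my @required = qw(
    Carp
    WWW::Mechanize
    LWP::Debug
    LWP::Authen::Negotiate
    parent
);

# Hypothetical helper: returns 1 if the named module can be loaded by
# this perl, 0 otherwise.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (@required) {
    printf "%-24s %s\n", $module, module_available($module) ? 'ok' : 'MISSING';
}
```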

EXAMPLES

List an HDFS directory on a secure Hadoop cluster:

  #!/bin/perl
  use strict;
  use warnings;
  use Data::Dumper;
  use Authen::Krb5::Effortless;  # <-- to get TGT from kerberos
  use Apache::Hadoop::WebHDFS;

  my $username = getlogin();
  my $krb5 = Authen::Krb5::Effortless->new();
  $krb5->fetch_TGT_PW('s3kr3+', $username);

  my $hdfsclient = Apache::Hadoop::WebHDFS->new( { namenode     => "mynamenode.example.com",
                                                   namenodeport => 50070 } );
  $hdfsclient->liststatus("/tmp");
  # content() and success() are inherited from WWW::Mechanize
  print Dumper $hdfsclient->content() if ( $hdfsclient->success() );

          

AUTHOR

Adam Faris, <apache-hadoop-webhdfs at mekanix.org>

BUGS

  Please use GitHub to report bugs and feature requests:
  https://github.com/opsmekanix/Apache-Hadoop-WebHDFS/issues

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Apache::Hadoop::WebHDFS

ACKNOWLEDGEMENTS

I would like to acknowledge Andy Lester and the numerous people who have worked on WWW::Mechanize, Achim Grolms and team for providing LWP::Authen::Negotiate, and the contributors to LWP. Thanks for providing awesome modules.

LICENSE AND COPYRIGHT

Copyright 2013 Adam Faris.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.