A PowerShell native interface for the Hadoop WebHDFS APIs.
The cmdlets have been written and tested against Hadoop version 2.8.1, but include all API calls defined in version 2.9.0. You can authenticate via Kerberos using the Credential parameter of the New-HDFSSession cmdlet; this has been tested on a secured Cloudera CDH 5 cluster. Using the SPNEGOToken parameter for Kerberos authentication has not been tested.
The examples below show some common uses of the cmdlets. The Path parameter does not require a leading "/", so "/home/file.txt" and "home/file.txt" are interpreted the same way.
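To illustrate the path handling, here is a minimal sketch of how an optional leading "/" could be normalized before a WebHDFS REST URL is built. Get-WebHdfsUrl is a hypothetical helper written for this illustration and is not part of the module; the module's actual internals may differ. The URL layout (/webhdfs/v1/<path>?op=<operation>) follows the WebHDFS REST API, and 50070 is assumed as the Hadoop 2.x default name node HTTP port.

```powershell
# Hypothetical helper for illustration only; not a cmdlet from the HDFS module.
function Get-WebHdfsUrl {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Namenode,

        # Assumption: 50070 is the default name node HTTP port in Hadoop 2.x.
        [int]$Port = 50070,

        # An empty path is treated as the root directory.
        [string]$Path = "",

        [Parameter(Mandatory = $true)]
        [string]$Operation,

        [switch]$UseSSL
    )

    $scheme = if ($UseSSL) { "https" } else { "http" }

    # Strip any leading slashes, then add exactly one back, so that
    # "/home/file.txt" and "home/file.txt" produce the same result.
    $normalizedPath = "/" + $Path.TrimStart([char]'/')

    "{0}://{1}:{2}/webhdfs/v1{3}?op={4}" -f $scheme, $Namenode, $Port, $normalizedPath, $Operation
}

# Both calls return the same URL:
# http://192.168.1.2:50070/webhdfs/v1/home/file.txt?op=GETFILESTATUS
Get-WebHdfsUrl -Namenode 192.168.1.2 -Path "/home/file.txt" -Operation GETFILESTATUS
Get-WebHdfsUrl -Namenode 192.168.1.2 -Path "home/file.txt" -Operation GETFILESTATUS
```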
All cmdlets by default will execute Write-Warning when an error is encountered. To cause the cmdlet to throw an exception instead, use the -ErrorAction Stop parameter.
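As a sketch of the difference, the following wraps a call in try/catch so that a failure becomes a terminating error instead of a warning. It assumes a session has already been created with New-HDFSSession, and the path "/test/missing.txt" is a hypothetical file used only for illustration.

```powershell
try {
    # -ErrorAction Stop turns the default warning into an exception,
    # so control passes to the catch block on failure.
    $content = Get-HDFSContent -Path "/test/missing.txt" -Encoding ([System.Text.Encoding]::UTF8) -ErrorAction Stop
    Write-Output $content
}
catch {
    # $_ holds the error record raised by the cmdlet.
    Write-Error "Could not read the file from HDFS: $($_.Exception.Message)"
}
```

Without -ErrorAction Stop, the same call would be expected to emit a warning and continue, so the catch block would not run.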
To set up a basic session using user name authentication:
Import-Module -Name HDFS
New-HDFSSession -Namenode 192.168.1.2 -Username hdadmin
You may need to force the use of TLS 1.2 in a secured environment for all Invoke-WebRequest calls. Be aware that forcing this setting may affect other cmdlets or scripts in the same PowerShell session.
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
After forcing TLS 1.2, you can establish a session with a secured CDH platform using TLS and Kerberos:
Import-Module -Name HDFS
New-HDFSSession -Namenode cdhnode -Port 14000 -UseSSL -Credential (Get-Credential)
You can establish multiple sessions to different name nodes in the same PowerShell session. If you establish only one session, it is the default and you do not need to specify it in the cmdlets. If you create more than one session with New-HDFSSession and do not supply a -Session parameter, the cmdlets default to the first session created. To target a specific name node, supply the -Session parameter to the file system operation cmdlets.
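Because the TLS 1.2 setting shown earlier is process-wide, one way to limit its effect on other cmdlets and scripts is to add TLS 1.2 to the protocols already enabled and restore the previous value when the HDFS work is done. This is a minimal sketch of that pattern, not something the module does for you; the variable name $previousProtocol is chosen for illustration.

```powershell
# Remember the current process-wide setting so it can be restored later.
$previousProtocol = [Net.ServicePointManager]::SecurityProtocol

try {
    # Add TLS 1.2 to the enabled protocols instead of replacing them.
    [Net.ServicePointManager]::SecurityProtocol = $previousProtocol -bor [Net.SecurityProtocolType]::Tls12

    # Create the session and run the HDFS cmdlets here, while TLS 1.2 is enabled.
}
finally {
    # Restore the original setting, even if an error occurred above.
    [Net.ServicePointManager]::SecurityProtocol = $previousProtocol
}
```

The restore step has to come after all HDFS calls, since later requests would otherwise run without TLS 1.2 enabled.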
Once you've established a session, you can perform file system operations.
Set-HDFSItem -Path "/" -Owner "hdadmin"
Remove-HDFSItem -Path "/test" -Recursive
New-HDFSItem -Path "/test" -ItemType Directory
Set-HDFSAcl -Path "/test" -Acl "user::rwx,group::rwx,other::rwx" -Replace
New-HDFSItem -Path "/test/test.txt" -ItemType File -InputObject "TESTING"
New-HDFSItem -Path "/backups/sql.bak" -InputFile c:\backups\sql1.bak
Get-HDFSContent -Path "/test/test.txt" -Encoding ([System.Text.Encoding]::UTF8)
Add-HDFSContent -Path "/test/test.txt" -InputObject "`nTEST2`n"
Get-HDFSContent -Path "/test/test.txt" -Encoding ([System.Text.Encoding]::UTF8)
The following example shows how to target a specific session after you have established more than one:
Get-HDFSContent -Path "/test/test.txt" -Encoding ([System.Text.Encoding]::UTF8) -Session 192.168.1.2
Get-HDFSContent -Path "/test/test.txt" -Encoding ([System.Text.Encoding]::UTF8) -Session cdhnode
This example assumes two sessions created with New-HDFSSession: one established with a name node at 192.168.1.2 and another with the name cdhnode.
Rename-HDFSItem -Path "/test/test.txt" -NewName "/test/test2.txt" -Verbose
Get-HDFSHomeDirectory
Set-HDFSXAttr -Path "/" -Name "user.test" -Value "Test3" -Flag Create
Get-HDFSXAttr -ListAvailable -Path "/"
Get-HDFSXAttr -Path "" -Names "user.test" -Encoding TEXT
Remove-HDFSXAttr -Path "" -Name "user.test"
Revision history, most recent first:
Added a Credential parameter for New-HDFSSession and changed the parameter name for the Kerberos token from KerberosCredential to SPNEGOToken. Users should prefer the Credential parameter for Kerberos authentication over creating their own SPNEGO token.
Changed file input process with New-HDFSItem.
Added the ability to send file content to HDFS with New-HDFSItem.
Improved error handling. Added -Confirm and -Force functionality where applicable.
Initial Release.