Following are some code snippets for connecting to a remote HDFS and performing file operations.
I used a Hortonworks sandbox to play with. You may want to check my previous post on how to configure the network for the Hortonworks sandbox so that it is accessible from your local system. Link
I am using a Hortonworks 2.2 sandbox that can be downloaded from HDP 2.2
- Configure development environment
- Code
- Test
Configure the development environment:
The best way to get the libraries is through Maven. The POM file is below:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>HDP2_2</groupId>
    <artifactId>dev</artifactId>
    <version>1.0-SNAPSHOT</version>

    <repositories>
        <repository>
            <id>repo.hortonworks.com</id>
            <name>Hortonworks HDP Maven Repository</name>
            <url>http://repo.hortonworks.com/content/repositories/releases/</url>
        </repository>
    </repositories>

    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.6.0.2.2.0.0-2041</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.6.0.2.2.0.0-2041</version>
        </dependency>
        <dependency>
            <groupId>commons-logging</groupId>
            <artifactId>commons-logging</artifactId>
            <version>1.2</version>
        </dependency>
        <dependency>
            <groupId>org.mortbay.jetty</groupId>
            <artifactId>jetty</artifactId>
            <version>6.1.26</version>
        </dependency>
        <dependency>
            <groupId>org.mortbay.jetty</groupId>
            <artifactId>jetty-util</artifactId>
            <version>6.1.26</version>
        </dependency>
    </dependencies>
Following are the three classes that make up my Hello World for HDFS. You may write it differently.
Eventually the code creates a directory in HDFS named MyDirectory. The HDFSAccess instance below can be used for a host of other filesystem operations; a sketch of a few follows the test stub. Change the IP address and user name in AppProps to match your setup.
AppProps.java
package com.self.train.hdfs;

public class AppProps {
    // NameNode endpoint of the sandbox; change the IP address to match your setup
    public static final String HDFS_URL = "hdfs://192.168.56.102:8020";
    // Local copies of the cluster configuration files
    public static final String hdfsCorePath = "/etc/hadoop/conf/core-site.xml";
    public static final String hdfsSitePath = "/etc/hadoop/conf/hdfs-site.xml";
    // User to connect as
    public static final String hadoopUser = "root";

    public static void init() {
        // Tell the Hadoop client which user to act as
        System.setProperty("HADOOP_USER_NAME", hadoopUser);
    }
}
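Setting HADOOP_USER_NAME works on unsecured clusters like the sandbox. As an alternative, the same effect can be achieved with Hadoop's UserGroupInformation API. The sketch below is my own illustration, not one of the post's three classes, and the class name AsUserStub is hypothetical:

package com.self.train.hdfs;

import org.apache.hadoop.security.UserGroupInformation;
import java.security.PrivilegedExceptionAction;

// Hypothetical sketch: run client code as a given remote user
// without setting the HADOOP_USER_NAME system property.
public class AsUserStub {
    public static void main(String[] args) throws Exception {
        UserGroupInformation ugi = UserGroupInformation.createRemoteUser(AppProps.hadoopUser);
        ugi.doAs(new PrivilegedExceptionAction<Void>() {
            public Void run() throws Exception {
                // FileSystem calls made here execute as AppProps.hadoopUser
                System.out.println("Running as: " + UserGroupInformation.getCurrentUser().getUserName());
                return null;
            }
        });
    }
}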
HDFSAccess.java
package com.self.train.hdfs;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

public class HDFSAccess {

    private Configuration conf;

    /* Constructor */
    public HDFSAccess(String HDFSUrl, String corePath, String sitePath) {
        conf = new Configuration();
        conf.addResource(new Path(corePath));
        conf.addResource(new Path(sitePath));
        conf.set("fs.defaultFS", HDFSUrl);
    }

    public boolean createDirectory(String dirName) {
        Path dir = new Path(dirName);
        try {
            FileSystem fs = FileSystem.get(conf);
            if (!fs.exists(dir)) {
                fs.mkdirs(dir);
            } else {
                System.out.println("Directory already exists with name: " + dirName);
                return false;
            }
        } catch (IOException e) {
            System.out.println("Exception encountered while creating directory. Details: \n" + e.getMessage());
            return false;
        }
        System.out.println("Directory created with name: " + dirName);
        return true;
    }
}
Stub.java (For testing)
package com.self.train.hdfs;

public class Stub {
    public static void main(String[] args) {
        AppProps.init();
        HDFSAccess ha = new HDFSAccess(AppProps.HDFS_URL, AppProps.hdfsCorePath, AppProps.hdfsSitePath);
        ha.createDirectory("MyDirectory");
    }
}
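Once the directory exists, other file operations follow the same pattern through the org.apache.hadoop.fs.FileSystem API. Below is a minimal sketch of my own (the class FileOpsStub and the file name hello.txt are hypothetical) that writes a small file into MyDirectory, lists the directory contents, and deletes the file again:

package com.self.train.hdfs;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

// Hypothetical sketch of further filesystem operations: write, list, delete.
public class FileOpsStub {
    public static void main(String[] args) throws IOException {
        AppProps.init();
        // Same configuration setup as the HDFSAccess constructor
        Configuration conf = new Configuration();
        conf.addResource(new Path(AppProps.hdfsCorePath));
        conf.addResource(new Path(AppProps.hdfsSitePath));
        conf.set("fs.defaultFS", AppProps.HDFS_URL);
        FileSystem fs = FileSystem.get(conf);

        // Write a small text file into the directory created by Stub
        Path file = new Path("MyDirectory/hello.txt");
        FSDataOutputStream out = fs.create(file, true); // true = overwrite if present
        try {
            out.write("Hello HDFS".getBytes("UTF-8"));
        } finally {
            out.close();
        }

        // List the directory contents
        for (FileStatus status : fs.listStatus(new Path("MyDirectory"))) {
            System.out.println(status.getPath() + " (" + status.getLen() + " bytes)");
        }

        // Remove the file again (false = non-recursive)
        fs.delete(file, false);
    }
}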