Posts

Apache Maven - Installation

Apache Maven is a software project management tool that uses a project object model (POM) file to manage a project's build, dependencies and documentation. Its most powerful feature is the ability to download project dependency libraries automatically. This post shows how to install Apache Maven 3 on Ubuntu 12.

Searching for the Maven Package
In a terminal, run apt-cache search maven to list all the available Maven packages. The maven package always comes with the latest Apache Maven.

$ apt-cache search maven
....
libxmlbeans-maven-plugin-java-doc - Documentation for Maven XMLBeans Plugin
maven - Java software project management and comprehension tool
maven-debian-helper - Helper tools for building Debian packages with Maven
maven2 - Java software project management and comprehension tool

Installing the Maven Package
Run the command below to install the latest Apache Maven.

$ sudo apt-get install maven

Verifying the Maven Installation
Run the command below to verify the installation.
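A quick check, as a sketch (the exact version and path details printed will differ per system):

$ mvn -version

If this prints the Maven version along with the Java version and OS details, the installation is working.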

Compression in Hadoop

File compression brings two major benefits: it reduces the space needed to store files, and it speeds up data transfer across the network, or to or from disk. When dealing with large volumes of data, both of these savings can be significant, so it pays to carefully consider how to use compression in Hadoop.

Some of the compression formats used in Hadoop:

Compression Format | Tool  | Algorithm | Filename Extension | Splittable
DEFLATE            | NA    | DEFLATE   | .deflate           | No
gzip               | gzip  | DEFLATE   | .gz                | No
bzip2              | bzip2 | bzip2     | .bz2               | Yes
LZO                | lzop  | LZO       | .lzo               | No
Snappy             | NA    | Snappy    | .snappy            | No

Codecs
A codec is the implementation of a compression-decompression algorithm; in Hadoop it is represented by an implementation of the CompressionCodec interface.

Compression Format | Hadoop CompressionCodec
DEFLATE            | org.apac
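In MapReduce, the codec for an input file is chosen by CompressionCodecFactory based on the filename extension shown in the table above. For a quick check from the command line, hadoop fs -text can decompress a gzipped file on the fly; a minimal sketch, assuming a local file named logs and a hypothetical HDFS home directory /user/hduser:

$ gzip logs
$ hadoop fs -put logs.gz /user/hduser/logs.gz
$ hadoop fs -text /user/hduser/logs.gz

Here the file is stored compressed but printed as plain text, whereas hadoop fs -cat would print the raw compressed bytes.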

The Hadoop Distributed Filesystem

Design of HDFS
HDFS is a filesystem designed for:
- Very large files - files that are hundreds of megabytes, gigabytes or terabytes in size. Hadoop clusters running today store petabytes of data.
- Streaming data access - a write-once, read-many-times pattern.
- Commodity hardware - Hadoop doesn't require expensive, highly reliable hardware to run on.

There are also applications for which HDFS does not work so well. While this may change in the future, these are areas where HDFS is not a good fit today:
- Low-latency data access
- Lots of small files
- Multiple writers, arbitrary file modifications

Blocks
HDFS has the concept of a block, but it is a much larger unit: 64 MB by default. HDFS blocks are large compared to disk blocks, and the reason is to minimize the cost of seeks. By making a block large enough, the time to transfer the data from the disk can be made significantly larger than the time to seek to the start of the block. Thus the time to transfer a large file made of multiple blocks operates at the disk transfer rate.
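One way to see blocks in practice is to ask HDFS how a particular file is stored. A small sketch, assuming a hypothetical file /user/hduser/data.txt has already been copied into HDFS:

hduser@ubuntu:~$ hadoop fsck /user/hduser/data.txt -files -blocks -locations

For a file larger than 64 MB this lists more than one block, and for each block it reports the datanodes holding its replicas.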

Failed to load Main-Class manifest attribute from HelloWorld.jar - SOLVED

When I try to run a JAR file using the command below in the command prompt,

java -jar HelloWorld.jar

I got an error like:

Failed to load Main-Class manifest attribute from HelloWorld.jar

This is due to a missing launch configuration. The Main-Class header needs to be in the manifest for the JAR file; the manifest holds metadata about things like the entry point and other required libraries. See the Sun documentation for how to create an appropriate manifest. Instead of remembering all the commands, I simply used Eclipse to export the JAR file, choosing the options below:
1. Choose the class that contains the main method.
2. Choose the destination of the JAR file.
3. Once steps one and two are done, click Finish.
Now run the same command in the command prompt:

java -jar HelloWorld.jar

This will not throw an error.
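If you prefer the command line to Eclipse, the same fix can be applied with the jar tool by supplying a manifest that names the entry point. A minimal sketch, assuming the compiled class HelloWorld.class sits in the current directory and is in the default package:

echo "Main-Class: HelloWorld" > manifest.txt
jar cfm HelloWorld.jar manifest.txt HelloWorld.class
java -jar HelloWorld.jar

The manifest text must end with a newline (echo adds one); with the Main-Class attribute present, java -jar knows which class to launch.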

Deleting files with SIZE range

The -a is an explicit AND operator that allows you to join two primaries, in this case creating a range using -size.

rm -rf `find . -size +300c -a -size -400c`;

The above command deletes files whose size is between 300 and 400 bytes (the c suffix means bytes; use k for kilobytes). Note the size is a numeric argument that can optionally be prefixed with + or -. Numeric arguments can be specified as
+n for greater than n,
-n for less than n,
 n for exactly n.
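If you are using GNU find, a safer variant is to let find remove the matches itself, which avoids problems with filenames containing spaces that the backtick substitution above can trip over. A sketch, assuming a 300 KB to 400 KB range (note the k suffix):

find . -type f -size +300k -size -400k -delete

Consecutive tests are ANDed implicitly, so the explicit -a can be omitted.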

HADOOP - Installation setup

Prerequisites
Hadoop requires a working Java 1.5+ (aka Java 5) installation. However, using Java 1.6/1.7 (aka Java 6/7) is recommended for running Hadoop. Please refer to the JDK installation instructions here.

Dedicated user for the Hadoop system
A dedicated Hadoop user helps to separate the Hadoop installation from other software applications and user accounts running on the same machine.

umasarath@ubuntu:~$ sudo addgroup hadoop
umasarath@ubuntu:~$ sudo adduser --ingroup hadoop hduser

The above two commands add the "hduser" user and the "hadoop" group.

Hadoop Installation
Download Hadoop from the Apache Download Mirrors and extract the contents of the Hadoop package to a location of your choice. The folder I chose was the hduser home folder. Extract the downloaded file in the /home/hduser folder, and make sure it is extracted while logged in as hduser.

hduser@ubuntu:~$ sudo tar xzf hadoop-1.2.1.tar.gz
hduser@ubuntu:~$ sudo mv hadoop-1.2.1 hadoop

Configuration
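Because the archive was extracted with sudo, the files end up owned by root. A typical follow-up, shown here as a sketch assuming the paths used above, is to hand ownership of the folder to hduser and point HADOOP_HOME at it:

hduser@ubuntu:~$ sudo chown -R hduser:hadoop /home/hduser/hadoop
hduser@ubuntu:~$ echo 'export HADOOP_HOME=/home/hduser/hadoop' >> ~/.bashrc
hduser@ubuntu:~$ echo 'export PATH=$PATH:$HADOOP_HOME/bin' >> ~/.bashrc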

Install JDK on Ubuntu

Installing OpenJDK from the Command Prompt
Issue the command apt-get install openjdk-7-jdk to install JDK 7. Ubuntu will download the JDK and start the installation; wait a few minutes for the download to complete.

umasarath@ubuntu:~$ sudo apt-get install openjdk-7-jdk

Verifying Java after installation
Ubuntu installs the JDK under /usr/lib/jvm/, for example /usr/lib/jvm/java-7-openjdk-amd64/. In addition, Ubuntu puts the JDK bin folder on the system path via a symbolic link, for example /usr/bin/java. To verify that the JDK is installed properly, type java -version in the command prompt.

umasarath@ubuntu:~$ java -version
java version "1.7.0_25"
OpenJDK Runtime Environment (IcedTea 2.3.10) (7u25-2.3.10-1ubuntu0.12.04.2)
OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)

Post-Installation Setup
To configure JAVA_HOME in the system path each time a terminal is started, you can append the export statement to your ~/.bashrc file.
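A minimal sketch, assuming the OpenJDK path shown above:

umasarath@ubuntu:~$ echo 'export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64' >> ~/.bashrc
umasarath@ubuntu:~$ source ~/.bashrc
umasarath@ubuntu:~$ echo $JAVA_HOME
/usr/lib/jvm/java-7-openjdk-amd64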