Hadoop Bigdata UNIT-1
Java Collections support all the common operations on data, such as searching, sorting, insertion,
manipulation, and deletion.
A Java collection is a single unit that holds a group of objects. The Java Collections Framework
provides many interfaces (Set, List, Queue, Deque, etc.) and classes (ArrayList, Vector, LinkedList,
PriorityQueue, HashSet, LinkedHashSet, TreeSet, etc.).
The Collections Framework is a unified architecture for storing and manipulating groups of objects.
It has interfaces, the classes that implement them, and algorithms.
The java.util package contains all the classes and interfaces of the Collections Framework.
Methods of Collection interface
Many methods are declared in the Collection interface. Some of them are as follows:
2 public boolean addAll(Collection c)   inserts the elements of the specified collection into the invoking collection.
6 public int size()                     returns the total number of elements in the collection.
7 public void clear()                   removes all elements from the collection.
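The three methods listed above can be tried out in a few lines. This is a minimal sketch; the class name CollectionMethodsDemo and the sample names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;

class CollectionMethodsDemo {
    // Returns the sizes observed after addAll() and after clear().
    static int[] run() {
        Collection<String> names = new ArrayList<>();
        names.add("Ravi");
        // addAll() inserts every element of the argument collection
        names.addAll(Arrays.asList("Vijay", "Ajay"));
        int sizeAfterAddAll = names.size();
        // clear() removes all elements, leaving an empty collection
        names.clear();
        int sizeAfterClear = names.size();
        return new int[] { sizeAfterAddAll, sizeAfterClear };
    }

    public static void main(String[] args) {
        int[] sizes = run();
        System.out.println("size after addAll: " + sizes[0]); // prints 3
        System.out.println("size after clear: " + sizes[1]);  // prints 0
    }
}
```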
Iterator interface
The Iterator interface provides the facility of iterating over the elements in the forward direction only.
There are only three methods in the Iterator interface. They are:
1 public boolean hasNext()   returns true if the iterator has more elements.
2 public Object next()       returns the next element and moves the cursor past it.
3 public void remove()       removes the last element returned by the iterator. It is rarely used.
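The usual way to combine these methods is a while loop that asks hasNext() before each call to next(). The sketch below is illustrative only; the class name IteratorDemo and the helper removeEvens are assumptions, not part of the notes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IteratorDemo {
    // Removes every even number from the list while iterating over it.
    static List<Integer> removeEvens(List<Integer> numbers) {
        Iterator<Integer> it = numbers.iterator();
        while (it.hasNext()) {       // is there another element?
            int n = it.next();       // return it and advance the cursor
            if (n % 2 == 0) {
                it.remove();         // remove the element just returned
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(removeEvens(numbers)); // prints [1, 3, 5]
    }
}
```

Removing through the iterator is the safe way to delete during a traversal; calling the list's own remove method inside such a loop typically fails with a ConcurrentModificationException.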
Linked List
A linked list is a data structure used for collecting a sequence of objects that allows efficient addition
and removal of elements in the middle of the sequence. To understand the need for such a data
structure, imagine a program that maintains a sequence of employee objects, sorted by the last names
of the employees. When a new employee is hired, an object needs to be inserted into the sequence.
Unless the company happened to hire employees in alphabetical order, the new object probably needs
to be inserted somewhere near the middle of the sequence. If we use an array to store the objects, then
all objects following the new hire must be moved toward the end. Conversely, if an employee leaves
the company, the object must be removed, and the hole in the sequence needs to be closed up by
moving all objects that come after it. Moving a large number of values can involve a substantial
amount of processing time. We would like to structure the data in a way that minimizes this cost.
Rather than storing the values in an array, a linked list uses a sequence of nodes. Each node stores a
value and a reference to the next node in the sequence (see Figure 1). When you insert a new node
into a linked list, only the neighbouring node references need to be updated. The same is true when
you remove a node.
The Java library supplies a ListIterator type for working with positions in a linked list. A list
iterator describes a position anywhere inside the linked list.
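To connect the list iterator with the employee scenario described earlier, the following sketch inserts a name into an alphabetically sorted linked list. The class name ListIteratorDemo, the helper insertSorted, and the sample names are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.ListIterator;

class ListIteratorDemo {
    // Inserts name into an alphabetically sorted list, keeping it sorted.
    static void insertSorted(LinkedList<String> names, String name) {
        ListIterator<String> it = names.listIterator();
        while (it.hasNext()) {
            if (it.next().compareTo(name) > 0) {
                it.previous();   // step back so the insert lands before the larger name
                break;
            }
        }
        it.add(name);            // adds at the iterator's current position
    }

    public static void main(String[] args) {
        LinkedList<String> staff = new LinkedList<>(Arrays.asList("Adams", "Clark", "Evans"));
        insertSorted(staff, "Baker");
        System.out.println(staff); // prints [Adams, Baker, Clark, Evans]
    }
}
```

Only the references around the insertion point change; no elements are shifted, which is the advantage over an array that the text describes.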
Program:
import java.util.*;
public class LinkedListDemo {
    public static void main(String args[]) {
        // create a linked list
        LinkedList<String> ll = new LinkedList<String>();
        // add elements to the linked list
        ll.add("F");
        ll.add("B");
        ll.add("D");
        ll.add("E");
        ll.add("C");
        ll.addLast("Z");
        ll.addFirst("A");
        ll.add(1, "A2");
        System.out.println("Original contents of ll: " + ll);
        // remove elements from the linked list
        ll.remove("F"); // remove by value
        ll.remove(2);   // remove by index
        System.out.println("Contents of ll after deletion: " + ll);
    }
}
b) Stacks
A stack lets you insert and remove elements at only one end, traditionally called the top of the
stack. To visualize a stack, think of a stack of books. New items can be added to the top of the
stack. Items are removed at the top of the stack as well. Therefore, they are removed in the
order that is opposite from the order in which they have been added, called last in, first out or
LIFO order. For example, if you add items A, B, and C and then remove them, you obtain C, B,
and A. Traditionally, the addition and removal operations are called push and pop.
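Program:
import java.util.*;
public class StackDemo {
    static void showpush(Stack<Integer> st, int a) {
        st.push(a);
        System.out.println("push(" + a + ")");
        System.out.println("stack: " + st);
    }
    static void showpop(Stack<Integer> st) {
        System.out.print("pop -> ");
        Integer a = st.pop();
        System.out.println(a);
        System.out.println("stack: " + st);
    }
    public static void main(String args[]) {
        Stack<Integer> st = new Stack<Integer>();
        System.out.println("stack: " + st);
        showpush(st, 42);
        showpush(st, 66);
        showpush(st, 99);
        showpop(st);
        showpop(st);
        showpop(st);
        try {
            showpop(st); // the stack is empty here, so pop() throws
        } catch (EmptyStackException e) {
            System.out.println("empty stack");
        }
    }
}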
Output:
stack: []
push(42)
stack: [42]
push(66)
stack: [42, 66]
push(99)
stack: [42, 66, 99]
pop -> 99
stack: [42, 66]
pop -> 66
stack: [42]
pop -> 42
stack: []
pop -> empty stack
c) Queues
A queue is similar to a stack, except that you add items to one end of the queue (the tail) and
remove them from the other end of the queue (the head). To visualize a queue, simply think of
people lining up (see Figure 13). People join the tail of the queue and wait until they have
reached the head of the queue. Queues store items in a first in, first out or FIFO fashion. Items
are removed in the same order in which they have been added.
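import java.util.LinkedList;
import java.util.Queue;
public class QueueExample {
    public static void main(String[] args) {
        Queue<Integer> q = new LinkedList<>();
        for (int i = 0; i < 5; i++) q.add(i); // add 0..4 at the tail
        System.out.println("Elements of queue-" + q);
        System.out.println("removed element-" + q.remove()); // remove the head
        System.out.println(q);
        System.out.println("head of queue-" + q.peek()); // view the head without removing it
        System.out.println("Size of queue-" + q.size());
    }
}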
Output:
Elements of queue-[0, 1, 2, 3, 4]
removed element-0
[1, 2, 3, 4]
head of queue-1
Size of queue-4
d) Set
A Set is a Collection that cannot contain duplicate elements. It models the mathematical set
abstraction. The Set interface contains only methods inherited from Collection and adds the
restriction that duplicate elements are prohibited. Set also adds a stronger contract on the
behavior of the equals and hashCode operations, allowing Set instances to be compared
meaningfully even if their implementation types differ.
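The two properties described above, rejection of duplicates and comparison by contents, can be checked directly. This is a minimal sketch; the class name SetEqualityDemo and the sample values are assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class SetEqualityDemo {
    // Builds a HashSet and a TreeSet from the same values and compares them.
    static boolean sameElements() {
        Set<Integer> hashSet = new HashSet<>(Arrays.asList(34, 22, 10, 22));
        Set<Integer> treeSet = new TreeSet<>(Arrays.asList(10, 22, 34));
        // equals() compares contents, not implementation classes
        return hashSet.equals(treeSet) && hashSet.hashCode() == treeSet.hashCode();
    }

    public static void main(String[] args) {
        Set<Integer> set = new HashSet<>();
        System.out.println(set.add(22));    // prints true: 22 was not present
        System.out.println(set.add(22));    // prints false: the duplicate is rejected
        System.out.println(sameElements()); // prints true
    }
}
```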
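import java.util.*;
public class SetDemo {
    public static void main(String args[]) {
        int count[] = {34, 22, 10, 60, 30, 22};
        Set<Integer> set = new HashSet<Integer>();
        // add every value; the duplicate 22 is ignored by the set
        for (int i = 0; i < count.length; i++) {
            set.add(count[i]);
        }
        System.out.println(set);
        // a TreeSet holds the same elements in sorted order
        Set<Integer> sortedSet = new TreeSet<Integer>(set);
        System.out.println(sortedSet);
    }
}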
Output:
e) Map
A map is a data type that keeps associations between keys and values. Figure 1 gives a
typical example: a map that associates names with colors. This map might describe the favorite
colors of various people.
Mathematically speaking, a map is a function from one set, the key set, to another set, the value
set. Every key in the map has a unique value, but a value may be associated with several keys.
Just as there are two kinds of set implementations, the Java library has two implementations for
maps: HashMap and TreeMap. Both of them implement the Map interface. As with sets, you
need to decide which of the two to use. As a rule of thumb, use a hash map unless you want to
visit the keys in sorted order.
After constructing a HashMap or TreeMap, you should store the reference to the map object in
a Map reference:
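As a sketch of this advice, the method below is written against the Map interface only, so the caller can pass in either implementation. The class name MapReferenceDemo, the helper fillAndListKeys, and the names and colors are assumptions for illustration; plain strings stand in for color objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapReferenceDemo {
    // Fills the given map and returns its keys in iteration order.
    static List<String> fillAndListKeys(Map<String, String> favoriteColors) {
        favoriteColors.put("Romeo", "green");
        favoriteColors.put("Adam", "red");
        favoriteColors.put("Juliet", "blue");
        favoriteColors.put("Adam", "pink"); // same key: the old value is replaced
        return new ArrayList<>(favoriteColors.keySet());
    }

    public static void main(String[] args) {
        // The variable type is Map; only the constructor call names the implementation.
        Map<String, String> sorted = new TreeMap<>();
        System.out.println(fillAndListKeys(sorted)); // prints [Adam, Juliet, Romeo]
        System.out.println(sorted.get("Adam"));      // prints pink

        Map<String, String> hashed = new HashMap<>();
        fillAndListKeys(hashed);                     // key order is unspecified for a HashMap
        System.out.println(hashed.size());           // prints 3
    }
}
```

Switching between the two implementations then means changing a single constructor call.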
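Program:
import java.awt.Color;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
public class MapDemo
{
   public static void main(String[] args)
   {
      Map<String, Color> favoriteColors = new HashMap<String, Color>();
      favoriteColors.put("Ram", Color.GREEN); // the one entry visible in the output
      Set<String> keySet = favoriteColors.keySet();
      for (String key : keySet)
         System.out.println(key + " : " + favoriteColors.get(key));
   }
}
Output:
Ram : java.awt.Color[r=0,g=255,b=0]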
Generic class
A class that can refer to any type is known as a generic class. Here, we use the type parameter T to
create a generic class that works with a specific type.
Let’s look at a simple example that creates and uses a generic class.
class MyGen<T>{
    T obj;
    void add(T obj){this.obj=obj;}
    T get(){return obj;}
}
The type parameter T indicates that the class can refer to any type (such as String, Integer, or
Employee). The type you specify for the class is used to store and retrieve the data.
class TestGenerics3{
    public static void main(String args[]){
        MyGen<Integer> m=new MyGen<Integer>();
        m.add(2);
        System.out.println(m.get());
    }
}
Output: 2
Type Parameters
The naming conventions for type parameters are important for learning generics thoroughly. The
commonly used type parameters are as follows:
1. T - Type
2. E - Element
3. K - Key
4. N - Number
5. V - Value
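To show the K, V, and T conventions in use, here is a small sketch. The names PairDemo, Pair, and last are hypothetical and exist only for this illustration.

```java
class PairDemo {
    // K and V follow the Key and Value naming convention.
    static class Pair<K, V> {
        private final K key;
        private final V value;

        Pair(K key, V value) {
            this.key = key;
            this.value = value;
        }

        K getKey() { return key; }
        V getValue() { return value; }
    }

    // T is the general-purpose name; here it is the element type of the array.
    static <T> T last(T[] items) {
        return items[items.length - 1];
    }

    public static void main(String[] args) {
        Pair<String, Integer> age = new Pair<>("Ravi", 21);
        String name = age.getKey();   // no cast needed
        int years = age.getValue();   // unboxed from Integer
        System.out.println(name + " " + years);             // prints Ravi 21
        System.out.println(last(new String[] {"a", "b"}));  // prints b
    }
}
```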
Generic Method
Like a generic class, we can create a generic method that accepts any type of argument.
Let’s see a simple example of a Java generic method that prints array elements. Here we use E to
denote the element type.
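public static <E> void printArray(E[] elements) {
    for (E element : elements) {
        System.out.println(element);
    }
    System.out.println();
}
// calls from main, where intArray is an Integer[] and charArray is a Character[]
printArray(intArray);
printArray(charArray);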
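The printArray listing above is shown only in part, so the following self-contained sketch wraps the same idea in a runnable class. The class name GenericMethodDemo, the extra helper firstOf, and the array contents are assumptions, not the original values.

```java
class GenericMethodDemo {
    // <E> before the return type declares the method's own type parameter.
    static <E> void printArray(E[] elements) {
        for (E element : elements) {
            System.out.println(element);
        }
        System.out.println();
    }

    // A generic method can also return its type parameter.
    static <E> E firstOf(E[] elements) {
        return elements[0];
    }

    public static void main(String[] args) {
        Integer[] intArray = { 10, 20, 30 };            // assumed sample values
        Character[] charArray = { 'J', 'A', 'V', 'A' }; // assumed sample values
        printArray(intArray);
        printArray(charArray);
        Integer first = firstOf(intArray);
        System.out.println(first); // prints 10
    }
}
```

The arrays use the wrapper types Integer and Character because a type parameter cannot stand for a primitive type such as int or char.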
The ? (question mark) symbol represents the wildcard element. It means any type. If we write
<? extends Number>, it means any child class of Number, e.g. Integer, Float, Double, etc. Now we can
call the methods of the Number class through any child class object.
import java.util.*;
abstract class Shape{
abstract void draw();
}
class Rectangle extends Shape{
void draw(){System.out.println("drawing rectangle");}
}
class Circle extends Shape{
void draw(){System.out.println("drawing circle");}
}
class GenericTest{
//creating a method that accepts only child class of Shape
public static void drawShapes(List<? extends Shape> lists){
for(Shape s:lists){
s.draw();//calling method of Shape class by child class instance
}
}
public static void main(String args[]){
List<Rectangle> list1=new ArrayList<Rectangle>();
list1.add(new Rectangle());
List<Circle> list2=new ArrayList<Circle>();
list2.add(new Circle());
list2.add(new Circle());
drawShapes(list1);
drawShapes(list2);
}}
Output:
drawing rectangle
drawing circle
drawing circle
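The same bounded wildcard works with the <? extends Number> case mentioned in the text. The sketch below is illustrative; the class name WildcardSumDemo and the method sum are assumptions.

```java
import java.util.Arrays;
import java.util.List;

class WildcardSumDemo {
    // Accepts a list of Number or of any subclass of Number.
    static double sum(List<? extends Number> numbers) {
        double total = 0;
        for (Number n : numbers) {
            total += n.doubleValue(); // a Number method, available for every element
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        List<Double> doubles = Arrays.asList(1.5, 2.5);
        System.out.println(sum(ints));    // prints 6.0
        System.out.println(sum(doubles)); // prints 4.0
    }
}
```

Without the wildcard, a parameter of type List<Number> would reject both List<Integer> and List<Double>, because generic types are not covariant.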
A wrapper class in Java provides the mechanism to convert a primitive into an object and an object
into a primitive.
Since J2SE 5.0, the autoboxing and unboxing features convert primitives into objects and objects
into primitives automatically. The automatic conversion of a primitive into an object is known as
autoboxing, and the reverse conversion is known as unboxing.
Eight classes of the java.lang package are known as wrapper classes in Java. The list of the eight
wrapper classes is given below:
Primitive Type    Wrapper class
boolean           Boolean
char              Character
byte              Byte
short             Short
int               Integer
long              Long
float             Float
double            Double
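Wrapper class Example: Primitive to Wrapper
public class WrapperExample1 {
    public static void main(String args[]) {
        int a = 20;
        Integer i = Integer.valueOf(a); // converting int into Integer explicitly
        Integer j = a;                  // autoboxing: the compiler inserts Integer.valueOf(a)
        System.out.println(a + " " + i + " " + j);
    }
}
Output:
20 20 20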
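Wrapper class Example: Wrapper to Primitive
public class WrapperExample2 {
    public static void main(String args[]) {
        Integer a = Integer.valueOf(3);
        int i = a.intValue(); // converting Integer into int explicitly
        int j = a;            // unboxing: the compiler inserts a.intValue()
        System.out.println(a + "" + i + "" + j);
    }
}
Output:
333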
Serialization in Java
Serialization in Java is a mechanism for writing the state of an object into a byte stream.
java.io.Serializable interface
Serializable is a marker interface (it has no data members and no methods). It is used to "mark" Java
classes so that objects of these classes gain a certain capability. Cloneable and Remote are also
marker interfaces.
The String class and all the wrapper classes implement the java.io.Serializable interface by default.
import java.io.Serializable;
public class Student implements Serializable {
    int id;
    String name;
    public Student(int id, String name) {
        this.id = id;
        this.name = name;
    }
}
In the above example, the Student class implements the Serializable interface. Now its objects can be
converted into a byte stream.
ObjectOutputStream class
The ObjectOutputStream class is used to write primitive data types and Java objects to an
OutputStream. Only objects that support the java.io.Serializable interface can be written to streams.
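Constructor
1) public ObjectOutputStream(OutputStream out) throws IOException   creates an ObjectOutputStream that writes to the specified OutputStream.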
Important Methods
Method                                                            Description
1) public final void writeObject(Object obj) throws IOException   writes the specified object to the ObjectOutputStream.
2) public void flush() throws IOException                         flushes the current output stream.
3) public void close() throws IOException                         closes the current output stream.
In this example, we are going to serialize the object of Student class. The writeObject() method of
ObjectOutputStream class provides the functionality to serialize the object. We are saving the state of
the object in the file named f.txt.
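import java.io.*;
class Persist {
    public static void main(String args[]) throws Exception {
        Student s1 = new Student(211, "ravi");
        // write the object's state to the file f.txt
        FileOutputStream fout = new FileOutputStream("f.txt");
        ObjectOutputStream out = new ObjectOutputStream(fout);
        out.writeObject(s1);
        out.flush();
        out.close();
        System.out.println("success");
    }
}
Output:
success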
Deserialization in Java
Deserialization is the process of reconstructing an object from its serialized state. It is the reverse
operation of serialization.
ObjectInputStream class
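Constructor
1) public ObjectInputStream(InputStream in) throws IOException   creates an ObjectInputStream that reads from the specified InputStream.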
Important Methods
Method                                                                           Description
1) public final Object readObject() throws IOException, ClassNotFoundException   reads an object from the input stream.
2) public void close() throws IOException                                        closes ObjectInputStream.
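import java.io.*;
class Depersist {
    public static void main(String args[]) throws Exception {
        // read the object back from the file f.txt
        ObjectInputStream in = new ObjectInputStream(new FileInputStream("f.txt"));
        Student s = (Student) in.readObject();
        System.out.println(s.id + " " + s.name);
        in.close();
    }
}
Output:
211 ravi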
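Serialization and deserialization can also be exercised together without touching the file system. The sketch below is one possible way to do this, under the assumption that an in-memory byte array is an acceptable substitute for the file f.txt; the class name RoundTripDemo and the helper roundTrip are hypothetical, and Student is a local copy nested inside the demo so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class RoundTripDemo {
    // Local copy of a Student-like class, nested to keep the sketch self-contained.
    static class Student implements Serializable {
        private static final long serialVersionUID = 1L;
        int id;
        String name;

        Student(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Serializes the object to a byte array and reads it back as a new object.
    static Student roundTrip(Student original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Student) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Student copy = roundTrip(new Student(211, "ravi"));
        System.out.println(copy.id + " " + copy.name); // prints 211 ravi
    }
}
```

The try-with-resources blocks close the streams automatically, which replaces the explicit flush() and close() calls.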