Unit-2: Hadoop and Python
Apache Hadoop is an open source framework for distributed batch processing of big data.
Hadoop Ecosystem includes:
• Hadoop MapReduce
• HDFS
• YARN
• HBase
• Zookeeper
• Pig
• Hive
• Mahout
• Chukwa
• Cassandra
• Avro
• Oozie
• Flume
• Sqoop
APACHE HADOOP
A Hadoop cluster comprises a Master node, a backup node and a number of slave nodes. The master
node runs the NameNode and JobTracker processes and the slave nodes run the DataNode and
TaskTracker components of Hadoop. The backup node runs the Secondary NameNode process.
NameNode: NameNode keeps the directory tree of all files in the file system, and tracks where across
the cluster the file data is kept. It does not store the data of these files itself. Client applications talk to
the NameNode whenever they wish to locate a file, or when they want to add/copy/move/delete a file.
Secondary NameNode: NameNode is a Single Point of Failure for the HDFS Cluster. An optional
Secondary NameNode which is hosted on a separate machine creates checkpoints of the namespace.
JobTracker: The JobTracker is the service within Hadoop that distributes MapReduce tasks to specific
nodes in the cluster, ideally the nodes that have the data, or at least are in the same rack.
TaskTracker: TaskTracker is a node in a Hadoop cluster that accepts Map, Reduce and Shuffle tasks
from the JobTracker. Each TaskTracker has a defined number of slots which indicate the number of
tasks that it can accept.
DataNode: A DataNode stores data in an HDFS file system. A functional HDFS filesystem has more than
one DataNode, with data replicated across them. DataNodes respond to requests from the NameNode
for filesystem operations. Client applications can talk directly to a DataNode, once the NameNode has
provided the location of the data. Similarly, MapReduce operations assigned to TaskTracker instances
near a DataNode, talk directly to the DataNode to access the files. TaskTracker instances can be
deployed on the same servers that host DataNode instances, so that MapReduce operations are
performed close to the data.
MAPREDUCE
Fig-2.3: MapReduce
MapReduce Job Execution Workflow
MapReduce job execution starts when the client applications submit jobs to the Job tracker. The
JobTracker returns a JobID to the client application. The JobTracker talks to the NameNode to determine
the location of the data. The JobTracker locates TaskTracker nodes with available slots at/or near the
data.
The TaskTrackers send out heartbeat messages to the JobTracker, usually every few minutes, to
reassure the JobTracker that they are still alive. These messages also inform the JobTracker of the
number of available slots, so the JobTracker can stay up to date with where in the cluster new work can
be delegated.
The JobTracker submits the work to the TaskTracker nodes when they poll for tasks. To choose a task
for a TaskTracker, the JobTracker uses various scheduling algorithms (default is FIFO). The TaskTracker
nodes are monitored using the heartbeat signals that are sent by the TaskTrackers to JobTracker.
The TaskTracker spawns a separate JVM process for each task so that any task failure does not bring
down the TaskTracker. The TaskTracker monitors these spawned processes while capturing the output
and exit codes. When the process finishes, successfully or not, the TaskTracker notifies the JobTracker.
When the job is completed, the JobTracker updates its status.
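Since this unit pairs Hadoop with Python, below is a minimal, illustrative word-count sketch for Hadoop Streaming, which lets any executable that reads from stdin and writes to stdout act as a Map or Reduce task. The file names mapper.py and reducer.py are hypothetical; Hadoop Streaming sorts the mapper output by key before it reaches the reducer.
#mapper.py - emit a (word, 1) pair for each word in the input
import sys

for line in sys.stdin:
    for word in line.strip().split():
        print "%s\t%d" % (word, 1)

#reducer.py - sum the counts for each word (input arrives sorted by key)
import sys

current_word = None
current_count = 0
for line in sys.stdin:
    word, count = line.strip().split('\t')
    if word == current_word:
        current_count += int(count)
    else:
        if current_word is not None:
            print "%s\t%d" % (current_word, current_count)
        current_word = word
        current_count = int(count)
if current_word is not None:
    print "%s\t%d" % (current_word, current_count)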
HADOOP SCHEDULERS
The Hadoop scheduler is a pluggable component, which makes it possible to support different scheduling
algorithms. The default scheduler in Hadoop is FIFO.
Two advanced schedulers are also available - the Fair Scheduler, developed at Facebook, and the
Capacity Scheduler, developed at Yahoo.
The pluggable scheduler framework provides the flexibility to support a variety of workloads with varying
priority and performance constraints.
Efficient job scheduling makes Hadoop a multi-tasking system that can process multiple data sets for
multiple jobs for multiple users simultaneously.
FIFO SCHEDULER
FIFO is the default scheduler in Hadoop. It maintains a work queue in which the jobs are queued, and
pulls jobs in a first-in, first-out manner (oldest job first) for scheduling. There is no concept of
priority or size of job in the FIFO scheduler.
FAIR SCHEDULER
The Fair Scheduler allocates resources evenly between multiple jobs and also provides capacity
guarantees. Fair Scheduler assigns resources to jobs such that each job gets an equal share of the
available resources on average over time. Task slots that are free are assigned to new jobs, so that
each job gets roughly the same amount of CPU time.
Job Pools: The Fair Scheduler maintains a set of pools into which jobs are placed. Each pool has a
guaranteed capacity. When there is a single job running, all the resources are assigned to that job. When
there are multiple jobs in the pools, each pool gets at least as many task slots as guaranteed. Each pool
receives at least the minimum share. When a pool does not require the guaranteed share the excess
capacity is split between other jobs.
Fairness: The scheduler computes periodically the difference between the computing time received by
each job and the time it should have received in ideal scheduling. The job which has the highest deficit
of the compute time received is scheduled next.
CAPACITY SCHEDULER
The Capacity Scheduler has similar functionality as the Fair Scheduler but adopts a different scheduling
philosophy.
Queues: In Capacity Scheduler, you define a number of named queues each with a configurable number
of map and reduce slots. Each queue is also assigned a guaranteed capacity. The Capacity Scheduler
gives each queue its capacity when it contains jobs, and shares any unused capacity between the
queues. Within each queue FIFO scheduling with priority is used.
Fairness: For fairness, it is possible to place a limit on the percentage of running tasks per user, so that
users share a cluster equally. A wait time for each queue can be configured. When a queue is not
scheduled for more than the wait time, it can preempt tasks of other queues to get its fair share.
Note:
Watch this YouTube video: https://www.youtube.com/live/JrM-EcSQ2YE?feature=shared
CLOUD APPLICATION DESIGN CONSIDERATIONS
Scalability: Scalability is an important factor that drives application designers to move to cloud
computing environments. Building applications that can serve millions of users without taking a hit on
their performance has always been challenging. With the growth of cloud computing, application
designers can provision adequate resources to meet their workload levels.
Reliability & Availability: Reliability of a system is defined as the probability that a system will perform
the intended functions under stated conditions for a specified amount of time. Availability is the probability
that a system will perform a specified function under given conditions at a prescribed time.
Security: Security is an important design consideration for cloud applications given the outsourced
nature of cloud computing environments.
Maintenance & Upgradation: To achieve a rapid time-to-market, businesses typically launch their
applications with a core set of features ready and then incrementally add new features as and when they
are complete. In such scenarios, it is important to design applications with low maintenance and
upgradation costs.
Performance: Applications should be designed while keeping the performance requirements in mind.
CLOUD APPLICATION REFERENCE ARCHITECTURES
Load Balancing Tier: The load balancing tier consists of one or more load balancers.
Application Tier: For this tier, it is recommended to configure auto scaling. Auto scaling can be triggered
when the recorded values for any of the specified metrics, such as CPU usage, memory usage, etc.,
go above defined thresholds.
Database Tier: The database tier includes a master database instance and multiple slave instances.
The master node serves all the write requests and the read requests are served from the slave nodes.
This improves the throughput for the database tier since most applications have a higher number of read
requests than write requests.
Figure-2.7 shows a typical deployment architecture for content delivery applications such as online
photo albums, video webcasting, etc. Both relational and non-relational data stores are shown in this
deployment. A content delivery network (CDN) which consists of a global network of edge locations is
used for media delivery. CDN is used to speed up the delivery of static content such as images and
videos.
SERVICE ORIENTED ARCHITECTURE (SOA)
Service Oriented Architecture (SOA) is a well-established architectural approach for designing and
developing applications in the form of services that can be shared and reused. SOA is a collection of
discrete software modules or services that form a part of an application and collectively provide the
functionality of an application.
SOA services are developed as loosely coupled modules with no hard-wired calls embedded in the
services. The services communicate with each other by passing messages. Services are described using
the Web Services Description Language (WSDL). WSDL is an XML-based web services description
language that is used to create service descriptions containing information on the functions performed
by a service and the inputs and outputs of the service.
Service Components: The service components allow the layers above to interact with the business
systems. The service components are responsible for realizing the functionality of the services
exposed.
Composite Services: These are coarse-grained services which are composed of two or more service
components. Composite services can be used to create enterprise scale components or business-unit
specific components.
Orchestrated Business Processes: Composite services can be orchestrated to create higher level
business processes. In this layer, the compositions and orchestrations of the composite services are
defined to create business processes.
Presentation Services: This is the topmost layer that includes user interfaces that expose the
services and the orchestrated business processes to the users.
Enterprise Service Bus: This layer integrates the services through adapters, routing, transformation
and messaging mechanisms.
CLOUD COMPONENT MODEL (CCM)
Cloud Component Model (CCM) is an application design methodology that provides a flexible way of
creating cloud applications in a rapid, convenient and platform-independent manner. CCM is an
architectural approach for cloud applications that is not tied to any specific programming language or
cloud platform.
Cloud applications designed with CCM approach can have innovative hybrid deployments in which
different components of an application can be deployed on cloud infrastructure and platforms of different
cloud vendors.
Applications designed using CCM have better portability and interoperability. CCM based applications
have better scalability by decoupling application components and providing asynchronous
communication mechanisms.
SOA Vs CCM
SOA CCM
SOA advocates principles of reuse
CCM is based on reusable
and well-defined relationship between
Standardization & Re-use components which can be used by
service provider and service
multiple cloud applications.
consumer.
CCM is based on loosely coupled
SOA is based on loosely coupled
Loose coupling components that communicate
services that minimize dependencies.
asynchronously
SOA services minimize resource CCM components are stateless.
Statelessness consumption by deferring the State is stored outside of the
management of state information. components.
SOA CCM
SOA services have small and well- CCM components have very large number of
End points defined set of endpoints through which endpoints. There is an endpoint for each
many types of data can pass. resource in a component, identified by a URI.
SOA uses a messaging layer above
CCM components use HTTP and REST for
Messaging HTTP by using SOAP which provide
messaging.
prohibitive constraints to developers.
Uses WS-Security , SAML and other
Security CCM components use HTTPS for security.
standards for security
CCM allows resources in components
Interfacing SOA uses XML for interfacing. represent different formats for interfacing
(HTML, XML, JSON, etc.).
CCM components and the underlying
Consuming traditional SOA services in a component resources are exposed as XML,
Consumption
browser is cumbersome. JSON (and other formats) over HTTP or
REST, thus easy to consume in the browser.
MODEL VIEW CONTROLLER (MVC)
Model View Controller (MVC) is a popular software design pattern for web applications.
Model: Model manages the data and the behavior of the applications. Model processes events sent by
the controller. Model has no information about the views and controllers. Model responds to the requests
for information about its state (from the view) and responds to the instructions to change state (from
controller).
View: View prepares the interface which is shown to the user. Users interact with the application through
views. Views present the information that the model or controller tells the view to present to the user,
and also handle user requests and send them to the controller.
Controller: Controller glues the model to the view. Controller processes user requests and updates the
model when the user manipulates the view. Controller also updates the view when the model changes.
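As an illustration only (the class and method names below are hypothetical), a minimal sketch of the three MVC roles in Python:
class Model:
    # Manages the data; knows nothing about views or controllers
    def __init__(self):
        self.items = []
    def add_item(self, item):
        self.items.append(item)

class View:
    # Prepares the interface shown to the user
    def render(self, items):
        for item in items:
            print item

class Controller:
    # Glues the model to the view
    def __init__(self, model, view):
        self.model = model
        self.view = view
    def handle_request(self, item):
        self.model.add_item(item)           # update the model
        self.view.render(self.model.items)  # refresh the view

#Usage
>>>controller = Controller(Model(), View())
>>>controller.handle_request('apple')
apple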
RESTFUL WEB SERVICES
A RESTful web service is a web API implemented using HTTP and REST principles. The REST
architectural constraints are as follows; a small illustrative sketch appears after the list:
• Client-Server
• Stateless
• Cacheable
• Layered System
• Uniform Interface
• Code on demand
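As a minimal illustrative sketch (one of many possible implementations), the following uses Python's built-in BaseHTTPServer module to expose a hypothetical /fruits resource as JSON over HTTP:
import json
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

FRUITS = ['apple', 'orange', 'banana']

class RESTHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Uniform interface: each resource is identified by a URI
        if self.path == '/fruits':
            self.send_response(200)
            self.send_header('Content-Type', 'application/json')
            self.end_headers()
            self.wfile.write(json.dumps(FRUITS))
        else:
            self.send_error(404)

# Stateless: each request is handled independently of any other
HTTPServer(('', 8080), RESTHandler).serve_forever()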
RELATIONAL DATABASES
A relational database is a database that conforms to the relational model that was popularized by Edgar
Codd in 1970.
The 12 rules that Codd introduced for relational databases include:
1. Information rule
2. Guaranteed access rule
3. Systematic treatment of null values
4. Dynamic online catalog based on relational model
5. Comprehensive sub-language rule
6. View updating rule
7. High level insert, update, delete
8. Physical data independence
9. Logical data independence
10. Integrity independence
11. Distribution independence
12. Non-subversion rule
Relations: A relational database has a collection of relations (or tables). A relation is a set of tuples (or
rows).
Schema: Each relation has a fixed schema that defines the set of attributes (or columns in a table) and
the constraints on the attributes.
Tuples: Each tuple in a relation has the same attributes (columns). The tuples in a relation can have any
order and the relation is not sensitive to the ordering of the tuples.
Attributes: Each attribute has a domain, which is the set of possible values for the attribute.
Insert/Update/Delete: Relations can be modified using insert, update and delete operations. Every
relation has a primary key that uniquely identifies each tuple in the relation.
Primary Key: An attribute can be made a primary key if it does not have repeated values in different
tuples.
ACID GUARANTEES
Relational databases provide ACID guarantees.
Atomicity: Atomicity property ensures that each transaction is either “all or nothing”. An atomic
transaction ensures that all parts of the transaction complete or the database state is left unchanged.
Consistency: Consistency property ensures that each transaction brings the database from one valid
state to another. In other words, the data in a database always conforms to the defined schema and
constraints.
Isolation: Isolation property ensures that the database state obtained after a set of concurrent
transactions is the same as would have been if the transactions were executed serially. This provides
concurrency control, i.e. the results of incomplete transactions are not visible to other transactions. The
transactions are isolated from each other until they finish.
Durability: Durability property ensures that once a transaction is committed, the data remains as it is,
i.e. it is not affected by system outages such as power loss. Durability guarantees that the database can
keep track of changes and can recover from abnormal terminations.
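A small sketch of atomicity and durability using Python's built-in sqlite3 module (the database file and table below are hypothetical):
import sqlite3

conn = sqlite3.connect('bank.db')
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
cur.execute("INSERT INTO accounts (balance) VALUES (100)")
conn.commit()   # durability: the committed row survives restarts

try:
    # Atomicity: the debit below must not persist if the transaction fails
    cur.execute("UPDATE accounts SET balance = balance - 50 WHERE id = 1")
    raise Exception("simulated failure before the matching credit")
except Exception:
    conn.rollback()   # the database state is left unchanged

print cur.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()
# prints (100,) - the incomplete transaction had no effect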
NON-RELATIONAL DATABASES
Non-relational databases (popularly called NoSQL databases) are becoming popular with the growth
of cloud computing. Non-relational databases have better horizontal scaling capability and improved
performance for big data at the cost of less rigorous consistency models.
Unlike relational databases, non-relational databases do not provide ACID guarantees. Most non-
relational databases offer “eventual” consistency, which means that given a sufficiently long period of
time over which no updates are made, all updates can be expected to propagate eventually through the
system and the replicas will be consistent.
The driving force behind the non-relational databases is the need for databases that can achieve high
scalability, fault tolerance and availability. These databases can be distributed on a large cluster of
machines. Fault tolerance is provided by storing multiple replicas of data on different machines.
Types of Non-relational Databases:
Key-value store: Key-value store databases are suited for applications that require storing unstructured
data without a fixed schema. Most key-value stores have support for native programming language data
types.
Document store: Document store databases store semi-structured data in the form of documents which
are encoded in different standards such as JSON, XML, BSON, YAML, etc.
Graph store: Graph stores are designed for storing data that has graph structure (nodes and edges).
These solutions are suitable for applications that involve graph data such as social networks,
transportation systems, etc.
Object store: Object store solutions are designed for storing data in the form of objects defined in an
object-oriented programming language.
PYTHON
Python is a general-purpose, high-level programming language that is well suited for building cloud
applications, and it provides a solid foundation for the topics covered in this unit.
The main characteristics of Python are:
Multi-paradigm programming language: Python supports more than one programming paradigm,
including object-oriented programming and structured programming.
Interpreted Language: Python is an interpreted language and does not require an explicit compilation
step. The Python interpreter executes the program source code directly, statement by statement.
Interactive Language: Python provides an interactive mode in which the user can submit commands
at the Python prompt and interact with the interpreter directly.
PYTHON – BENEFITS
Easy-to-learn, read and maintain: Python is a minimalistic language with relatively few keywords,
uses English keywords and has fewer syntactical constructions as compared to other languages.
Reading Python programs feels like English with pseudo-code like constructs. Python is easy to learn
yet an extremely powerful language for a wide range of applications.
Object and Procedure Oriented: Python supports both procedure-oriented programming and object-
oriented programming. The procedure-oriented paradigm allows programs to be written around procedures
or functions that allow reuse of code. The object-oriented paradigm allows programs to be written
around objects that include both data and functionality.
Extendable: Python is an extendable language and allows integration of low-level modules written in
languages such as C/C++. This is useful when you want to speed up a critical portion of a program.
Scalable: Due to the minimalistic nature of Python, it provides a manageable structure for large
programs.
Portable: Since Python is an interpreted language, programmers do not have to worry about
compilation, linking and loading of programs. Python programs can be directly executed from the source code.
Broad Library Support: Python has broad library support and works on various platforms such as
Windows, Linux, Mac, etc.
NUMBERS IN PYTHON
The number data type is used to store numeric values. Numbers are immutable data types; therefore,
changing the value of a number data type results in a newly allocated object.
#Integer
>>>a=5
>>>type(a)
<type 'int'>

#Floating Point
>>>b=2.5
>>>type(b)
<type 'float'>

#Long
>>>x=9898878787676L
>>>type(x)
<type 'long'>

#Complex
>>>y=2+5j
>>>y
(2+5j)
>>>type(y)
<type 'complex'>
>>>y.real
2.0
>>>y.imag
5.0

#Addition
>>>c=a+b
>>>c
7.5
>>>type(c)
<type 'float'>

#Subtraction
>>>d=a-b
>>>d
2.5
>>>type(d)
<type 'float'>

#Multiplication
>>>e=a*b
>>>e
12.5
>>>type(e)
<type 'float'>

#Division
>>>f=b/a
>>>f
0.5
>>>type(f)
<type 'float'>

#Power
>>>g=a**2
>>>g
25
STRINGS IN PYTHON
A string is simply a list of characters in order. There are no limits to the number of characters you can
have in a string.
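A short illustrative sketch of common string operations in the same interactive style:
#Create a string
>>>s="Hello World!"
>>>type(s)
<type 'str'>
#Get length of string
>>>len(s)
12
#Slicing
>>>s[0:5]
'Hello'
#Concatenation
>>>s+" Bye"
'Hello World! Bye'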
LISTS IN PYTHON
A list is a compound data type used to group together other values. List items need not all have the same
type. A list contains items separated by commas and enclosed within square brackets.
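A short illustrative sketch of list operations:
#Create a List
>>>fruits=['apple','orange','banana']
>>>fruits
['apple', 'orange', 'banana']
#Items can have different types
>>>mixed=[1,'apple',2.5]
#Access by index
>>>fruits[0]
'apple'
#Append an item
>>>fruits.append('mango')
>>>len(fruits)
4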
TUPLES IN PYTHON
A tuple is a sequence data type that is similar to the list. A tuple consists of a number of values separated
by commas and enclosed within parentheses. Unlike lists, the elements of tuples cannot be changed, so
tuples can be thought of as read-only lists.
#Create a Tuple
>>>fruits=("apple","mango","banana","pineapple")
>>>fruits
(’apple’, ’mango’, ’banana’, ’pineapple’)
>>>type(fruits)
<type ’tuple’>
#Get length of tuple
>>>len(fruits)
4
DICTIONARIES
Dictionary is a mapping data type or a kind of hash table that maps keys to values. Keys in a dictionary
can be of any data type, though numbers and strings are commonly used for keys. Values in a dictionary
can be any data type or object.
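A short illustrative sketch of dictionary operations:
#Create a Dictionary
>>>student={'name':'Alice','id':1}
>>>type(student)
<type 'dict'>
#Access a value by key
>>>student['name']
'Alice'
#Add a new key-value pair
>>>student['grade']='A'
>>>student['grade']
'A'
#Check whether a key exists
>>>'id' in student
True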
TYPE CONVERSIONS
Python provides built-in functions for converting values from one type to another, as shown below.
#Convert to string
>>>a=10000
>>>str(a)
’10000’
#Convert to int
>>>b="2013"
>>>int(b)
2013
#Convert to float
>>>float(b)
2013.0
#Convert to long
>>>long(b)
2013L
#Convert to list
>>>s="aeiou"
>>>list(s)
[’a’, ’e’, ’i’, ’o’, ’u’]
#Convert to set
>>>x=[’mango’,’apple’,’banana’,’mango’,’banana’]
>>>set(x)
set([’mango’, ’apple’, ’banana’])
CONTROL FLOW
The for statement in Python iterates over items of any sequence (list, string, etc.) in the order in which
they appear in the sequence.
This behavior is different from the for statement in other languages such as C, in which initialization,
incrementing and stopping criteria are provided.
The while statement in Python executes the statements within the while loop as long as the while
condition is true.
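For illustration, minimal sketches of the for and while statements:
#for statement
>>>for i in range(3):
    print i
0
1
2
#while statement
>>>i=0
>>>while i<3:
    print i
    i=i+1
0
1
2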
The break and continue statements in Python are similar to the statements in C.
break Statement
The break statement breaks out of the for/while loop.
continue Statement
The continue statement continues with the next iteration of the loop. In the example below, continue
skips over "banana" and the loop proceeds with the remaining items:
>>>fruits=['apple','orange','banana','mango']
>>>for item in fruits:
    if item == "banana":
        continue
    print item
apple
orange
mango
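A similar sketch for break, which stops the loop entirely at the first match:
>>>for item in fruits:
    if item == "banana":
        break
    print item
apple
orange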
FUNCTIONS
A function is a block of code that takes information in (in the form of parameters), does some
computation, and returns a new piece of information based on the parameter information. A function in
Python is a block of code that begins with the keyword def followed by the function name and
parentheses. The function parameters are enclosed within the parentheses.
The code block within a function begins after a colon that comes after the parentheses enclosing the
parameters. The first statement of the function body can optionally be a documentation string or
docstring.
Functions can have default values for their parameters. If a function with default values is called with
fewer parameters or without any parameter, the default values of the parameters are used.
>>>def displayFruits(fruits=['apple','orange']):
    print "There are %d fruits in the list" % (len(fruits))
    for item in fruits:
        print item

#Using default arguments
>>>displayFruits()
There are 2 fruits in the list
apple
orange

>>>fruits = ['banana', 'pear', 'mango']
>>>displayFruits(fruits)
There are 3 fruits in the list
banana
pear
mango
In the example below, the function modifies the list passed to it; since lists are mutable, the change is
visible to the caller.
>>>def displayFruits(fruits):
    print "There are %d fruits in the list" % (len(fruits))
    for item in fruits:
        print item
    print "Adding one more fruit"
    fruits.append('mango')

>>>fruits = ['banana', 'pear', 'apple']
>>>displayFruits(fruits)
There are 3 fruits in the list
banana
pear
apple
Adding one more fruit

#The caller's list now has four items
>>>print "There are %d fruits in the list" % (len(fruits))
There are 4 fruits in the list
MODULES
Python allows organizing the program code into different modules which improves the code readability
and management.
A module is a Python file that defines some functionality in the form of functions or classes. Modules
can be imported using the import keyword. Modules to be imported must be present in the search path.
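As an illustration, a minimal sketch (the module name fruitutils.py is hypothetical):
#fruitutils.py - a simple module defining one function
def displayFruits(fruits):
    for item in fruits:
        print item

#In another program, in the same directory:
>>>import fruitutils
>>>fruitutils.displayFruits(['apple','orange'])
apple
orange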
PACKAGES
A Python package is a hierarchical file structure that consists of modules and subpackages. Packages
allow better organization of modules related to a single application environment.
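An illustrative (hypothetical) package layout; the __init__.py files mark the directories as packages:
myapp/
    __init__.py
    utils/
        __init__.py
        helpers.py

#Importing a module from the package
>>>from myapp.utils import helpers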
FILE HANDLING
Python allows reading and writing to files using the file object. The open(filename, mode) function is
used to get a file object. The mode can be read (r), write (w), append (a), read and write (r+ or w+),
read-binary (rb), write-binary (wb), etc.
After the file contents have been read, the close function is called, which closes the file object.
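A minimal sketch of writing and then reading a file (the file name data.txt is illustrative):
#Write to a file
>>>fp = open('data.txt','w')
>>>fp.write('Hello World!')
>>>fp.close()
#Read the file contents
>>>fp = open('data.txt','r')
>>>content = fp.read()
>>>print content
Hello World!
>>>fp.close()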
DATE/TIME OPERATIONS
Python provides several functions for date and time access and conversions. The datetime module allows
manipulating date and time in several ways. The time module in Python provides various time-related
functions.
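Short illustrative sketches of both modules (the printed values will vary with the current time):
>>>import time
>>>time.time()      #seconds since the epoch; output varies
>>>from datetime import datetime
>>>now = datetime.now()
>>>print now.strftime('%Y-%m-%d %H:%M')   #e.g. 2013-06-01 10:30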
CLASSES
Python is an Object-Oriented Programming (OOP) language. Python provides all the standard features
of Object-Oriented Programming such as classes, class variables, class methods, inheritance, function
overloading, and operator overloading.
Class: A class is a user-defined prototype for an object, composed of three things: a name, attributes,
and operations/methods.
Instance/Object: Object is an instance of the data structure defined by a class.
Inheritance: Inheritance is the process of forming a new class from an existing class or base class.
Function overloading: Function overloading is a form of polymorphism that allows a function to have
different meanings, depending on its context.
Operator overloading: Operator overloading is a form of polymorphism that allows assignment of more
than one function to a particular operator.
Function overriding: Function overriding allows a child class to provide a specific implementation of a
function that is already provided by the base class. Child class implementation of the overridden function
has the same name, parameters and return type as the function in the base class.
Class Example
# Example of a class
class Student:
    studentCount = 0

    def __init__(self, name, id):
        print "Constructor called"
        self.name = name
        self.id = id
        Student.studentCount = Student.studentCount + 1
        self.grades = {}

    def __del__(self):
        print "Destructor called"

    def getStudentCount(self):
        return Student.studentCount

    def addGrade(self, key, value):
        self.grades[key] = value

    def getGrade(self, key):
        return self.grades[key]

    def printGrades(self):
        for key in self.grades:
            print key + ": " + self.grades[key]
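A short usage sketch of the Student class defined above:
>>>s = Student('Alice', 1)
Constructor called
>>>s.addGrade('Maths','A')
>>>s.printGrades()
Maths: A
>>>s.getStudentCount()
1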
SUMMARY
Hadoop is an open-source software framework that is used for storing and processing large amounts of
data in a distributed computing environment. It is designed to handle big data and is based on the
MapReduce programming model, which allows for the parallel processing of large datasets.
HDFS (Hadoop Distributed File System): This is the storage component of Hadoop, which allows for
the storage of large amounts of data across multiple machines. It is designed to work with commodity
hardware, which makes it cost-effective.
YARN (Yet Another Resource Negotiator): This is the resource management component of Hadoop,
which manages the allocation of resources (such as CPU and memory) for processing the data stored in
HDFS.
Hadoop also includes several additional modules that provide additional functionality, such as Hive (a
SQL-like query language), Pig (a high-level platform for creating MapReduce programs), and HBase (a
non-relational, distributed database). Hadoop is commonly used in big data scenarios such as data
warehousing, business intelligence, and machine learning. It’s also used for data processing, data
analysis, and data mining. It enables the distributed processing of large data sets across clusters of
computers using a simple programming model.
When designing applications for the cloud, irrespective of the chosen platform, it is often useful to
consider four specific topics: scalability, availability, manageability and feasibility. Considering
these four topics will help you discover areas of your application that require some cloud-specific thought,
specifically in the early stages of a project.
Python is an interpreted, object-oriented, high-level programming language with dynamic semantics. Its
high-level built in data structures, combined with dynamic typing and dynamic binding, make it very
attractive for Rapid Application Development, as well as for use as a scripting or glue language to connect
existing components together. Python's simple, easy to learn syntax emphasizes readability and
therefore reduces the cost of program maintenance. Python supports modules and packages, which
encourages program modularity and code reuse. The Python interpreter and the extensive standard
library are available in source or binary form without charge for all major platforms, and can be freely
distributed.
Often, programmers fall in love with Python because of the increased productivity it provides. Since there
is no compilation step, the edit-test-debug cycle is incredibly fast. Debugging Python programs is easy:
a bug or bad input will never cause a segmentation fault. Instead, when the interpreter discovers an error,
it raises an exception. When the program doesn't catch the exception, the interpreter prints a stack trace.
A source level debugger allows inspection of local and global variables, evaluation of arbitrary
expressions, setting breakpoints, stepping through the code a line at a time, and so on. The debugger is
written in Python itself, testifying to Python's introspective power. On the other hand, often the quickest
way to debug a program is to add a few print statements to the source: the fast edit-test-debug cycle
makes this simple approach very effective.