Solr Elasticsearch

SOLR is an open source enterprise search platform built on Lucene. It provides features like indexing and querying over REST, as well as an administrative UI. SOLR uses request handlers, search components, and request writers to process queries. It allows adding documents and defining schemas. Elasticsearch is a distributed, RESTful search and analytics engine based on Lucene. It represents data as documents containing fields, and uses inverted indexes for fast searching. Elasticsearch has a dynamic schema and supports complex queries through its Query DSL.

SOLR AND ELASTICSEARCH

I. SOLR
SOLR is an open-source enterprise search server, or web application,
built on top of the Lucene API library. SOLR provides the features
that Lucene offers, plus many other tools, which is a big advantage
over using Lucene directly. SOLR exposes Lucene's Java APIs as
RESTful services. Putting documents into it is called indexing, and
it can be done via XML, JSON, CSV, or binary over HTTP. Users can
query it via HTTP GET and receive XML, JSON, CSV, or binary
results.
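To make the REST-style interaction concrete, here is a minimal sketch that builds such a query URL; the host, port, core name (gettingstarted), and the title field are assumptions for the example, not part of the original notes.

```python
from urllib.parse import urlencode

# Hypothetical local Solr core named "gettingstarted"; q is the query
# string, wt picks the response format (json, xml, csv), rows limits hits.
params = {"q": "title:lucene", "wt": "json", "rows": 5}
url = "http://localhost:8983/solr/gettingstarted/select?" + urlencode(params)
print(url)
```

Fetching this URL with a plain HTTP GET would return the results in the requested format.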

SOLR: KEY FEATURES

Advanced full-text search capabilities
Optimized for high-volume web traffic
Standards-based open interfaces: XML, JSON, and HTTP
Comprehensive HTML administration interface
Server statistics exposed over JMX for monitoring
Near real-time indexing, adaptable with XML configuration
Linearly scalable, with automatic index replication and an
extensible plugin architecture

SOLR: ARCHITECTURE
Request handlers deal with incoming requests over HTTP or other
protocols and interpret them; many request handlers are registered
by default. Beyond those, users can also register their own
request handlers to intercept their own requests. Incoming queries
are routed to different kinds of request handlers according to the
application's needs. Here are the functions of some request
handlers:
o /admin is the request handler that handles all the
administrative tasks
o /select handles standard search requests under different URL
contexts
o /spell handles requests where spell-checking is needed
Search components start working after request handlers finish
their tasks. There are many search components that can be
configured and used by the request handlers. Once a component
handles an incoming request, it goes to the distributed search,
which fetches the matching records and gives them back.
Response writers are completely different from request handlers:
request handlers read and process the request, while response
writers format the results that are sent back. Response writers
come in different forms, such as XML, JSON, and binary.
Update handlers handle the indexing process.

SOLR: ADMIN UI

The admin UI provides complete management of the Solr server; it
can be used to configure and manage features such as configuration
files, cores, and indexes.
SOLR: SCHEMA HIERARCHY

A Solr instance is a web application that can be deployed on
multiple servers, with each server running one instance.
Every server or instance can have multiple databases, also called
indexes or cores.
Each core or index can have documents, and each document can have
multiple fields.

SOLR: CORE
A Solr core, also referred to as just a core, is a running
instance of a Lucene index along with all the Solr configuration
(solrconfig.xml, schema.xml, etc.). schema.xml defines the
structure of the core, while solrconfig.xml defines the
architecture of the core or instance.
A single Solr application can contain zero or more cores. Cores
run largely in isolation but can communicate with each other if
necessary via the CoreContainer. Initially, Solr supported only
one index, and the SolrCore class was a singleton that coordinated
the low-level functionality at the core of Solr.

SOLR: DOCUMENTS AND FIELDS


Solr's basic unit of information is a document, which is a set of
data that describes something. Documents are composed of fields,
which are more specific pieces of information. Fields can contain
different kinds of data, such as a name field, and so on. The
field type tells Solr how to interpret the field and how it can be
queried.
SOLR: INDEXING DATA
A Solr index can accept data from many different sources,
including XML files, comma-separated value (CSV) files, data
extracted from tables in a database, and files in common formats
such as Microsoft Word or PDF. Here are the most common ways of
loading data into a Solr index:
Uploading XML files by sending HTTP requests to the Solr server
Using index handlers to import from databases
Using the Solr Cell framework
Writing a custom Java application to ingest data through Solr's
Java client

SOLR: ANALYSIS
There are three main concepts in analysis: analyzers, tokenizers,
and filters.
Analyzers are used both at index time, when a document is indexed,
and at query time.
- The same analysis process need not be used for both operations
- An analyzer examines the text of fields and generates a token stream
- Analyzers may be a single class, or they may be composed of a
series of tokenizer and filter classes
Tokenizers break field data into lexical units, or tokens. Filters
examine a stream of tokens and keep them, transform or discard
them, or create new ones.
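The tokenizer-plus-filters chain can be modelled with a few plain functions. This is only an illustrative sketch of the idea, not Solr's implementation, and the stop-word list is invented.

```python
# Illustrative analyzer chain: whitespace tokenizer -> lowercase filter
# -> stop-word filter, producing the final token stream.
STOP_WORDS = {"the", "a", "of"}  # invented stop-word list

def tokenize(text):
    return text.split()  # break field data into lexical units (tokens)

def lowercase_filter(tokens):
    return [t.lower() for t in tokens]  # transform each token

def stop_filter(tokens):
    return [t for t in tokens if t not in STOP_WORDS]  # discard tokens

def analyze(text):
    return stop_filter(lowercase_filter(tokenize(text)))

print(analyze("The Art of Search"))  # -> ['art', 'search']
```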
SOLR: SEARCH PROCESS

SOLR: FEATURES

Faceting
Pagination
Highlighting
Grouping and clustering
Spell checking
Spatial search
Query re-ranking components
Transforming
Real-time get and update
Suggesters
More like this

SOLR: INSTALLATION
Download the Apache Solr package

hduser@prast-virtual-machine:~$ cd ~
hduser@prast-virtual-machine:~$ wget http://apache.mirror1.spango.com/lucene/solr/5.5.1/solr-5.5.1.tgz

Create a directory named solr in /usr/local

hduser@prast-virtual-machine:~$ cd /usr/local
hduser@prast-virtual-machine:/usr/local$ sudo mkdir solr

Move the file solr-5.5.1.tgz to that directory and extract the
install script from it.

hduser@prast-virtual-machine:~$ sudo mv solr-5.5.1.tgz /usr/local/solr
hduser@prast-virtual-machine:~$ cd /usr/local/solr
hduser@prast-virtual-machine:/usr/local/solr$ tar xzf solr-5.5.1.tgz solr-5.5.1/bin/install_solr_service.sh --strip-components=2

With the bash command, we can now install Apache Solr:

hduser@prast-virtual-machine:/usr/local/solr$ sudo bash ./install_solr_service.sh solr-5.5.1.tgz

Check the status of the Apache Solr service we just installed:

hduser@prast-virtual-machine:/usr/local/solr$ sudo service solr status


If the installation was successful, it will show the status of the
service that was just installed:

Found 1 Solr nodes:


Solr process 2750 running on port 8983
.....


By default, Apache Solr runs on port 8983. Solr gives us the
freedom to create cores from within the installation folder. To
create a core to work with, use this command:

hduser@prast-virtual-machine:/usr/local/solr$ sudo su - solr -c "/opt/solr/bin/solr create -c gettingstarted -n data_driven_schema_configs"

II. ELASTICSEARCH

Elasticsearch is an open-source, distributed, RESTful search
engine that is based on Lucene. Elasticsearch is able to achieve
fast search responses because, instead of searching the text
directly, it searches an index. This type of index is called an
inverted index because it inverts a page-centric data structure
(page -> words) into a keyword-centric data structure (word -> pages).
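The inversion can be sketched in a few lines; the two sample documents are made up for the example.

```python
# Build an inverted index: word -> set of document ids containing it,
# inverting the page-centric layout (page -> words).
docs = {
    1: "elasticsearch is a search engine",
    2: "lucene powers the search index",
}

inverted = {}
for doc_id, text in docs.items():
    for word in text.split():
        inverted.setdefault(word, set()).add(doc_id)

print(sorted(inverted["search"]))  # -> [1, 2]: both documents match
```

Looking a word up in this structure is a single dictionary access, which is why searching the index is much faster than scanning every document's text.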


HOW ELASTICSEARCH REPRESENTS DATA

A document is the unit of search and index. An index can contain
one or more documents, and a document can contain one or more
fields. In database terminology, a document corresponds to a table
row, and a field corresponds to a table column.


SCHEMA

Unlike Solr, Elasticsearch is schema-free. However, it is
necessary to add mapping declarations if anything but the most
basic fields and operations is required.
The schema declares:
What fields there are
Which field should be used as the unique/primary key
Which fields are required
How to index and search each field

In Elasticsearch, an index may store documents of different
mapping types, and multiple mapping definitions can be associated
with each mapping type. A mapping type is a way of separating the
documents in an index into logical groups.
To create a mapping, use the Put Mapping API; the other way is to
add multiple mappings when an index is created.
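As a sketch, the body of such a mapping might look like the following; the blog index, post type, and field names are assumptions, and the string/date/integer field types match the Elasticsearch 2.x era these notes describe.

```python
import json

# Hypothetical mapping for a "post" type: each property declares how
# the field should be indexed and searched.
mapping = {
    "post": {
        "properties": {
            "title":     {"type": "string"},
            "published": {"type": "date"},
            "views":     {"type": "integer"},
        }
    }
}
print(json.dumps(mapping, indent=2))
```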

QUERY DSL

The Query DSL is Elasticsearch's way of making Lucene's query
syntax accessible to users, allowing complex queries to be
composed using a JSON syntax.
As in Lucene, there are basic queries, such as term or prefix
queries, and also compound queries, like the bool query.

Below is the rough main structure of a query:

curl -X POST "http://localhost:9200/blog/_search?pretty=true" -d '
{
  "from": 0,
  "size": 10,
  "query": QUERY_JSON,
  FILTER_JSON,
  FACET_JSON,
  SORT_JSON
}'
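Filling the QUERY_JSON slot in, a compound bool query could be composed as below; the author and title fields are hypothetical.

```python
import json

# A bool query combining a term query (must match) with a prefix
# query (optional, improves the score when it matches).
body = {
    "from": 0,
    "size": 10,
    "query": {
        "bool": {
            "must":   [{"term":   {"author": "john"}}],
            "should": [{"prefix": {"title": "elastic"}}],
        }
    },
}
print(json.dumps(body))
```

Sending this body with the curl command above would run the query against the blog index.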

ELASTICSEARCH INSTALLATION

Download Elasticsearch from elastic.co/downloads/elasticsearch and
extract the archive file.

Once the archive file has been extracted, Elasticsearch is ready
to run. To start it up in the foreground:

cd elasticsearch-<version>
./bin/elasticsearch

Test it out by opening another terminal window and running the
following:

curl 'http://localhost:9200/?pretty'

If the OS is Windows, cURL can be downloaded from
http://curl.haxx.se/download.html

A response such as this will be shown:

{
  "name" : "Tom Foster",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.1.0",
    "build_hash" : "72cd1f1a3eee09505e036106146dc1949dc5dc87",
    "build_timestamp" : "2015-11-18T22:40:03Z",
    "build_snapshot" : false,
    "lucene_version" : "5.3.1"
  },
  "tagline" : "You Know, for Search"
}

This means that an Elasticsearch node is up and running and is
ready to be experimented with. A node is a running instance of
Elasticsearch. A cluster is a group of nodes with the same
cluster.name that work together to share data and to provide
failover and scale. The cluster.name can be changed in the
elasticsearch.yml configuration file that is loaded when a node
starts.

When Elasticsearch is running in the foreground, it can be
stopped by pressing Ctrl+C.

SENSE

Sense is a Kibana app that provides an interactive console for
submitting requests to Elasticsearch directly from the browser.

SENSE INSTALLATION
Run the following command in the Kibana directory to download
and install the Sense app:

./bin/kibana plugin --install elastic/sense



Start Kibana

./bin/kibana

Open Sense in the web browser by going to
http://localhost:5601/app/sense

