Data Pagination Using Elasticsearch in Golang
Eugene Nikolaev
8 min read · Dec 25, 2023

Elasticsearch, a powerful search and analytics engine, provides robust capabilities for data management suited for the cases when non-relational databases are used. This article will explore how to use Elasticsearch’s features within Golang to implement different data pagination strategies.

Setting up Elasticsearch and Golang Environment
Before diving into the code, let’s set up the necessary environment. Ensure you have Elasticsearch installed and configured. Additionally, have Golang installed on your machine along with the required Elasticsearch Golang client libraries.

You can find how to set up dockerized ES on your local machine in my other article: https://satorsight.medium.com/setting-up-elasticsearch-with-localstack-in-docker-compose-5a48ebbdf7f1

For the Golang part I will use go 1.21.1 with go-elasticsearch/v7 v7.11.0 plus
some common utility libraries like prometheus and zap.

Also, you can find the entire project used in this article on GitHub:
https://github.com/SatorSight/go-elastic-w-pagination

Setting up Golang part


Here is how I prepare the ES client using the go-elasticsearch library:

// main.go
func prepareESClient() *es.Client {
	esHost := "http://localhost:4566/es/us-east-1/my-data"
	esUsername := ""
	esPassword := ""
	esIndex := "my-index"

	lg, err := logger.New(logger.Config{
		Level:    "debug",
		Encoding: "json",
		Color:    true,
		Outputs:  []string{"stdout"},
		Tags:     []string{},
	}, "Development", "my-app", "1")
	if err != nil {
		panic("logger init error")
	}
	esLogger := logger.NewLoggerForEs(lg)

	var t http.RoundTripper = &http.Transport{
		Proxy:                 http.ProxyFromEnvironment,
		ForceAttemptHTTP2:     false,
		MaxIdleConns:          10,
		IdleConnTimeout:       90 * time.Second,
		TLSHandshakeTimeout:   10 * time.Second,
		ExpectContinueTimeout: 1 * time.Second,
		ResponseHeaderTimeout: 30 * time.Second,
		TLSClientConfig: &tls.Config{
			MinVersion: tls.VersionTLS12,
		},
		// DisableCompression - this is important, in dev env we start AWS Locals
		// so we need to disable any compressions,
		DisableCompression: true,
	}

	esCfg := elasticsearch.Config{
		Addresses:             []string{esHost},
		Username:              esUsername,
		Password:              esPassword,
		CloudID:               "",
		APIKey:                "",
		Header:                nil,
		CACert:                nil,
		RetryOnStatus:         nil, // List of status codes for retry. Default: 5
		DisableRetry:          false,
		EnableRetryOnTimeout:  true,
		MaxRetries:            3,
		DiscoverNodesOnStart:  false,
		DiscoverNodesInterval: 0,
		EnableMetrics:         true,
		EnableDebugLogger:     true,
		RetryBackoff:          nil,
		Transport:             t,
		Logger:                esLogger,
		Selector:              nil,
		ConnectionPoolFunc:    nil,
	}

	maxTimeoutStr := "30s"
	maxTimeout, _ := time.ParseDuration(maxTimeoutStr)

	customStorageCfg := es.Config{
		DefaultIndex:          esIndex,
		MaxSearchQueryTimeout: maxTimeout,
		PathToMappingSchema:   "",
		IsTrackTotalHits:      true, // always needed for cnt operations.
	}

	esClient, err := es.New(lg, esCfg, customStorageCfg)
	if err != nil {
		log.Fatalln("failed to init esClient")
	}
	return esClient
}
...

// es.go
func New(log *logger.Logger, esCfg elasticsearch.Config, customCfg Config) (*Client, error) {
	es, err := elasticsearch.NewClient(esCfg)
	if err != nil {
		log.Info("Could not create new ElasticSearch client due error")
		return nil, err
	}

	c := &Client{
		log:                   log,
		esCfg:                 esCfg,
		esClient:              es,
		defaultIndex:          customCfg.DefaultIndex,
		maxSearchQueryTimeout: customCfg.MaxSearchQueryTimeout,
		isTrackTotalHits:      true,
	}

	return c, nil
}

Let’s first create a new index. For that, we need to add a mapping.json file with the ES mapping to the app root:

{
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword"
      },
      "created_at": {
        "type": "date"
      },
      "username": {
        "type": "keyword"
      }
    }
  }
}

Our ES index will contain users with id, created_at and username fields.
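
For the field names in mapping.json to line up with the documents the Go code sends, the JSON keys produced by the struct have to match them. The following is a minimal sketch using only the standard library; TaggedUser, sampleUser and encodeUser are hypothetical names introduced for illustration, and the User type in the article's repository may be defined differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// TaggedUser is a hypothetical document type whose JSON tags mirror
// the field names declared in mapping.json.
type TaggedUser struct {
	ID        string    `json:"id"`
	CreatedAt time.Time `json:"created_at"`
	Username  string    `json:"username"`
}

// sampleUser returns a fixed user so the output is reproducible.
func sampleUser() TaggedUser {
	return TaggedUser{
		ID:        "42",
		CreatedAt: time.Date(2023, 12, 25, 0, 0, 0, 0, time.UTC),
		Username:  "user 42",
	}
}

// encodeUser returns the JSON document that would be sent to the index.
func encodeUser(u TaggedUser) (string, error) {
	b, err := json.Marshal(u)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	doc, err := encodeUser(sampleUser())
	if err != nil {
		panic(err)
	}
	fmt.Println(doc)
}
```

Without the tags, encoding/json falls back to the Go field names (ID, CreatedAt, Username), so the keys in the stored document, and any sort field used later, would follow those names instead.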

// main.go
func createIndex(client *es.Client, ctx context.Context, index string) {
	err := client.CreateIndex(ctx, index, "mapping.json")
	if err != nil {
		log.Fatalf("failed to create index: %v", err)
	}
}

// es.go
func (c *Client) CreateIndex(ctx context.Context, index string, mapping string) error {
	var file []byte
	file, err := os.ReadFile(mapping)
	if err != nil || file == nil {
		c.log.Fatal("Could not read file with mapping defaultIndex schema",
			zap.String("path_to_mapping_schema", mapping),
			zap.Error(err))
	}
	indexMappingSchema := string(file)

	req := esapi.IndicesCreateRequest{
		Index: index,
		Body:  strings.NewReader(indexMappingSchema),
	}

	res, err := req.Do(ctx, c.esClient)
	if err != nil {
		return fmt.Errorf("err creating defaultIndex: %v", err)
	}

	defer func() {
		err = res.Body.Close()
		if err != nil {
			c.log.Error("res.Body.Close() problem", zap.Error(err))
		}
	}()

	if res.IsError() {
		return fmt.Errorf("err creating defaultIndex. res: %s", res.String())
	}

	return nil
}

Then in main.go:

func main() {
	esClient := prepareESClient() // described above
	ctx := prepareContext()

	createIndex(esClient, ctx, "my-index")
}

func prepareContext() context.Context {
	ctx, _ := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.S
	return ctx
}

After running it we will have the index created. Next, let’s load 100k users into it:

// main.go
func main() {
	esClient := prepareESClient()
	ctx := prepareContext()

	create100kUsers(esClient, ctx, "my-index")
}

func create100kUsers(client *es.Client, ctx context.Context, index string) {
	client.Create100kUsers(ctx, index)
}

// es.go
func (c *Client) Create100kUsers(ctx context.Context, index string) {
	user := User{
		ID:        0,
		CreatedAt: time.Now(),
		Username:  "init",
	}

	for i := 0; i < 100000; i++ {
		user.Username = fmt.Sprintf("%v %v", "user", i)
		user.ID = i
		err := c.Store(ctx, index, user)
		if err != nil {
			// use the client's zap-based logger so the zap.Error field is rendered properly
			c.log.Fatal("failed to store", zap.Error(err))
		}
	}
}

This will run for a while but after it we will have an index with 100k users to
test pagination on.

Fetching Data Without Pagination

Let’s start by looking at a simple data retrieval method from Elasticsearch in Golang without pagination. The following code demonstrates a basic retrieval mechanism. In the Load function I omit most of the boilerplate code, such as error handling, because it’s pretty wordy:

// main.go
func main() {
	esClient := prepareESClient()
	ctx := prepareContext()

	res := simpleLoad(esClient, ctx, "my-index")
	pp(res)
}

func simpleLoad(client *es.Client, ctx context.Context, index string) es.SearchResult {
	// params after index are "from", "size", "cursor"; cursor is ignored if it's zero
	res, err := client.Load(ctx, index, 0, 10, 0)
	if err != nil {
		log.Fatalf("failed to fetch results: %v", err)
	}
	return res
}

// es.go
func (c *Client) Load(
	ctx context.Context,
	index string,
	from int,
	size int,
	cursor float64,
) (SearchResult, error) {
	query := map[string]interface{}{
		"query": map[string]interface{}{
			"match_all": map[string]interface{}{},
		},
	}

	sortQuery := []map[string]map[string]interface{}{
		{"ID": {"order": "asc"}},
	}

	query["sort"] = sortQuery
	if cursor != 0 {
		query["search_after"] = []float64{cursor}
	}

	var buf bytes.Buffer
	if err := jsoniter.NewEncoder(&buf).Encode(query); err != nil {
		return SearchResult{}, fmt.Errorf("es.client.Load(): error encoding query: %v.
	}

	var res *esapi.Response
	var err error

	res, err = c.esClient.Search(
		c.esClient.Search.WithContext(ctx),
		c.esClient.Search.WithTimeout(c.maxSearchQueryTimeout),
		c.esClient.Search.WithIndex(index),
		c.esClient.Search.WithBody(&buf),
		c.esClient.Search.WithFrom(from),
		c.esClient.Search.WithSize(size),
		c.esClient.Search.WithTrackTotalHits(c.isTrackTotalHits),
		c.esClient.Search.WithPretty(), // todo remove in case of performance degradation
	)

	// Error handling and decoding of res.Body into r (a map[string]interface{}) are omitted.

	result := func() SearchResult {
		totalCnt := int64(r["hits"].(map[string]interface{})["total"].(map[string]int
		if totalCnt == 0 {
			return SearchResult{}
		}

		cntFind := len(r["hits"].(map[string]interface{})["hits"].([]interface{}))
		docs := make([]User, 0, cntFind)
		var lastSort float64

		for _, v := range r["hits"].(map[string]interface{})["hits"].([]interface{}) {
			// remember the sort value of the most recent hit; after the loop it is the cursor
			lastSort = v.(map[string]interface{})["sort"].([]interface{})[0].(float64)
			doc := User{}
			jsonBody, err := jsoniter.Marshal(v.(map[string]interface{})["_source"].(map
			if err != nil {
				c.log.Error("es.client.Load() err from jsoniter.Marshal",
					zap.Any("v", v),
					zap.Any("r['hits']", r["hits"]),
					zap.Error(err),
				)

				return SearchResult{Users: docs, TotalCount: totalCnt}
			}

			if err = jsoniter.Unmarshal(jsonBody, &doc); err != nil {
				c.log.Error("es.client.Load() err from jsoniter.Unmarshal",
					zap.Any("v", v),
					zap.Any("r['hits']", r["hits"]),
					zap.Error(err),
				)

				return SearchResult{Users: docs, TotalCount: totalCnt}
			}

			docs = append(docs, doc)
		}

		return SearchResult{Users: docs, TotalCount: totalCnt, LastSort: lastSort}
	}()

	return result, nil
}

This function will fetch the first 10 users from ES. The request will contain _search?from=0&size=10. The same function will be reused in the following sections.

Let’s move on to actual pagination.

Implementing Basic Pagination with from and size

Probably the first thing that comes to mind is to do it the same way as we would in SQL-based DBs and use offset and limit. ES also has this feature with the from and size params. Let’s implement a basic paginated output by iterating with the from and size parameters:

func loadSimplePagination(client *es.Client, ctx context.Context, index string) []es.User {
	from := 0
	size := 10

	var result []es.User

	for i := from; i < 100; i += size {
		res, err2 := client.Load(ctx, index, i, size, 0)
		if err2 != nil {
			log.Fatalf("failed to fetch results: %v", err2)
		}

		users := res.Users
		result = append(result, users...)
	}

	return result
}

This will iterate through the first 10 pages and do 10 requests.
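
To make the loop arithmetic concrete, here is a minimal sketch that only computes the from values such a loop would send, without talking to ES. The helper name pageOffsets is hypothetical and not part of the article's project.

```go
package main

import "fmt"

// pageOffsets lists the "from" values needed to walk the first
// limit records in pages of size records each.
func pageOffsets(limit, size int) []int {
	if size <= 0 || limit <= 0 {
		return nil
	}
	offsets := make([]int, 0, (limit+size-1)/size)
	for from := 0; from < limit; from += size {
		offsets = append(offsets, from)
	}
	return offsets
}

func main() {
	// Same numbers as the loop above: 100 records, 10 per page.
	for _, from := range pageOffsets(100, 10) {
		fmt.Printf("_search?from=%d&size=%d\n", from, 10)
	}
}
```

With a limit of 100 and a size of 10 this yields ten offsets, 0 through 90, one per request.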

The problem
The problem begins when we have a lot of data and try to go deeper than 10,000 records. If we make a request with something like ?from=10000&size=10, we will get the following error:

Result window is too large, from + size must be less than or equal to: [10000]
but was [10010]. See the scroll api for a more efficient way to request
large data sets. This limit can be set by changing the
[index.max_result_window] index level setting.

This means that from and size pagination can only reach the first 10k records of a given sorted result set. The reason is given in the ES docs:

Search requests take heap memory and time proportional to from + size and this
limits that memory.
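
The limit can be checked on the client side before a request is sent. The sketch below assumes the default window of 10000 quoted in the error message; withinResultWindow and deepestPage are hypothetical helper names.

```go
package main

import "fmt"

// defaultMaxResultWindow mirrors the default value of index.max_result_window.
const defaultMaxResultWindow = 10000

// withinResultWindow reports whether a from/size request stays inside the
// window, i.e. from + size is less than or equal to the limit.
func withinResultWindow(from, size, window int) bool {
	return from >= 0 && size >= 0 && from+size <= window
}

// deepestPage returns the highest 1-based page number that from/size
// pagination can still reach for the given page size.
func deepestPage(size, window int) int {
	if size <= 0 {
		return 0
	}
	return window / size
}

func main() {
	fmt.Println(withinResultWindow(9990, 10, defaultMaxResultWindow))  // true
	fmt.Println(withinResultWindow(10000, 10, defaultMaxResultWindow)) // false
	fmt.Println(deepestPage(10, defaultMaxResultWindow))               // 1000
}
```

The second call reproduces the failing case from the error above: 10000 + 10 = 10010, which exceeds the window.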

In theory we can raise that limit to something like 100k:

PUT /my-index/_settings
{
  "index.max_result_window": 100000
}

But if we have millions of records this won’t solve the problem, and at some point performance can start degrading. As alternatives ES has the Scroll API and the search_after parameter. The ES docs no longer recommend Scroll for deep pagination, so let’s use Search After.

Enhanced Pagination Using Cursors and search_after API

To use the Search After API we first need to choose a sort field and direction. In this article I will use the simplest sort, by ID ascending:

sortQuery := []map[string]map[string]interface{}{
	{"ID": {"order": "asc"}},
}

In real projects integer ids are not always present (for example when using UUIDs), and in that case I would consider using created_at + uuid or the inner “_id” field for pagination. For each field in the sort clause ES returns a cursor value with every hit, and these values are passed back in search_after, for example:

"search_after": [
  "1O9tYowBuAaJdMU4BeRn",
  1702465740836
],
"sort": {
  "_id": {
    "order": "asc"
  },
  "created_at": {
    "order": "asc"
  }
}

In the next query we take the sort values of the last record of the current page (whatever the sort direction) and pass them as follows to get the next set:

query["search_after"] = []float64{cursor} your writing.
// or array of cursors if we use multiple fields for sort
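
How the cursor is attached to the request can be sketched with the standard library alone. This is not the article's Load function: buildSearchAfterBody and sortField are hypothetical names, and encoding/json is used in place of jsoniter.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sortField is one entry of the "sort" array, e.g. {"ID": {"order": "asc"}}.
type sortField struct {
	Name  string
	Order string
}

// buildSearchAfterBody assembles a match_all request body. An empty cursor
// requests the first page; otherwise the cursor must carry exactly one
// value per sort field, in the same order as the sort fields.
func buildSearchAfterBody(size int, sort []sortField, cursor []interface{}) ([]byte, error) {
	if len(cursor) != 0 && len(cursor) != len(sort) {
		return nil, fmt.Errorf("cursor has %d values, sort has %d fields", len(cursor), len(sort))
	}

	sortClause := make([]map[string]map[string]string, 0, len(sort))
	for _, f := range sort {
		sortClause = append(sortClause, map[string]map[string]string{
			f.Name: {"order": f.Order},
		})
	}

	body := map[string]interface{}{
		"size":  size,
		"query": map[string]interface{}{"match_all": map[string]interface{}{}},
		"sort":  sortClause,
	}
	if len(cursor) > 0 {
		body["search_after"] = cursor
	}
	return json.Marshal(body)
}

func main() {
	sort := []sortField{{Name: "ID", Order: "asc"}}

	first, err := buildSearchAfterBody(10, sort, nil)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(first))

	next, err := buildSearchAfterBody(10, sort, []interface{}{9})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(next))
}
```

Rejecting a cursor whose length differs from the number of sort fields catches one class of mistakes before the request leaves the application.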

Let’s implement it in code:

func cursorPaginate(client *es.Client, ctx context.Context, index string) []es.User {
	from := 0
	size := 10
	var res []es.User

	// make initial request
	initRes, err := client.Load(ctx, index, from, size, 0)
	if err != nil {
		log.Fatalf("failed to fetch results: %v", err)
	}
	ls := initRes.LastSort
	res = append(res, initRes.Users...)

	// make subsequent requests applying cursors
	for i := 1; i < 10; i++ {
		// third param is offset and it's not used when using cursors
		res2, err2 := client.Load(ctx, index, 0, size, ls)
		if err2 != nil {
			log.Fatalf("failed to fetch results: %v", err2)
		}

		ls = res2.LastSort
		log.Printf("current cursor: %v\n", ls)
		users := res2.Users
		res = append(res, users...)
	}
	return res
}

This will sequentially page through the results using cursors.

Another thing to be careful about is memory consumption. In this example all results are stored in memory, which can run out at some point if too much data is loaded. In real-world apps I would rather flush the data to a file or send it somewhere else between requests to keep memory usage bounded.
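
One way to flush between requests is to write every page to an io.Writer as soon as it arrives instead of appending it to a slice. The sketch below stands in for the real client with a fake fetcher; fetchPage, streamPages, fakeFetch and lineCounter are all hypothetical names, and the integer cursor only imitates the article's single-field ID cursor.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"io"
)

// fetchPage is a hypothetical stand-in for a call like client.Load: given a
// cursor it returns one page of IDs and the cursor for the next page.
// An empty page signals the end of the data.
type fetchPage func(cursor int) (ids []int, next int, err error)

// streamPages writes each page to w as one JSON line and then drops it,
// so memory use is bounded by the page size. It returns the record count.
func streamPages(fetch fetchPage, w io.Writer) (int, error) {
	enc := json.NewEncoder(w)
	total, cursor := 0, 0
	for {
		ids, next, err := fetch(cursor)
		if err != nil {
			return total, err
		}
		if len(ids) == 0 {
			return total, nil
		}
		if err := enc.Encode(ids); err != nil {
			return total, err
		}
		total += len(ids)
		cursor = next
	}
}

// fakeFetch simulates an index holding IDs 1..limit, served size at a time.
func fakeFetch(limit, size int) fetchPage {
	return func(cursor int) ([]int, int, error) {
		var ids []int
		for id := cursor + 1; id <= limit && len(ids) < size; id++ {
			ids = append(ids, id)
		}
		return ids, cursor + len(ids), nil
	}
}

// lineCounter is a tiny io.Writer that counts newline bytes written to it.
type lineCounter struct{ lines int }

func (l *lineCounter) Write(p []byte) (int, error) {
	for _, b := range p {
		if b == '\n' {
			l.lines++
		}
	}
	return len(p), nil
}

func main() {
	total, err := streamPages(fakeFetch(25, 10), os.Stdout)
	if err != nil {
		panic(err)
	}
	fmt.Println("total:", total)
}
```

In a real application the writer could be a file or a network connection; the loop itself never holds more than one page.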

Another thing to be careful about is that when using multiple cursors, the
order of them is important, and should be the same as the order of the fields
mentioned in the sort. For example,
sortQuery := []map[string]map[string]interface{}{
	{"created_at": {"order": "asc"}},
	{"_id": {"order": "asc"}},
}

...

"search_after": [
  "1O9tYowBuAaJdMU4BeRn", // cursor for _id
  1702465740836 // cursor for created_at
],

// error!

This request will come back with a 400 error through the go-elasticsearch client, because each cursor is interpreted as the wrong type: the created_at cursor is a float64 timestamp, while the _id cursor (an internal uuid-like value) is a string.
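
Each hit in a search response carries a "sort" array whose values are already in the order of the sort clause, so passing the last hit's array back unchanged avoids this kind of mismatch. Below is a minimal sketch with the standard library; searchResponse, lastSortValues and sampleResponse are hypothetical names, and the sample body contains only the fields needed here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// searchResponse models only the parts of a search response used here.
type searchResponse struct {
	Hits struct {
		Hits []struct {
			Sort []interface{} `json:"sort"`
		} `json:"hits"`
	} `json:"hits"`
}

// sampleResponse is a trimmed, made-up response for a query sorted by
// created_at first and _id second.
const sampleResponse = `{"hits":{"hits":[
	{"_id":"a","sort":[1702465740836,"a"]},
	{"_id":"1O9tYowBuAaJdMU4BeRn","sort":[1702465740900,"1O9tYowBuAaJdMU4BeRn"]}
]}}`

// lastSortValues returns the "sort" array of the final hit, or nil when the
// page is empty. The values keep the order of the sort clause.
func lastSortValues(raw []byte) ([]interface{}, error) {
	var r searchResponse
	if err := json.Unmarshal(raw, &r); err != nil {
		return nil, err
	}
	hits := r.Hits.Hits
	if len(hits) == 0 {
		return nil, nil
	}
	return hits[len(hits)-1].Sort, nil
}

func main() {
	cursor, err := lastSortValues([]byte(sampleResponse))
	if err != nil {
		panic(err)
	}
	// cursor[0] belongs to created_at, cursor[1] to _id.
	fmt.Printf("%T %T\n", cursor[0], cursor[1])
}
```

Decoding into typed structs also avoids the chain of type assertions on map[string]interface{} values.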

Conclusion
Elasticsearch offers powerful tools for handling data efficiently, especially when it comes to pagination in large datasets. While basic pagination methods like from and size can be suitable for small datasets, the cursor-based approach using the search_after API proves to be more scalable and efficient for extensive pagination needs.
