Splunk Open Source Build Vs Buy Workshop

Uploaded by

kalyan majeti
Splunk & Open Source:

Build Vs. Buy Workshop


Jon Webster | Senior Manager, Competitive Intelligence

September 26, 2017 | Washington, DC


Forward-Looking Statements
During the course of this presentation, we may make forward-looking statements regarding future events or
the expected performance of the company. We caution you that such statements reflect our current
expectations and estimates based on factors currently known to us and that actual events or results could
differ materially. For important factors that may cause actual results to differ from those contained in our
forward-looking statements, please review our filings with the SEC.

The forward-looking statements made in this presentation are being made as of the time and date of its live
presentation. If reviewed after its live presentation, this presentation may not contain current or accurate
information. We do not assume any obligation to update any forward-looking statements we may make. In
addition, any information about our roadmap outlines our general product direction and is subject to change
at any time without notice. It is for informational purposes only and shall not be incorporated into any contract
or other commitment. Splunk undertakes no obligation either to develop the features or functionality
described or to include any such feature or functionality in a future release.
Splunk, Splunk>, Listen to Your Data, The Engine for Machine Data, Splunk Cloud, Splunk Light and SPL are trademarks and registered trademarks of Splunk Inc. in
the United States and other countries. All other brand names, product names, or trademarks belong to their respective owners. © 2017 Splunk Inc. All rights reserved.
Agenda

▶ Why Try Open Source?
▶ Open Source Customer Interviews
▶ Open Source Challenges
▶ Build vs. Buy Considerations
▶ Total Cost of Ownership Model
▶ Customer Examples
▶ Q&A

[Figure: Splunk vs. ELK 3 Year TCO, 30 day retention. Bar chart comparing Splunk and Elastic Stack at 200GB, 1TB, 5TB, and 10TB; vertical axis from $0 to $30,000,000.]

Why Try Open Source?

▶ Frictionless
• No salesperson will call
• Prove use case before investing
• Deploy without management cycles:
• No budget or procurement issues
• No contracts or legal back and forth

▶ It’s FREE! Muah-ha-ha!
• Splunk seems cost-prohibitive
• Don’t want to or can’t budget for Splunk
• Open Source seems “good enough”
• Spend on development, not license

▶ Development Use Cases
• Web, document, or product search engine
• Sub-second response for application stack

▶ Open Source Orientation
• Organizational Open Source Initiative
• Open Source or Build culture

Why Try Open Source?

▶ Developers
• Shiny new toy
• New training & skills
• Job security
• Resume building

▶ Managers
• No software budget, lots of developers
• Deploy without management cycles
• Shift Capex (license) to Opex (salaries)
• More staff & HW = bigger budget & title

▶ VP & C-Level
• Open Source Initiatives
• What everyone remembers: “Use Open Source First”
• What everyone forgets: “Use the most appropriate solution for the business”

Open Source Customer Interviews

Production Interviews
▶ Dozens of deployments from 20GB/day to 10’s of TB/day
▶ 100’s of pilot deployments

User Conference Interviews
▶ 3 Elastic{ON} User Conferences
▶ All machine data & security sessions
▶ Interviewed 100 Attendees per conference

OSS Customer Interviews: Key Takeaways

The Elastic Stack
▶ ‘Sweet spot’ server: 8 x 64, 6TB SSD
• Avg. 25 GB/day per data node
• Avg. compression 300%
▶ 1TB/day and up: 6-18 month deploy
• Multiple clusters for large use cases
• 90% deploy EMB (kafka, redis, MQ)
• Additional datastore (Hadoop)
▶ Parsing at index time – slow and fragile
▶ Limited visualization – Some DIY
▶ Development backlogs are common

Splunk (for comparison)
▶ 12 x 12, any disk, 800+ IOPS
• 300 GB/day per search peer (data node)
• Avg. compression 50%
▶ 1TB/day and up: deploy in weeks
• Single cluster to 1+ PB/day
• EMB not required
• No additional datastore required
▶ Parsing at search time – fast and stable
▶ Rich visualization OOTB, extensible
▶ Development backlogs are rare

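To make the per-node ingest figures concrete, here is a minimal sketch that turns them into a data-node count for a given daily volume. The function name is hypothetical, 1 TB is taken as 1000 GB, and replication, search load, and retention are deliberately ignored, so treat the result as a lower bound rather than a sizing recommendation.

```python
import math


def data_nodes_needed(daily_ingest_gb: float, gb_per_node_per_day: float) -> int:
    """Smallest whole number of data nodes that can absorb the daily ingest."""
    return math.ceil(daily_ingest_gb / gb_per_node_per_day)


DAILY_INGEST_GB = 1000  # assumption: 1 TB/day expressed as 1000 GB

# Per-node averages quoted in the takeaways above.
elastic_nodes = data_nodes_needed(DAILY_INGEST_GB, 25)
splunk_nodes = data_nodes_needed(DAILY_INGEST_GB, 300)

print(f"Elastic Stack data nodes: {elastic_nodes}")
print(f"Splunk search peers:      {splunk_nodes}")
```

With the quoted averages this gives 40 data nodes versus 4 search peers for 1 TB/day, before any replication is considered.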
Why So Much Storage?
JSON format, index every field, redundant “message”, “_source”, & “_all” fields.

Splunk: 297 chars, 1 index, 1 TB raw = ½ TB on disk
ELK: 1910 chars, 56 indexes, 1 TB raw = 4.8 TB on disk (including GeoIP & Identity data)

Splunk Data is enriched at search time
No extra data is stored or indexed!

150.128.102.148 - - [07/Aug/2014:00:59:52 +0000] \"GET /images/web/2009/banner.png HTTP/1.1\" 200 52315 \"http://www.semicomplete.com/blog/articles/week-of-unix-tools/day-1-sed.html\" \"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36\"

Want to enrich ELK data?

Green: Original syslog event
Orange: Identity data added
Red: GeoIP data added

{
  "_index": "logstash-2014.08.07",
  "_type": "logs",
  "_id": "AUzqaoFTJX0-Q5nESGxf",
  "_score": null,
  "_source": {
    "message": "150.128.102.148 - - [07/Aug/2014:00:59:52 +0000] \"GET /images/web/2009/banner.png HTTP/1.1\" 200 52315 \"http://www.semicomplete.com/blog/articles/week-of-unix-tools/day-1-sed.html\" \"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36\"",
    "@version": "1",
    "@timestamp": "2014-08-07T00:59:52.000Z",
    "host": "ctest08.sv.splunk.com",
    "clientip": "150.128.102.148",
    "ident": "-",
    "auth": "-",
    "timestamp": "07/Aug/2014:00:59:52 +0000",
    "verb": "GET",
    "request": "/images/web/2009/banner.png",
    "httpversion": "1.1",
    "response": 200,
    "bytes": 52315,
    "referrer": "\"http://www.semicomplete.com/blog/articles/week-of-unix-tools/day-1-sed.html\"",
    "agent": "\"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36\"",
    "useragent": {
      "name": "Chrome",
      "os": "Windows 7",
      "os_name": "Windows 7",
      "device": "Other",
      "major": "32",
      "minor": "0",
      "patch": "1700"
    },
    "identity": {
      "personalTitle": "Technical Manager",
      "displayName": "First Lastname",
      "givenName": "First Lastname",
      "sn": "123-45-6789",
      "suffix": "",
      "mail": "[email protected]",
      "telephoneNumber": "123.456.7894",
      "mobile": "123.456.7894",
      "manager": "Another Manager",
      "priority": "3",
      "department": "Technical Department",
      "category": "Technical Manager",
      "watchlist": "whatever",
      "whenCreated": [ 1407373192000 ],
      "endDate": [ 1407373192000 ]
    },
    "geoip": {
      "ip": "150.128.102.148",
      "country_code2": "ES",
      "country_code3": "ESP",
      "country_name": "Spain",
      "continent_code": "EU",
      "latitude": 40,
      "longitude": -4,
      "location": [ -4, 40 ]
    }
  },
  "fields": { "@timestamp": [ 1407373192000 ] },
  "sort": [ 1407373192000 ]
}
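The disk ratios on this slide translate into very different storage footprints once retention is applied. The sketch below is only an illustration: the function name is made up, and the 30-day retention and 1 TB/day volume are assumptions chosen to match figures used elsewhere in the deck, with no replicas counted.

```python
def on_disk_tb(raw_tb_per_day: float, retention_days: int, disk_per_raw: float) -> float:
    """Disk needed to retain the data, given a disk-to-raw ratio."""
    return raw_tb_per_day * retention_days * disk_per_raw


# Ratios quoted on the slide: 1 TB raw -> 0.5 TB (Splunk) or 4.8 TB (ELK) on disk.
SPLUNK_DISK_PER_RAW = 0.5
ELK_DISK_PER_RAW = 4.8

RAW_TB_PER_DAY = 1    # assumption for illustration
RETENTION_DAYS = 30   # assumption for illustration

splunk_tb = on_disk_tb(RAW_TB_PER_DAY, RETENTION_DAYS, SPLUNK_DISK_PER_RAW)
elk_tb = on_disk_tb(RAW_TB_PER_DAY, RETENTION_DAYS, ELK_DISK_PER_RAW)

# Per-event size ratio from the character counts on the slide.
event_size_ratio = 1910 / 297

print(f"Splunk: {splunk_tb:.0f} TB on disk")
print(f"ELK:    {elk_tb:.0f} TB on disk")
print(f"ELK event is about {event_size_ratio:.1f}x the size of the raw event")
```

Under these assumptions the single copy of the data occupies roughly 15 TB in Splunk and 144 TB in ELK.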
Why So Much Storage?
Storage optimization – at what cost?

Recommendations, and what each one means:

• Delete the original “message” field
  Which means: Affects Compliance & Debug Uses
• Disable the “_all” field
  Which means: No Full-Text Search Capabilities
• Disable the “_source” field
  Which means: Disables Update API, Highlighting, & Reindex API
• Set optimal index/analyze options in schema for each data source
  Which means: Not practical for deployments with 100s – 1000s of data sources
• Use best_compression option to reduce disk space
  Which means: More infrastructure required to maintain performance
Why So Many Servers?
1 TB/day for 90 days – 635 Servers?!
Experts pointed us to these hosting services for best practices:
1TB/day, 90 days retention, 350% raw/disk ratio, 3 total copies of data = 945,000 GB total disk

                             Elastic.co   Qbox      Compose.io (IBM)   ObjectRocket   Splunk
Total Disk (GB)              945,000      945,000   945,000            945,000
GB Mem / GB Disk             0.043        0.05      0.1                0.125
Total GB Memory              40,635       47,250    94,500             118,125
Total Servers @ 64GB/node    635          738       1,476              1,845
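The arithmetic behind this table can be reproduced in a few lines. This is a sketch of the slide's calculation only, with hypothetical names and 1 TB taken as 1000 GB; the computed server counts land within one server of the slide's figures, the difference presumably coming from rounding in the published memory ratios.

```python
def total_disk_gb(daily_gb: float, retention_days: int,
                  disk_per_raw: float, copies: int) -> float:
    """Total disk across all copies of the retained data."""
    return daily_gb * retention_days * disk_per_raw * copies


# Memory-to-disk ratios per hosting service, as listed in the table.
MEM_PER_DISK = {
    "Elastic.co": 0.043,
    "Qbox": 0.05,
    "Compose.io (IBM)": 0.1,
    "ObjectRocket": 0.125,
}
GB_MEMORY_PER_SERVER = 64

# 1 TB/day (as 1000 GB), 90 days, 350% raw/disk ratio, 3 copies.
disk = total_disk_gb(1000, 90, 3.5, 3)

servers = {}
for service, ratio in MEM_PER_DISK.items():
    memory_gb = disk * ratio
    servers[service] = memory_gb / GB_MEMORY_PER_SERVER
    print(f"{service}: {memory_gb:,.0f} GB memory, "
          f"about {servers[service]:,.1f} servers")
```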
Elasticsearch Java Garbage Collection (GC)
Multi-day benchmark demonstrates GC issues

[Figure: multi-day benchmark chart with three annotated phases: healthy GC pattern; GC affecting performance; risk of “stop the world” GC, node restarts, and crashes.]
Designing the Perfect Elasticsearch Cluster:
the (almost) Definitive Guide
https://thoughts.t37.net/designing-the-perfect-elasticsearch-cluster-the-almost-definitive-guide-e614eabc1a87

▶ “You can't know your workload until you’ve run in production for a while. You'll
have to iterate 2 or 3 times before you get the design right.”

▶ “Don’t run Elasticsearch in the cloud… you don't know what CPU you’ll get. Xeon
E5 v4 provides 60% better java performance than v3. Prepare to get into trouble
with nodes popping out of the cluster like popcorn.”

▶ “Stop the world” restarts: The main problem with Elasticsearch garbage collection
is how it might enter “stop the world” mode, in which the JVM becomes
unresponsive until it is restarted.
Some Things You Should Know Before Using
Amazon’s Elasticsearch Service On AWS
https://read.acloud.guru/things-you-should-know-before-using-awss-elasticsearch-service-7cd70c9afb4f

▶ "it’s basically impossible to troubleshoot your own AWS Elasticsearch cluster"

▶ "making any change at all will double the size of the cluster and copy every
shard… indexing and search to come to a screeching halt”

▶ "AWS’s [support does not] have the time, skills or context to diagnose non-trivial issues, so they will
just... tell you to throw more hardware at the problem"

▶ "hosting Elasticsearch on AWS... absolutely does not mean your cluster will be
more stable"
Build vs. Buy
Considerations
Build vs. Buy: 3 Considerations

▶ Time to Market
• Faster value with a solution vs. time required to build it
• Opportunity cost often ignored, may be the highest cost
• Applies beyond the first deployment: expansion & maintenance too

▶ Benefit Realization
• Future proof: Mature solutions deliver more value
• Reduce risks: Project, technical, support, IP, personal

▶ Total Cost of Ownership


• Open source software has costs
• Production OSS deployments often exceed Splunk cost
Benefit Realization: Business Value Assessment
Final deliverable provides an Executive Report with CxO Ready Business Case Analysis

✓ Alignment with Key Goals
✓ Current Challenges
✓ Proposed Solution
✓ Adoption Speed
✓ Detailed Use Cases
✓ Benefit Calculations
✓ Investment Details
✓ ROI Analysis

[Figure: Sample Worksheet]
OSS “Success Stories”

Elastic{ON}15: Elasticsearch at Verizon
2.7 TB/day, 50 day retention
10+B events/day
• 128: 8 x 64, 6TB Disk
• 50: 24 x 256, 20TB Disk (hadoop)
• Logstash, Message Bus & other Servers not listed
• Wrote their own UI
Total: 178+ servers, 1.8 PB

Elastic{ON}16: Security Analytics @ USAA
1-2 TB/day, 30 day retention
4.5B events/day
7 Clusters, grouped by feed
• 60: 12 x 96, 12TB SSD
• 21 Master Nodes
• 16 Logstash Nodes
• 4 Kafka, 3 Zookeeper
• 192 TB SAN
• 1.6 PB other storage
Total: 104 servers, 2.5 PB

Elastic{ON}17: Optum’s Security Data Lake
8* TB/day, 1 year retention
3B events/day + enrichment
• 190 data nodes
• 360 hadoop nodes
• 550: 73.5 TB, 4.5 PB
Total: 550 servers, 4.5 PB
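The totals on this slide can be cross-checked from the listed components. The sketch below assumes that an entry such as "128: 8 x 64, 6TB Disk" means 128 servers with 6 TB of disk each, and that 1 PB is 1000 TB; both readings are interpretations of the slide, not something it states.

```python
TB_PER_PB = 1000  # assumption: decimal petabytes

# Verizon: 128 servers with 6 TB each plus 50 Hadoop servers with 20 TB each.
verizon_servers = 128 + 50
verizon_pb = (128 * 6 + 50 * 20) / TB_PER_PB

# USAA: data, master, and Logstash nodes plus Kafka and Zookeeper;
# storage is node SSD plus the SAN plus "other storage".
usaa_servers = 60 + 21 + 16 + 4 + 3
usaa_pb = (60 * 12 + 192 + 1600) / TB_PER_PB

# Optum: data nodes plus Hadoop nodes.
optum_servers = 190 + 360

print(f"Verizon: {verizon_servers} servers, {verizon_pb:.1f} PB")
print(f"USAA:    {usaa_servers} servers, {usaa_pb:.1f} PB")
print(f"Optum:   {optum_servers} servers")
```

Under that reading, the component counts reproduce the slide's totals of 178, 104, and 550 servers, and about 1.8 PB and 2.5 PB of storage for the first two deployments.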
What is the Splunk Build vs. Buy Workshop?

A customer meeting, where we:

▶ Discuss your Open Source build experience


▶ Translate your experience into actual metrics & costs
▶ Prepare a Build vs. Buy Total Cost of Ownership Model
▶ You validate the TCO Model
▶ We deliver a CFO-Ready Business Case
Business Value Consulting Services
Most Popular Services

Data Source Analysis
Align data sources with key objectives and value drivers

Business Value Assessment
Quantify current and/or future value drivers

TCO Analysis
Assess TCO for Cloud vs. On-Premises or Splunk vs. ELK

Success Stories
Document 2-3 real life value stories from your deployment

Value Roadmap
Multi-Year Plan based on value and data sources

Center of Excellence
Assess key roles, responsibilities and skills
© 2017 SPLUNK INC.

Appendix:
Build vs. Buy Workshop
Executive-Ready
Business Case
Splunk vs. Open Source: 3 Considerations

1. Time to Market
• Value is achieved faster with a platform vs. the time
required to build it
2. Benefit Realization
• A solution’s ability to produce proven customer success
increases likelihood that benefits will be realized
• A platform built from 10,000+ customers will yield more
value than a solution built entirely from scratch
3. Total Cost of Ownership
• Open source software is not free
• Production deployments can easily exceed 4-10x
Splunk cost
Consideration 1: Time to Market

▶ Value is achieved faster with a purpose-built platform vs. the time
required to build it (even basic functions)
▶ Pre-built apps speed deployment (Splunkbase has 1000+ apps)

▶ Time impacts how much value will be realized

▶ EXAMPLE: Applying this consideration


• Assuming $1.2M/year of projected benefits from a deployment
• If Splunk takes 2 months to deploy, it delivers $1M of value in year 1
• If Open Source takes 10 months to deploy, it delivers $200k of value in year 1
• Assuming the same end result, Splunk delivers $800k MORE value in year 1
• TCO would show $800k as “lost opportunity cost” in the Open Source
calculation
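The example above is simple proration: benefits only accrue in the months after deployment completes. A minimal sketch of that calculation follows; the function name is hypothetical and it assumes benefits accrue evenly across the year.

```python
def first_year_value(annual_benefit: float, months_to_deploy: int) -> float:
    """Benefit realized in year 1, assuming value accrues evenly per month
    once the deployment is live."""
    months_live = max(0, 12 - months_to_deploy)
    return annual_benefit * months_live / 12


ANNUAL_BENEFIT = 1_200_000  # projected benefit from the example

splunk_value = first_year_value(ANNUAL_BENEFIT, months_to_deploy=2)
oss_value = first_year_value(ANNUAL_BENEFIT, months_to_deploy=10)
lost_opportunity = splunk_value - oss_value

print(f"Splunk year-1 value:      ${splunk_value:,.0f}")
print(f"Open Source year-1 value: ${oss_value:,.0f}")
print(f"Lost opportunity cost:    ${lost_opportunity:,.0f}")
```

The difference between the two results is the figure that would be entered as "lost opportunity cost" on the Open Source side of the TCO.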
Real Example: Splunk vs. Open Source
From a Fortune 50 Telecommunications Company

Project: Executive dashboard for near real-time TV Programming Analytics

Splunk delivered in 92% less calendar time with 99% less effort

Open Source Build
• Multiple open source solutions manually stitched together
• Took 6 people 6 months’ effort
• Modifications are small development projects

“Buy” w/Splunk
• Took 1 person 2 weeks’ effort
• Modifications are made by users on the fly
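The 92% and 99% figures follow from the staffing numbers on the slide. The sketch below assumes "6 people 6 months" means six people working for six calendar months, and converts months to weeks at 52/12 weeks per month; both are assumptions made for illustration.

```python
WEEKS_PER_MONTH = 52 / 12  # assumption: average month length in weeks


def reduction(baseline: float, improved: float) -> float:
    """Fractional reduction of `improved` relative to `baseline`."""
    return 1 - improved / baseline


# Open Source build: 6 people for 6 months.
oss_calendar_weeks = 6 * WEEKS_PER_MONTH
oss_effort_person_weeks = 6 * oss_calendar_weeks

# Splunk: 1 person for 2 weeks.
splunk_calendar_weeks = 2
splunk_effort_person_weeks = 1 * 2

calendar_saving = reduction(oss_calendar_weeks, splunk_calendar_weeks)
effort_saving = reduction(oss_effort_person_weeks, splunk_effort_person_weeks)

print(f"{calendar_saving:.0%} less calendar time, {effort_saving:.0%} less effort")
```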
Consideration 2: Benefit Realization

Splunk
▶ 12,000+ production customers
▶ Vibrant user community
▶ 1000+ Splunk apps
▶ Proven customer success
▶ Documented benefit benchmarks

Open Source
▶ Unknown # of production customers
▶ Vibrant development community
▶ No pre-built app store
▶ No published benchmarks

EXAMPLE: Applying this consideration


▶ An IT Operations project is expected to reduce incident investigation time
▶ Splunk’s documented benchmarks show the customer will achieve 70-90% reduction
▶ Since all functionality must be built for Elastic Stack, it may not achieve the same benefit level
▶ In doing a TCO analysis this must be considered. It would be added as a “lost opportunity cost” to the
Open Source calculation
Consideration 3: Total Cost Of Ownership

▶ Consider all the components of cost


• It’s more than just license fees
▶ Evaluate production-grade deployments
• Small side projects may hide true costs
▶ Scalability and efficiency impact infrastructure and admin costs
• Hardware, people, etc.
▶ Different skill sets are required to build vs. configure
• Highly compensated and scarce open source developers vs. general admins more
widely available and affordable
There Are Many Components Of TCO
License costs are only one of them…

▶ Server, network, workstation hardware
▶ Software license
▶ Installation and integration
▶ Purchasing research
▶ Warranties and licenses
▶ License tracking – compliance
▶ Migration expenses
▶ Risks – vulnerabilities, upgrades, patches, failure
▶ Facility and power
▶ Testing costs
▶ Downtime, outage and failure expenses
▶ Diminished performance (users having to wait, etc.)
▶ Security (breaches, loss of reputation, recovery and prevention)
▶ Backup and recovery process
▶ Technology training
▶ Audit (internal and external)
▶ Insurance
▶ Technology staff
▶ Management time
▶ Replacement
▶ Future upgrade or scalability expenses
▶ Decommissioning
▶ …
Realities of Production Grade Deployments
Considerations for platform selection – Infrastructure, people, and time

Open Source
▶ Multiple separate, open source products
▶ Limited query capabilities
▶ Highly paid, scarce, level 3 or 4 resources required
▶ Infrastructure costs at 5-10x Splunk
▶ Significant development effort required
▶ Lost opportunity cost due to slow time to market

Splunk
▶ Single platform and solution
▶ Rich, powerful query language
▶ Lower cost, available level 1 or 2 resources
▶ Architecture optimized for scale
▶ Community of pre-built ‘apps’
▶ Rapid time to value
Splunk vs. Open Source TCO Model
Full detailed comparison of Splunk vs. Open Source costs based on Customer’s numbers

▶ Hardware acquisition and maintenance


• Servers, storage, load balancers, data center costs
▶ Software licensing and maintenance
• Perpetual, subscription, including renewals
▶ Professional services
• Implementation, configuration
▶ Splunk training / education
• Includes ongoing recommendations
▶ Ongoing administration support
• Sysadmin, architect, developer, power user, Splunk admin
▶ Opportunity Cost
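One way to picture the model is as a per-year record of the cost categories listed above, summed over the comparison period. The sketch below is not the workshop's actual model: the class, the field names, and every dollar figure are hypothetical placeholders chosen only to show the structure.

```python
from dataclasses import dataclass, fields
from typing import Iterable


@dataclass
class YearlyCosts:
    """One year of costs, one field per category in the list above."""
    hardware: float
    software: float
    professional_services: float
    training: float
    administration: float
    opportunity_cost: float

    def total(self) -> float:
        return sum(getattr(self, f.name) for f in fields(self))


def tco(years: Iterable[YearlyCosts]) -> float:
    """Total cost of ownership across all years supplied."""
    return sum(year.total() for year in years)


# Placeholder figures for illustration only; a real model would use the
# customer's own numbers for each year and each platform.
example_years = [
    YearlyCosts(hardware=100, software=200, professional_services=50,
                training=10, administration=300, opportunity_cost=0),
    YearlyCosts(hardware=20, software=200, professional_services=0,
                training=5, administration=300, opportunity_cost=0),
    YearlyCosts(hardware=20, software=200, professional_services=0,
                training=5, administration=300, opportunity_cost=0),
]

print(f"3-year TCO (placeholder units): {tco(example_years):,.0f}")
```

Running the same structure once per platform, with opportunity cost filled in for the slower option, yields the two totals that the comparison charts plot.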
Sample TCO Summaries

[Figure: two bar charts, “TCO for 3 Years, 30 day retention” and “TCO for 3 Years, 60 day retention”, each comparing Splunk and OSS at 200GB, 1TB, 5TB, and 10TB; vertical axis from $- to $30,000,000.]
Yearly Schedule

This chart represents the 3-year benefits for Splunk vs. ELK.


Cumulative Results

This chart represents the cumulative results over 5 years for On-Premises, Splunk Cloud, and AWS.
Security Matters
▶ Open source is community driven; source code is public
▶ Lack of true product management, software development and test/QA
opens real vulnerabilities

“Hackers have taken an interest in Elasticsearch…” (Threatpost)


Splunk vs. Open Source
Summary of the 3 considerations

Splunk
▶ Time to value
• Realized in less than three months
▶ Benefit realization
• Documented benchmarks and proven customer success
▶ TCO: $2,860,251

Open Source
▶ Time to value
• Realized in 6 to 12+ months
▶ Benefit realization
• No published benchmarks or proven customer success
▶ TCO: $5,577,184

Thank You
Don't forget to rate this session in the
.conf2017 mobile app
