See the card insert in the back of the book for your Pearson
Test Prep activation code and special offers.
Cisco
Certified
DevNet
Professional
DEVCOR 350-901
Official Cert Guide
JASON DAVIS,
HAZIM DAHIR,
STUART CLARK,
QUINN SNYDER
Cisco Press
Published by:
Cisco Press
All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means,
electronic or mechanical, including photocopying, recording, or by any information storage and retrieval
system, without written permission from the publisher, except for the inclusion of brief quotations in a
review.
Library of Congress Control Number: 2022933631
ISBN-13: 978-0-13-737044-3
ISBN-10: 0-13-737044-X
The information is provided on an “as is” basis. The authors, Cisco Press, and Cisco Systems, Inc. shall
have neither liability nor responsibility to any person or entity with respect to any loss or damages arising
from the information contained in this book or from the use of the discs or programs that may
accompany it.
The opinions expressed in this book belong to the author and are not necessarily those of Cisco Systems, Inc.
Trademark Acknowledgments
All terms mentioned in this book that are known to be trademarks or service marks have been
appropriately capitalized. Cisco Press or Cisco Systems, Inc., cannot attest to the accuracy of this infor-
mation. Use of a term in this book should not be regarded as affecting the validity of any trademark or
service mark.
Special Sales
For information about buying this title in bulk quantities, or for special sales opportunities (which may
include electronic versions; custom cover designs; and content particular to your business, training goals,
marketing focus, or branding interests), please contact our corporate sales department at corpsales@
pearsoned.com or (800) 382-3419.
For questions about sales outside the U.S., please contact [email protected].
Feedback Information
At Cisco Press, our goal is to create in-depth technical books of the highest quality and value. Each book
is crafted with care and precision, undergoing rigorous development that involves the unique expertise of
members from the professional technical community.
Readers’ feedback is a natural continuation of this process. If you have any comments regarding how we
could improve the quality of this book, or otherwise alter it to better suit your needs, you can contact us
through email at [email protected]. Please make sure to include the book title and ISBN in your
message.
Alliances Manager, Cisco Press: Arezou Gol
Director, ITP Product Management: Brett Bartow
Editorial Assistant: Cindy Teeters
Cover Designer: Chuti Prasertsith
Education is a powerful force for equity and change in our world. It has the potential to
deliver opportunities that improve lives and enable economic mobility. As we work with
authors to create content for every product and service, we acknowledge our responsibil-
ity to demonstrate inclusivity and incorporate diverse scholarship so that everyone can
achieve their potential through learning. As the world’s leading learning company, we have
a duty to help drive change and live up to our purpose to help more people create a bet-
ter life for themselves and to create a better world.
■ Our educational products and services are inclusive and represent the rich diversity
of learners
■ Our educational content accurately reflects the histories and experiences of the
learners we serve
■ Our educational content prompts deeper discussions with learners and motivates
them to expand their own learning (and worldview)
While we work hard to present unbiased content, we want to hear from you about any
concerns or needs with this Pearson product so that we can investigate and address them.
Hazim Dahir, CCIE No. 5536, is a distinguished engineer at the Cisco office of the
CTO. He is working to define and influence next-generation digital transformation
architectures across multiple technologies and verticals. Hazim started his Cisco tenure
in 1996 as a software engineer and subsequently moved into the services organization
focusing on large-scale network designs. He’s currently focusing on developing architec-
tures utilizing security, collaboration, Edge computing, and IoT technologies addressing
the future of work and hybrid cloud requirements for large enterprises. Through his pas-
sion for engineering and sustainability, Hazim is currently working on advanced software
solutions for electric and autonomous vehicles with global automotive manufacturers.
Hazim is a Distinguished Speaker at Cisco Live and is a frequent presenter at multiple
global conferences and standards bodies. He has multiple issued and pending patents and
a number of innovation and R&D publications.
Stuart Clark, DevNet Expert #2022005, started his career as a hairdresser in 1990,
and in 2008 he changed careers to become a network engineer. After cutting his teeth
in network operations, he moved to network projects and programs. Stuart joined Cisco
in 2012 as a network engineer, rebuilding one of Cisco’s global networks and designing
and building Cisco’s IXP peering program. After many years as a network engineer, Stu-
art became obsessed with network automation and joined Cisco DevNet as a developer
advocate for network automation. Stuart contributed to the DevNet exams and was part
of one of the SME teams that created, designed, and built the Cisco Certified DevNet
Expert. Stuart has presented at more than 50 external conferences and is a multitime
Cisco Live Distinguished Speaker, covering topics on network automation and method-
ologies. Stuart lives in Lincoln, England, with his wife, Natalie, and their son, Maddox.
He plays guitar and rocks an impressive two-foot beard while drinking coffee. Stuart can
be found on social media @bigevilbeard.
Quinn Snyder is a developer advocate within the Developer Relations organization inside
Cisco, focusing on datacenter technologies, both on-premises and cloud-native. In this
role, he aligns his passion for education and learning with his enthusiasm for helping the
infrastructure automation community grow and harness the powers of APIs, structured
data, and programmability tooling. Prior to his work as a DA, Quinn spent 15 years in a
variety of design, engineering, and implementation roles across datacenter, utility, and
service provider customers for both Cisco and the partner community. Quinn is a proud
graduate of Arizona State University (go Sun Devils!) and a Cisco Network Academy
alumnus. He is the technical co-chair of the SkillsUSA–Arizona Internetworking contest
and is involved with programmability education at the state and regional level for the
Cisco Networking Academy. Quinn resides in the suburbs of Phoenix, Arizona, with his
wife, Amanda, and his two kids. In his free time, you can find him behind a grill, behind
a camera lens, or on the soccer field coaching his daughter’s soccer teams. Quinn can be
found on social media @qsnyder, usually posting a mixture of technical content and his
culinary creations.
Joe Clarke, CCIE No. 5384, is a Cisco Distinguished Customer Experience engineer. Joe
has contributed to the development and adoption of many of Cisco’s network operations
and automation products and technologies. Joe evangelizes these programmability and
automation skills in order to build the next generation of network engineers. Joe is a Distinguished Speaker at Cisco Live and is certified as a CCIE and a Cisco Certified DevNet Specialist (and a member of the elite DevNet 500). Joe provides network consulting and design support for the Cisco Live and Internet Engineering Task Force (IETF) conference
network infrastructure deployments. He also serves as co-chair of the Ops Area Work-
ing Group at the IETF. He is a coauthor of Network Programmability with YANG: The
Structure of Network Automation with YANG, NETCONF, RESTCONF, and gNMI as
well as a chapter coauthor in the Springer publication Network-Embedded Management
and Applications: Understanding Programmable Networking Infrastructure. He is an
alumnus of the University of Miami and holds a bachelor of science degree in computer
science. Outside of Cisco, Joe is a commercial pilot, and he and his wife, Julia, enjoy fly-
ing around the east coast of the United States.
Dedications
Jason Davis:
When you set off to write a book, how do you quantify the time needed to study, write,
and refine? You need time to obtain and configure the lab equipment for examples. There
are still family, career, personal health, and welfare activities that demand attention. How
can you plan when your job and a global health crisis demand the utmost in flexibility
and focus? To all the family and supporters of writers, thank you; there should be a spe-
cial place in heaven for you. I am indebted to my wife, Amy, and my children, Alyssa,
Ryan, Micaela, and Maleah. You provided me with the time and space to do this, espe-
cially through the hardships of 2021. Finally, I am thankful that my God has provided
opportunities and skills to impact technology in some small part and in a positive way;
because He has shown His love for me in innumerable ways, I am motivated to share the
same with others.
Hazim Dahir:
To my father, my favorite librarian: I wish this book could have made it to your shelf. I
think of you every day. To my mother, with heartfelt gratitude for all your love, prayers,
and guidance. I am a better person, husband, father, brother, and engineer because of
you two.
To my amazing wife, Angela: no words or pages can grasp how indebted I am to you for
your encouragement, patience, and wisdom.
To my children, Hala, Leila, and Zayd. I love watching you grow. Thank you for being
such a joy. Never give up on your dreams.
Stuart Clark:
This book is dedicated to my amazing Natalie (Mouse) and our son, Maddox, and our
beloved dog, Bailey, who sadly passed while I was authoring this book. Without their
love, support, and countless cups of coffee, this book would have never been possible. I
would like to thank my father, George, and mother, Wendy Clark, for providing me with
the work ethic I have today; and Natalie’s father and mother, Frank and Barbara Thomas,
for giving me my very first networking book and helping me transition into a new
career. A big thank you to my mentors Mandy Whaley, Denise Fishburne, Joe Clarke,
and my metaldevops brother Jason Gooley, who have helped and guided me for many
years at Cisco.
Quinn Snyder:
Writing a book is an unknown unknown to a first-time author. You have read the prod-
ucts from others, but you don’t know what you don’t know when you set out to write
your own. The only known you have is the support of the people behind you, who make
it possible to think, to work, to create, and to inspire you. They are there for your late
nights, the constant tapping of the keys, listening to you ramble on about what you’ve
just put to paper. This book would not have been possible without the support of
those behind me, propping me up, cheering me along, and believing in me. To my wife,
Amanda, you are my rock—my biggest fan and supporter—and the one who has always
believed in me, and I wouldn’t have been able to do this without you. To my children,
Nicolas and Madelyn, you have given me the space to create and do, and have inspired
me to show you that anything is possible. To my mother, Cynthia, you have inspired a
never-quit work ethic that sticks with me to this day. I thank you all for giving me those
gifts, and I dedicate this book to you.
Acknowledgments
Jason Davis:
Most who know me understand that I appreciate telling stories, especially with humor.
While that humor may not be recognized, it is mine nonetheless, so thank you to those
who have endured it for years. I am thankful for the opportunity to pivot from writ-
ing blogs, whitepapers, and presentations to writing this book. It has been an exercise
in patience and personal growth. The support team at Cisco Press—Nancy, Ellie, and
Tonya—you’ve been great, valuable partners in this endeavor.
I am thankful to a cadre of supportive managers at Cisco over the years who helped me
grow technically and professionally; Mike, Dave, Rich, and Ghaida, you have been awe-
some! Hazim, Stuart, and Quinn, thank you for joining me on this journey. Besides being
wonderful coworkers, our collaboration, though separated by states and countries, has
turned into friendship that has made a mark.
Hazim Dahir:
This book would not have been possible without two SUPER teams. The A Team: Jason,
Stuart, and Quinn. Thank you for an amazing journey and expert work. “Technically,” you
complete me! And, the Cisco Press team: Nancy Davis, Ellie Bru, Tonya Simpson, and
their teams. Thank you for all your help, guidance, and most importantly, your patience.
Special thanks go to Kaouther Abrougui for her contributions to the Security chapter,
Mithun Baphana for his Webex contributions, and to David Wang for his Git contribu-
tions. Kaouther, Mithun, and David are experts in their fields, and we’re lucky to have
them join us.
Many thanks go to Firas Ahmed, for his encouragement, technical reviews, and support.
A great deal of gratitude goes to Hana Dahir, an engineer at heart, who reviewed various
content for me.
I am very thankful to the following Cisco colleagues and leaders: Yenu Gobena, Saad
Hasan, Carlos Pignataro, Jeff Apcar, Vijay Raghavendran, Ammar Rayes, Nancy
Cam-Winget, and many others who supported me throughout my career and encouraged
me to break barriers.
Stuart Clark:
Being a first-time author was such a daunting task, but the great people at Cisco Press, Nancy and Ellie; my coauthors, Hazim, Jason, and the BBQ Pit Master Quinn; and technical reviewers Joe Clarke and Bryan Byrne kept me in check and supported me through every chapter. A
huge thank you to my leadership at DevNet—Matt Denapoli, Eric Thiel, and Grace Fran-
cisco—for always supporting my goals and ambitions. Janel Kratky, Kareem Iskander, and
Hank Preston for their vast support and encouragement since joining the DevNet team
in 2017. My dearest friends, Patrick Rockholz and Du’An Lightfoot, whose faith in me is
always unfaltering.
Quinn Snyder:
I’d like to thank the crew at Cisco Press for wading through my incoherent ramblings
and turning them into something worth reading. To Ellie and Nancy, thank you for the
patience and dealing with my inane questions. To Joe Clarke, thank you for your techni-
cal expertise in the review process.
Thank you to my DevNet leaders—Grace, Eric, Matt, and Jeff—for bringing me into this
amazing team and supporting my growth and development to make this work possible.
A special thanks to the guy who taught me “how to DA,” John McDonough. You set me
on a path for success and showed me the ropes, and for that I am grateful. To Kareem
Iskander, thanks for just being there. You always have listened to me, picked me up, and
pushed me forward. To my partners in crime (both with and without beards)—Jason,
Hazim, and Stuart—you guys believed that I could and gave me a shot. I have appreciated
our meetings, our late-night messaging sessions, and the friendship that has developed.
Finally, a special thank you to the man who guided me so many years ago, Barry Wil-
liams. Without your Cisco classes, your instruction, your belief, and never accepting “just
good enough,” I wouldn’t have had the foundation that I do today. You helped a kid from
rural Arizona see the big world and what was possible if I stuck with it, and because of
that, I am where I am today.
Contents at a Glance
Introduction xxviii
Part II APIs
Chapter 5 Network APIs 130
Part V Platforms
Chapter 16 Cisco Platforms 568
Glossary 675
Index 684
Online Elements
Appendix C Memory Tables
Glossary
Reader Services
Other Features
Register your copy at www.ciscopress.com/title/9780137370443 for convenient access
to downloads, updates, and corrections as they become available. To start the registration
process, go to www.ciscopress.com/register and log in or create an account.* Enter the
product ISBN 9780137370443 and click Submit. When the process is complete, you
will find any available bonus content under Registered Products.
*Be sure to check the box that you would like to hear from us to receive exclusive
discounts on future editions of this product.
Contents
Introduction xxviii
Tracing 77
Good Documentation Practices: An Observability Reminder 78
Database Selection Criteria 79
Database Requirements Gathering 80
Data Volume 81
Data Velocity 82
Data Variety 82
Exam Preparation Tasks 83
Review All Key Topics 83
Complete Tables and Lists from Memory 83
Define Key Terms 84
References 84
Part II APIs
Deploy 205
Adding Deployment to Integration 207
Deploying to Infrastructure (Terraform + Atlantis) 207
Deploying Applications (Flux + Kubernetes) 213
Application Deployment Methods over Time 218
The 2000s: Sysadmins, Terminals, and SSH 218
The 2010s: Automated Configuration Management 220
The 2020s: The Clouds Never Looked So Bright 224
Managed Kubernetes (e.g., GKE) 224
Containers on Serverless Clouds (e.g., AWS ECS on Fargate) 227
Serverless Functions (e.g., AWS Lambda) 234
Software Practices for Operability: The 12-Factor App 238
Factor 1: Codebase 239
Factor 2: Dependencies 239
Factor 3: Config 239
Factor 4: Backing Services 240
Factor 5: Build, Release, Run 240
Factor 6: Processes 240
Factor 7: Port Binding 241
Factor 8: Concurrency 241
Factor 9: Disposability 241
Factor 10: Dev/Prod Parity 241
Factor 11: Logs 242
Factor 12: Admin Processes 242
Summary 243
Exam Preparation Tasks 243
Review All Key Topics 243
Complete Tables and Lists from Memory 244
Define Key Terms 244
References 244
Terraform 515
Terraform or Ansible: A High-Level Comparison 518
Business and Technical Requirements 519
Architectural Decisions 519
Technical Debt 520
Exam Preparation Tasks 521
Review All Key Topics 521
Complete Tables and Lists from Memory 522
Define Key Terms 522
References 522
Part V Platforms
Glossary 675
Index 684
Online Elements
Appendix C Memory Tables
Glossary
■ Boldface indicates commands and keywords that are entered literally as shown. In
actual configuration examples and output (not general command syntax), boldface
indicates commands that are manually input by the user (such as a show command).
■ Braces within brackets ([{ }]) indicate a required choice within an optional element.
Introduction
This book was written to help candidates improve their network programmability and
automation skills—not only for preparing to take the DevNet Professional DEVCOR
350-901 exam but also for real-world skills in any production environment.
You can expect that the blueprint for the DevNet Professional DEVCOR 350-901 exam
tightly aligns with the topics contained in this book. This was by design. You can follow
along with the examples in this book by utilizing the tools and resources found on the
DevNet website and other free utilities such as Postman and Python.
We are targeting any and all learners who are learning these topics for the first time
as well as those who wish to enhance their network programmability and automation
skillset.
One key methodology used in this book is to help you discover the exam topics that you
need to review in more depth, to help you fully understand and remember those details,
and to help you prove to yourself that you have retained your knowledge of those topics.
So, this book does not try to help you pass by memorization but helps you truly learn
and understand the topics. The DevNet Professional exam is just one of the foundation
exams in the DevNet certification suite, and the knowledge contained within is vitally
important to consider yourself a truly skilled network developer. This book would do
you a disservice if it didn’t attempt to help you learn the material. To that end, the book
will help you pass the DevNet Professional exam by using the following methods:
■ Helping you discover which test topics you have not mastered
■ Supplying exercises and scenarios that enhance your ability to recall and deduce the
answers to test questions
Passing the DevNet Professional DEVCOR 350-901 exam is a milestone toward becom-
ing a better network developer, which, in turn, can help with becoming more confident
with these technologies.
Regardless of the strategy you use or the background you have, the book is designed
to help you get to the point where you can pass the exam with the least amount of time
required. However, many people like to make sure that they truly know a topic and thus
read over material that they already know. Several book features will help you gain the
confidence that you need to be convinced that you know some material already and to
also help you know what topics you need to study more.
Note that if you buy the Premium Edition eBook and Practice Test version of this book
from Cisco Press, your book will automatically be registered on your account page. Sim-
ply go to your account page, click the Registered Products tab, and select Access Bonus
Content to access the book’s companion website.
■ Print book: Look in the cardboard sleeve in the back of the book for a piece of
paper with your book’s unique PTP code.
■ Premium Edition: If you purchase the Premium Edition eBook and Practice Test
directly from the Cisco Press website, the code will be populated on your account
page after purchase. Just log in at www.ciscopress.com, click Account to see details
of your account, and click the Digital Purchases tab.
■ Amazon Kindle: For those who purchase a Kindle edition from Amazon, the access
code will be supplied directly from Amazon.
■ Other Bookseller eBooks: Note that if you purchase an eBook version from any
other source, the practice test is not included because other vendors to date have not
chosen to vend the required unique access code.
NOTE Do not lose the activation code because it is the only means with which you can access the QA content that comes with the book.
When you have the access code, to find instructions about both the PTP web app and
the desktop app, follow these steps:
Step 1. Open this book’s companion website, as shown earlier in this Introduction
under the heading “How to Access the Companion Website.”
Step 2. Click the Practice Exams button.
Step 3. Follow the instructions listed there both for installing the desktop app and for
using the web app.
Note that if you want to use the web app only at this point, just navigate to
www.pearsontestprep.com, establish a free login if you do not already have one, and
register this book’s practice tests using the registration code you just found. The process
should take only a couple of minutes.
NOTE Amazon eBook (Kindle) customers: It is easy to miss Amazon’s email that lists
your PTP access code. Soon after you purchase the Kindle eBook, Amazon should send
an email. However, the email uses very generic text and makes no specific mention of
PTP or practice exams. To find your code, read every email from Amazon after you pur-
chase the book. Also do the usual checks for ensuring your email arrives, like checking
your spam folder.
NOTE Other eBook customers: As of the time of publication, only the publisher and
Amazon supply PTP access codes when you purchase their eBook editions of this book.
The core chapters, Chapters 1 through 17, cover the following topics:
■ Chapter 4, “Version Control and Release Management with Git”: This chapter dis-
cusses the basics of version control, Git’s way of managing version controls and col-
laboration, and then covers in detail branching strategies and why they’re important
for the success of any project.
■ Chapter 5, “Network APIs”: This chapter covers how software developers can use
application programming interfaces (APIs) to communicate with and configure net-
works and how APIs are used to communicate with applications and other software.
■ Chapter 10, “Automation”: This chapter covers topics such as SDN, APIs, and
orchestration. Additional helpful context is provided around the impact to IT service
management.
■ Chapter 11, “NETCONF and RESTCONF”: This chapter covers the NETCONF,
YANG, and RESTCONF technologies with examples that will be helpful in your
preparation and professional use.
■ Chapter 16, “Cisco Platforms”: Finally, this chapter contains a mix of practical API
and SDK usage examples across several platforms, such as Webex, Meraki, Intersight,
DNA Center, and AppDynamics. If you have some of these solutions, the examples
should reveal methods to integrate with them programmatically. If you don’t use the
platforms, this chapter should reveal the “art of the possible.”
■ Chapter 17, “Final Preparation”: This chapter details a set of tools and a study plan
to help you complete your preparation for the DEVCOR 350-901 exam.
Each version of the exam can have topics that emphasize different functions or features,
and some topics can be rather broad and generalized. The goal of this book is to provide
the most comprehensive coverage to ensure that you are well prepared for the exam.
Although some chapters might not address specific exam topics, they provide a founda-
tion that is necessary for a clear understanding of important topics. Your short-term goal
might be to pass this exam, but your long-term goal should be to become a qualified net-
work developer.
It is also important to understand that this book is a “static” reference, whereas the
exam topics are dynamic. Cisco can and does change the topics covered on certification
exams often.
This exam guide should not be your only reference when preparing for the certification
exam. You can find a wealth of information available at Cisco.com that covers each topic
in great detail. If you think that you need more detailed information on a specific topic,
read the Cisco documentation that focuses on that topic.
Note that as automation technologies continue to develop, Cisco reserves the right to
change the exam topics without notice. Although you can refer to the list of exam topics
in Table I-1, always check Cisco.com to verify the actual list of topics to ensure that you
are prepared before taking the exam. You can view the current exam topics for any Cisco certification exam by visiting the Cisco.com website, choosing Menu, then Training & Events, and then selecting from the Certifications list. Note also that, if needed, Cisco Press
might post additional preparatory content on the web page associated with this book at
http://www.ciscopress.com/title/9780137370443. It’s a good idea to check the website a
couple of weeks before taking your exam to be sure that you have up-to-date content.
Credits
Figure 4-1 through Figure 4-29, Figure 4-32 through Figure 4-34, Figure 6-8, Figure 6-9,
Figure 11-10, Figure 11-11, Figure 12-6, Figure 12-27, Figure 12-28: GitHub, Inc
Figure 5-3 through Figure 5-6, Figure 6-10, Figure 6-11, Figure 6-13: IMDb-API
Figure 5-8, Figure 11-14 through Figure 11-19, Figure 16-16 through Figure 16-19,
Figure 16-31, Figure 16-41 through Figure 16-43: Postman, Inc
Figure 7-9, Figure 7-12 through Figure 7-16: Amazon Web Services, Inc
Figure 12-21 through Figure 12-25, Figure E-1, Figure E-2: Grafana Labs
Table 2-2: Permission to reproduce extracts from British Standards is granted by BSI
Standards Limited (BSI). No other use of this material is permitted. British Standards can
be obtained in PDF or hard copy formats from the BSI online shop: https://shop.
bsigroup.com/.
■ Software Architecture and Design: This section covers the basics of software archi-
tecture definitions and terminologies.
■ Software Development Lifecycle (SDLC) Approach: This section covers SDLC and
the basics of the software design, development, testing, and deployment lifecycle.
■ Software Development Models: This section describes various models like Agile and
Waterfall.
■ Architecture and Code Reviews: This section describes several types of code review,
including peer and stakeholder reviews.
■ Software Testing: This section covers the various types of software testing.
As you start your journey toward DEVCOR certification, it is important to understand that
you will be building software for the purpose of automating operational functions that
consume several human hours. What you build and automate will be used to reduce human
hours and errors while providing a consistent way of conducting various tasks.
In this chapter, we first discuss software development and design in the context of IT opera-
tional functions, and then attempt to briefly describe how we got here and why. In addition,
we want you to understand the software development concepts from a network domain
expertise perspective. We also focus on the architecture design and delivery lifecycle of
software products. The keyword here, after design, is delivery, which encompasses all
aspects of building, testing, and releasing software products. This chapter sets the stage with
a brief description of the lifecycle and various processes that most organizations follow to
develop, test, and maintain software.
6. Which of the following principles is not associated with the Lean software develop-
ment model?
a. Fast and frequent value delivery
b. Continuous learning and innovation
c. Team empowerment
d. Holistic, upfront planning
7. Which of the following metrics does DevOps use to assess performance?
a. Deployment frequency
b. Lead time
c. Change volume
d. All of these answers are correct.
8. Unit testing is considered a type of
a. Black-box testing conducted with no knowledge of the system
b. Gray-box testing conducted with some knowledge of the system
c. White-box testing conducted with full knowledge of the system
d. None of these answers are correct.
9. An important aspect of software quality assurance is peer code review, which may be
looking at
a. Functionality
b. Complexity of code
c. Naming conventions
d. All of these answers are correct.
10. Why do you need architectural patterns?
a. Architectural patterns introduce complexity and give the impression of sophisti-
cated programming skills.
b. Architectural patterns ensure consistency of the code because it is a reusable
solution to a commonly occurring problem within a given context.
c. There are so many patterns, and you have to choose one.
d. All of these answers are correct.
Foundation Topics
A Brief History of the Future
Some of us started programming out of necessity, some out of the desire to simplify routine
tasks, some saw an opportunity to improve life around them, and some wanted that job.
No matter how you got here, you’re now an application developer looking to automate the
configuration and orchestration of simple and complex tasks. Over the last few years, a great
deal of change has affected the software development world—not necessarily related to
software programming as much as it relates to the why, how, and where. The logic, we like to
believe, stayed intact. In addition, recent advances in hardware, CPU, and memory, along with a significant reduction in price point, brought an abundance of processing power to a large number of people globally. In turn, these changes broke down many barriers and introduced the world to millions of new software developers and a great deal of innovation that accompanied them.
We also like to differentiate between a software engineer and a software developer. Sim-
ply put, we’re differentiating between design and execution, respectively. We think of the
software engineer as the person who takes a problem, breaks it down, proposes a solution
based on requirements and quality trade-offs, builds an architecture that solves the problem,
and then executes a strategy for making that solution happen. The software developer, on
the other hand, is a team member of the execution team. Don’t get us wrong; there is a lot of
creativity in execution, and the differentiation between an engineer and a developer is not
meant to show preference or hierarchy.
The Evolution
Many books have been written about the evolution of software development and the many
revolutions that have affected various aspects of humanity: engineering, business, health
care, and so on. The evolution we want to discuss here is the one that relates to the network
and the various tasks for building and managing one. It is also about running a business. How
can you bring flexibility to a network, application, and processes to help run your business?
As we discuss the relationship between software development and managing a business, we
want to discuss the ever-continuing evolution that has got us here. Without putting timelines
and designating specific years as the beginning or end of something, we go through a quick
and yet necessary journey of the past to justify (and prepare for) the future.
In the beginning, networks and systems were limited to the four walls of the enterprise. The
majority of management tasks were limited to the capabilities provided by the vendor, and
some of those were even proprietary. That paradigm was bad for interoperability and sim-
plicity of operations.
Those capabilities mainly were for status and simple configuration and customization func-
tions. Then Simple Network Management Protocol (SNMP) adoption increased, and there
was a standard model where an “agent” running on the “managed” system used a standard
protocol to communicate with a “manager” or “management server.” The adoption of SNMP
and network management systems (NMS) provided increased efficiency in network opera-
tions and, subsequently, provided some relief for the system administrator who used to con-
figure and maintain various systems individually.
Shortly after, the International Organization for Standardization (ISO) introduced the FCAPS
model, which stands for Fault, Configuration, Accounting, Performance, and Security. The
FCAPS model provided a standard way for defining and assessing the functions of network
management. Security in the form of privilege assignment and information masking was an essential addition as organizations grew in size and increased their adoption of technology, not to mention the staff (internal and external) that manages that growth.
As we discuss the network management evolution that brought software development to the
network, we cannot ignore the network’s evolution. The network experienced new develop-
ments and pressures to expand into remote campuses or branches and to support higher
bandwidths reliably for voice and video integration. In addition, as we deployed additional
management and visibility systems, we were quick to see clear inefficiencies in the design
and utilization of the network.
For example, we saw that 70+ percent of our servers were running at 25 percent utilization
of CPU and memory—hence, the birth of “virtualization” and the introduction of another
management layer. With virtualization, we were able to improve efficiency at the server level,
and we saw that it was good for performance and also for business. Then we moved back to
the network and also virtualized it and its main functions, such as routers, switches, firewalls,
and load balancers, and we’re still doing it.
Having virtualized the majority of the enterprise IT functions, we started observing “owner-
ship” issues: who owns what, who’s responsible for what, and why? As the control (or own-
ership) boundaries were quickly dissolving, it was clear that traditional network management
capabilities (and the traditional network manager) were not fully equipped or fast enough to
match the speed of business.
The traditional network operations center (NOC) that kept an eye on the network and busi-
ness applications needed to evolve as well. It needed to utilize software development tech-
niques and technologies to detect network issues better, alert the right people, and possibly
take action.
Eventually, we moved up the stack and started looking at business processes and workflows,
and through virtualization and APIs, we were able to automate the full enterprise stack and
merge (consolidate) the various functions of the enterprise business process management
into a single team. The business application developers wanted flexibility deploying their
code as frequently as the business desired; the network and systems managers wanted to
ensure a higher degree of availability and stability.
[Figure: the evolution of network operations from reactive (monitoring, incident response, capacity planning, change management) to proactive (modeling, telemetry, policy, orchestration, automation, adoption)]
■ Application developers
■ Continuous integration and continuous delivery reduce the time required to update
or deliver software while improving quality.
■ Automating processes:
■ Reduce human error and improve the responsiveness of the development environ-
ment.
■ Measuring DevOps teams and efforts using the following sample metrics:
■ Deployment frequency.
■ Lead time.
■ Change volume.
[Figure 1-2: enterprise architecture showing the business process (and architecture), the system architectures beneath it, the software (SW) components within each system, and the data they use]
The software architecture of a system is the set of structures needed to reason about the
system, which comprise software elements, relations among them, and properties of both.
For example, the purpose of Figure 1-2 is to illustrate that the software you’re trying to
architect has clear relationships (and dependency) on other software as well as on data
sources and targets (storage or databases). The relationships can be localized to the system
or cross boundaries to other systems or business processes within the enterprise. The rela-
tionships among the various components may require different data formats, connectors, and
other types of translations that allow heterogeneous systems to communicate or exchange
data. You can find more details in Chapter 2, “Software Quality Attributes.”
In general, architecture defines the organization of the system, all components, their relation-
ships, interactions, dependencies, and the requirements and principles governing the design.
It is also a source of truth when it comes to agreed-upon requirements and design principles,
especially as exceptions are made or changes are requested. It also serves as a communica-
tion vehicle among stakeholders.
In the next few sections, we look at a few concept frameworks regarding how software is
conceptualized, developed, tested, reviewed, and maintained. We follow the general founda-
tional direction by looking at
■ Software requirements
■ Development lifecycle
■ Code review
■ Testing
■ Version control
The remainder of the book handles these topics with examples and specific use cases.
Architecture Requirements
There is no architecture without requirements. Every architecture has a list of requirements
it strives to fulfill. The requirements are discussed, documented, and agreed upon by all
stakeholders.
When you’re designing software, it is often tough to evaluate how the application should
be built and which design pattern should be used. While your experience as a developer
and acceptable practices come in handy, it is often better to start objectively by collecting
requirements. There are two main categories for capturing requirements: functional and
nonfunctional.
It is not uncommon to see an organization use a third category of requirements called
constraints or limitations. Constraints refer to design or architecture decisions that are
somehow beyond your control or non-negotiable. Some of those design decisions have been
previously made and you, as a developer, have to comply with them. Examples may include
the use of a specific programming language, the use of another software system (external to
your organization), interoperability with a specific system, or possibly, workload movement
with a specific cloud provider. We focus on the main two categories.
Functional requirements specify “what” the software in question should do. They
describe the functionalities of a system, such as
■ Business process
■ Data manipulation
■ User interaction
■ Media processing
■ Administrative functions
Functional requirements can be measured in “yes” or “no” terms and usually include the
words shall or can. For example, some functional requirements for a document editing web
application would be
■ Performance
■ Scalability
■ High availability
■ Modularity
■ Interoperability
■ Serviceability
■ Testability
■ Security
As you can see from the list, these values are also thought of as the quality attributes of the
system. They are often expressed in words like must and should and are commonly mea-
sured over an interval or a range.
NOTE There is always some kind of a trade-off among nonfunctional requirements when
considering the final design. Nonfunctional requirements must be considered as a group
because they will, most certainly, affect each other. For example, increasing scalability may
negatively affect performance.
■ The web GUI should take less than two seconds to load.
■ Users are required to register before they can use the system.
When writing the nonfunctional requirements of your system, avoid using nonspecific
words. Instead of writing that “the request must be fast,” write “the request must be com-
pleted in under 300 milliseconds (300 ms).” Instead of “the GUI must be user-friendly,” say
something like “the GUI must respond in under 500 ms, and the user shall be able to access
all main functionalities from the home screen.”
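To make such criteria verifiable, you can turn a measurable nonfunctional requirement directly into an automated check. The following minimal sketch, written in Python with only the standard library, times a request and compares the result to the 300 ms target mentioned above; the URL is a placeholder chosen for illustration, not part of any real requirement document.

import time
import urllib.request

MAX_LATENCY_MS = 300           # measurable criterion from the nonfunctional requirement
URL = "https://example.com/"   # placeholder endpoint used only for illustration

start = time.perf_counter()
with urllib.request.urlopen(URL, timeout=5) as response:
    response.read()
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"Request completed in {elapsed_ms:.1f} ms")
if elapsed_ms <= MAX_LATENCY_MS:
    print("Nonfunctional requirement met")
else:
    print(f"Requirement violated: response exceeded the {MAX_LATENCY_MS} ms target")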
Functional and nonfunctional requirements are closely related, and it certainly does not
matter where you start the requirement capturing process. In our experience, it is a case-by-
case situation. For example, if we were starting a new innovative type of project, we always
started at the functional requirement level. However, in other projects where we were rein-
venting the wheel, and the essence of the project was to build software that superseded or
replaced existing software, we tended to start by looking at nonfunctional requirements first.
Any way you look at it, at the end of the day, both types of requirements need to be cap-
tured in a single document. We just cannot think of any normal situation where a developer
is concerned with only one set of requirements. Table 1-2 gives a few examples of differences
between the two types.
Another interesting way to look at it is that we would argue that businesses are measured by
the revenue they generate, which can be attributed to their success in their business area. In
this case, business requirements (that fall under functional requirements) are prioritized.
But what about business requirements that verge on the edge of nonfunctional
requirements—for example, scalability and security for a cloud service provider?
As with most things, it is crucial to evaluate requirements to see which ones should be pri-
oritized and which ones impact your application the most. Product owners may focus purely
on functional requirements, and it may be up to the development team to take ownership of
nonfunctional requirements.
To consider the impact that a requirement has on application quality, assess the requirement
by identifying the requirement and creating user stories (for example, users need to have a
good experience while browsing), then determining the measurable criteria (for example,
round-trip time should be under 500 ms), and finally, identifying the impact (for example,
users will not visit your site if it is too slow).
Functional Requirements
Functional requirements are designed to be read by a general, not necessarily technical, audi-
ence. Therefore, they are often derived from user stories. User stories are short descriptions
of functionalities, as seen from the end-user perspective. They focus more on how users
interact with the system and what they expect the goal of the interaction to be, rather than
what the system does, but because the software is often very domain-specific, they help you
to understand the system better as a developer. You will see how significant the user stories
are when we discuss the Agile development approach.
Another concept used in the formulation of functional requirements is the use case. In
contrast to user stories, use cases focus less on a user’s interaction and more on the cause
and effect of actions. If a user story stated, “A user can save documents by clicking Save,”
the related use case would state, “When the Save button is clicked, the current document is
saved to the server’s NoSQL database and to the client’s local cache.”
Functional requirements should
■ Be testable.
■ Fully cover every scenario, including what the system should not do.
■ Include complete information about data flow in and out of the system.
When you’re gathering requirements, consider the business, administrative, user, and system
requirements. Business requirements are high-level requirements from a business perspec-
tive, such as the users being able to log in with credentials from an existing application.
Administrative requirements take care of routine actions, such as logging every change made
to a document. User requirements contain the desired outcome of certain actions, like the
creation of a new document. System requirements describe software and hardware specifica-
tions, like a specific error code being returned when an unauthorized user tries to access a
document that is not theirs.
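One way to keep these categories visible during requirements gathering is to record each requirement as structured data that the team can filter and review. The short Python sketch below is illustrative only; the Requirement class, the IDs, and the grouping logic are our own invention, with the descriptions drawn from the examples in the preceding paragraph.

from dataclasses import dataclass

@dataclass
class Requirement:
    req_id: str
    category: str       # "business", "administrative", "user", or "system"
    description: str

requirements = [
    Requirement("REQ-01", "business",
                "Users can log in with credentials from an existing application"),
    Requirement("REQ-02", "administrative",
                "Every change made to a document is logged"),
    Requirement("REQ-03", "user",
                "A user can create a new document"),
    Requirement("REQ-04", "system",
                "A specific error code is returned when an unauthorized user "
                "tries to access a document that is not theirs"),
]

# Group the gathered requirements by category for review with stakeholders.
for category in ("business", "administrative", "user", "system"):
    ids = [r.req_id for r in requirements if r.category == category]
    print(f"{category}: {', '.join(ids)}")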
Nonfunctional Requirements
Nonfunctional requirements specify the quality attributes or technical requirements of
the system. They are more related to the system’s architecture than its functionalities and
are often constraints on the technical properties of a system. Nonfunctional requirements
are often very technical and less thought about than functional requirements. Although
functional requirements are commonly defined with the project’s stakeholders’ help and are
more specific, nonfunctional requirements are more implicit and sometimes just assumed.
An example of an assumed but not commonly listed nonfunctional requirement is a short
loading time for a web page (for example, less than two seconds). Another example of an
Architectural Patterns
Sometimes, the topic of architectural patterns is a philosophical discussion because different
patterns have different strengths and usage contexts. Some allow for easier integration; some
allow for easier usability. One easy definition we once saw on Wikipedia goes like this:
An architectural pattern is a general, reusable solution to a commonly occurring problem in software architecture within a given context.
We like this simple definition. The context, or the paradigm, is the key word here. It relates
to the organization’s enterprise architecture and defines the software design schema to be
followed. Having a defined pattern allows for collaboration and onboarding of new project
participants without jeopardizing efficiency or productivity of the project. In this section we
briefly cover a few common patterns in software architecture:
Microservices pattern: It’s an architectural approach for breaking an application (a mono-
lithic application) into smaller, independent, and possibly distributed components. Through
deterministic interfaces, they interact together to deliver the intended functionality of a
monolithic system but with higher flexibility, performance, and scalability. Microservices are
discussed in Chapter 2.
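As a small illustration of independent components with deterministic interfaces, the sketch below runs one self-contained service that exposes a single HTTP endpoint using only Python's standard library. The service name, port, and payload are hypothetical; a production microservice would add packaging, service discovery, health checks, and typically a web framework.

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class InventoryService(BaseHTTPRequestHandler):
    """One small, independently deployable service with a single, well-defined interface."""

    def do_GET(self):
        if self.path == "/devices":
            body = json.dumps({"devices": ["switch-01", "router-01"]}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

if __name__ == "__main__":
    # Each microservice runs, scales, and is updated independently of the others.
    HTTPServer(("0.0.0.0", 8080), InventoryService).serve_forever()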
Service-oriented architecture (SOA): Distributed applications or components that provide and
consume services. The service provider and service consumer don’t have to use the same lan-
guages or be hosted on the same platforms. They are developed and deployed independently.
Every component has interfaces that define the services it provides or the services it requests.
In addition to the provider and consumer, there are a few important components of this pattern:
■ The service registry assists the service provider with publishing its services, defining how they should be offered and with what type of availability, security, or metering (billing), to name a few.
■ The service broker provides details about the services to service consumers.
SOA allows for faster deployment, lower cost, and higher scalability. At the same time,
because service providers and consumers are built independently and possibly by different
organizations, this may introduce limitations in flexibility and scalability when a high level
of customization is required.
Event-driven: This model is very common for customer engagement applications where an
event (a change in state) is generated, captured, and processed in real time (or near-real time).
It is like SOA in the sense that there is an event producer and an event consumer (which
listens for the event). When the event is generated, the event producer detects the event and
sends it in the form of a message. The producer sends the message not knowing or caring
about the consumer and what the consumer may or may not do with it. The event-driven
architecture has two different models:
■ Event streaming: Multiple events or continuous streams of events are detected and
logged to a database where consumers or subscribers can read from the database in a
customized fashion (e.g., a timestamp or a duration of time).
Event-driven architecture patterns saw a rise with the Internet of Things (IoT) sensors,
devices, and applications, especially as streams of data (and events) are analyzed for pattern
detection or predictive analysis of events in areas like health care or manufacturing.
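The following minimal in-process sketch shows the decoupling at the heart of this pattern: the producer publishes an event without knowing or caring which consumers react to it. The event name, payload, and handler functions are invented for illustration; a real deployment would publish through a message broker or streaming platform rather than an in-memory dictionary.

from collections import defaultdict

subscribers = defaultdict(list)    # event type -> list of consumer callbacks

def subscribe(event_type, handler):
    subscribers[event_type].append(handler)

def publish(event_type, payload):
    # The producer emits the event without knowing or caring who consumes it.
    for handler in subscribers[event_type]:
        handler(payload)

def alert_operations(payload):
    print(f"Operations alerted: {payload['interface']} on {payload['device']} is down")

def open_ticket(payload):
    print(f"Ticket opened for device {payload['device']}")

# Two independent consumers register interest in the same event type.
subscribe("interface_down", alert_operations)
subscribe("interface_down", open_ticket)

# An event (a change in state) is generated and every interested consumer reacts to it.
publish("interface_down", {"device": "router-01", "interface": "GigabitEthernet0/1"})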
Model-view-controller (MVC): This model relies on three components of an application: model, view, and controller:
■ Model: This component manages the application’s data and the logic for processing and storing it; it acts on the input passed along by the controller and supplies the results that the views present.
■ View: This component provides a customizable view of the outcome seen by the user.
Multiple views can be developed and based on the user interaction (e.g., web page or
text message).
■ Controller: This component receives user input from the view and sends it to the
model for processing (or storage).
MVC provides a great deal of simplicity and flexibility because each of the three compo-
nents can be developed independently by a different group of developers.
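That separation of concerns can be seen in the short Python sketch below; the class and method names are invented for illustration. The model owns the data, the view only formats output for the user, and the controller receives the user action and coordinates the other two, so each piece can be developed and replaced independently.

class DocumentModel:
    """Model: owns the data and the logic for processing and storing it."""
    def __init__(self):
        self.documents = {}

    def save(self, name, content):
        self.documents[name] = content
        return name in self.documents

class TextView:
    """View: presents the outcome to the user; a web or mobile view could be swapped in."""
    def render(self, message):
        print(message)

class DocumentController:
    """Controller: receives user input, passes it to the model, and updates the view."""
    def __init__(self, model, view):
        self.model = model
        self.view = view

    def save_clicked(self, name, content):
        saved = self.model.save(name, content)
        self.view.render(f"'{name}' saved" if saved else f"Failed to save '{name}'")

controller = DocumentController(DocumentModel(), TextView())
controller.save_clicked("meeting-notes.txt", "Agenda: review requirements")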
■ Planning: Developing the concept or context; the creation and capture of use cases or
user stories.
The narrative: You want to build an application that does this to solve that business
problem, and you need to define all stakeholders and define the final product.…
[Figure 1-3: the SDLC cycle of Planning, Defining, Designing, Building, Testing, and Deployment]
■ Defining: Documenting and analyzing the requirements (functional and nonfunctional) that the software must satisfy, as agreed upon by the stakeholders.
The narrative: The app must do ABC using this system or while adhering to this busi-
ness process enabled by this type of data performing a transaction in no more than
500 ms.
■ Designing: Converting the high-level conceptual design into technical software speci-
fications.
The narrative: This step makes sure that all stakeholders are on the same page and are
in full agreement on the next steps.
The outcome of this stage is the “go-to” reference from this moment on.
NOTE The development models (described in the next section) highlight how this part of
SDLC is managed.
■ Testing: Validating that the software functions as intended. Depending on the organi-
zation or the intended use of the system, there may be various phases or certifications
of the software. Examples include functional, stress, system, alpha, and beta testing.
The narrative: The software is almost ready for production deployment, but let’s exer-
cise the system and observe how it reacts to certain normal or stressful scenarios.
■ Waterfall
■ Iterative
■ Agile
■ Spiral
■ V Model
For the purpose of DEVCOR, two models stand out and are required knowledge: Waterfall
and Agile. Agile has several variations that are also understood to stand on their own (Lean,
Scrum, Extreme Programming, and so on). For example, the Lean model is considered a
variation of Agile, but it is not uncommon to view it as an independent model. In the follow-
ing sections, we quickly cover a few models and examples.
Waterfall
The waterfall model is considered the simplest and most straightforward model. As the
name indicates, this model is sequential. As shown in Figure 1-4, each stage depends on the
one before it. A stage must finish and get the proper signoffs/approvals before the next one
can start.
[Figure 1-4: the waterfall model stages of Requirement Analysis, Design, Implementation, Testing, Deployment, and Maintenance]
[Figure: iterative development, in which each cycle defines, releases, and delivers a working subsystem]
1. Our highest priority is to satisfy the customer through early and continuous delivery
of valuable software.
2. Welcome changing requirements, even late in development. Agile processes harness
change for the customer’s competitive advantage.
3. Deliver working software frequently, from a couple of weeks to a couple of months,
with a preference for the shorter timescale.
4. Businesspeople and developers must work together daily throughout the project.
5. Build projects around motivated individuals. Give them the environment and support
they need and trust them to get the job done.
6. The most efficient and effective method of conveying information to and within a
development team is face-to-face conversation.
7. Working software is the primary measure of progress.
8. Agile processes promote sustainable development. The sponsors, developers, and users
should be able to maintain a constant pace indefinitely.
9. Continuous attention to technical excellence and good design enhances agility.
10. Simplicity—the art of maximizing the amount of work not done—is essential.
11. The best architectures, requirements, and designs emerge from self-organizing teams.
12. At regular intervals, the team reflects on how to become more effective, then tunes
and adjusts its behavior accordingly.
It’s obvious from the list that Agile development emphasizes teamwork, collaboration,
and flexibility. As mentioned earlier, Agile represents a variety of methodologies. It is not
uncommon to find references or publications internal or external to your organization that
deal with the following Agile methodologies as independent models.
Scrum
Scrum uses sprints (short and repeatable development cycles) and multiple teams to achieve
success. It is a preferred model where frequent changes or design decisions are made. It
requires clear communication of requirements, but teams are empowered to decide on the
best ways to fulfill them.
Extreme Programming
Extreme Programming delivers high-quality software frequently and continuously, with a typical iteration duration of one to four weeks. It allows for frequent changes, which is con-
sidered a plus, but in a poorly managed environment, this could easily degrade the quality of
your deliverable.
Kanban
Kanban provides continuous delivery while reducing individual developer burden. In some
cases, it focuses on day-long sprints. There is no defined process for introducing changes.
Lean
Lean focuses on “valuable features” and prioritization. Lean is sometimes considered the origin of Agile; or, more accurately, Agile took the best of Lean and improved upon it. Lean focuses
on the following principles:
■ Make just-in-time decisions: Always have options but finalize decisions at the right time.
In the following section, we provide a brief comparative analysis of the various models. The
Agile development model is closely aligned with the DevOps model. DevOps focuses on
integration and collaboration among development teams to shorten development and deploy-
ment cycles. Agile focuses on frequent and incremental updating and development of sys-
tems and subsystems that contribute to the evolution of the final product. We can safely say
that both models strive to reduce the time to develop, test, and deploy software.
Which Model?
Agile has been gaining ground as a model, not just for software development but also
as a project management tool for a variety of architectures or business problems. Table 1-3
compares the various development models to help you decide which to use for your proj-
ect. With the exception of the waterfall model, you will probably observe many similarities
among the other models because they’re all Agile in nature. The table lists the pros and cons
of each. Note that you must make a number of trade-offs between quality, cost, and speed
(among other things relevant to your organization).
■ Peer review
The unfortunate part is that architecture is hardly ever reviewed or updated after the devel-
opment lifecycle starts. Code review, on the other hand, cannot be handled that way.
As a software developer, you will typically find yourself very focused on code reviews.
Code review, in light of the current requirements and future roadmaps, is essential.
Different organizations have different practices, but reviewers generally look at some com-
mon practices. The following is a partial list of examples of what gets checked or verified
during a code review:
■ Scalability
■ Efficiency
■ High availability
■ Testability
■ Modifiability/maintainability
■ Security
■ Usability
■ And more
■ Formatting
■ Naming conventions
■ Documentation
Code review requires collaboration, patience, and documentation of findings. Code review
meetings are also a great opportunity for all involved persons or parties to connect and learn
from each other.
Software Testing
We could easily dedicate a full chapter to testing, but for the purpose of DEVCOR, we limit the discussion to a few paragraphs. Testing can happen at every stage of the development lifecycle, but nothing is more important than the period just before a release. Releases can be systemwide or limited to a subsystem, as you've seen with the Agile development model.
Your application is only as good as your testing process and methodology. You most commonly hear about alpha and beta testing, but that type of testing is conducted by customers or other stakeholders outside the development teams. This type of testing is called acceptance testing.
The type and frequency of testing is specific to your organization. Figure 1-6 presents a
number of testing terminologies; however, the majority of development teams are concerned
with four main types of testing:
■ Unit testing: We like to call this testing at the atomic level (the smallest testable unit),
meaning that you conduct this type of testing at the function or class level within a
subsystem. Unit testing is conducted mostly by the developers who have full knowl-
edge of the unit under test. It’s often advantageous to automate unit testing.
■ Integration testing: This type tests for integration or interaction among the compo-
nents of the overall system. Components interact through interfaces, and that’s what
you’re validating here.
■ System testing: Here, you verify that the full system functionality is as intended or specified in the requirements. It is not uncommon to call this type of testing functional testing.
(The figure groups component- or subsystem-level, white-box testing, which includes unit, integration, and regression testing, separately from system- or subsystem-level, black-box testing, which includes acceptance, usability, performance, functional, stress, and regression testing.)
Figure 1-6 Various Types of Testing Used by Software Developers and Their Customers
NOTE The terms white-box testing and black-box testing are mostly seen nowadays in
relation to cybersecurity, but they actually originated with software testing. White-box test-
ing indicates that the tester has full knowledge of the subcomponents or inner workings of
the system, and it is usually conducted by developers or development test teams. Black-box
testing indicates that the testing team has little to no knowledge of the inner workings of the
system while testing it. It’s worth mentioning that, depending on your organization, the terms
white-box testing and black-box testing might not be widely used in software development.
Testing efforts are normally executed from organized test plans and generate a list of issues
to be acted upon in the form of bugs, defects, failures, or errors. The issues are also ranked
in severity and priority. It is not unusual to ship the code with bugs or issues included and
documented in the “release notes.”
Test automation has become a discipline on its own, especially with Agile practices
where you’re dealing with large systems with small subsystems, continuous updating, and
integration.
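To make the idea of automated unit testing concrete, the following is a minimal sketch using Python's built-in unittest module; the normalize_hostname function and its expected behavior are hypothetical examples, not code from this book.

import unittest

def normalize_hostname(name):
    """Return a lowercase hostname with surrounding whitespace removed."""
    if not isinstance(name, str) or not name.strip():
        raise ValueError("hostname must be a non-empty string")
    return name.strip().lower()

class TestNormalizeHostname(unittest.TestCase):
    def test_strips_and_lowercases(self):
        self.assertEqual(normalize_hostname("  Edge-SW01  "), "edge-sw01")

    def test_rejects_empty_input(self):
        with self.assertRaises(ValueError):
            normalize_hostname("   ")

if __name__ == "__main__":
    unittest.main()

A test runner or a CI pipeline can execute tests like these automatically on every commit, which is exactly where test automation pays off in Agile environments.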
References
Continuous Architecture in Practice: Software Architecture in the Age of Agility and DevOps: https://www.informit.com/store/continuous-architecture-in-practice-software-architecture-9780136523567
■ Modularity in Application Design: This section describes why and how the software should be composed of discrete components such that a change to one component has minimal impact on other components. Cohesive components with specific functions are the essence of modularity.
■ Scalability in Application Design: This section covers one of the most important
nonfunctional requirements of how to scale your system up and out to meet various
growth requirements.
■ High Availability and Resiliency in Application Design: This section focuses on vari-
ous strategies for improving the availability and resiliency of the system with various
software implementation technologies that affect the availability of the software and
hardware alike.
This chapter maps to the first part of the Developing Applications Using Cisco Core Plat-
forms and APIs v1.0 (350-901) Exam Blueprint Section 1.0, “Software Development and
Design.”
In the first chapter, we covered the relationship between functional and nonfunctional
requirements. We also determined that writing code that delivers functional or business
requirements is the easy part. The part that requires extra attention and careful discussion
is the nonfunctional requirements that make your application efficient, scalable, highly
available, and modifiable, among other attributes. Our goal is to help you write an applica-
tion that requires the least amount of work when changes are required. It’s not efficient if
you have to rewrite an application or some of its main components every time you want to
enhance a characteristic or feature. Therefore, the quality of the application and its devel-
opment practices are determined by your ability to enhance it or modify it with the least
amount of effort. In this chapter, we focus on the various nonfunctional requirements that
describe the quality of the design.
Foundation Topics
Quality Attributes and Nonfunctional Requirements
What defines the quality of a design?
All these qualities are great, but do they really indicate that the system is of high quality?
Maybe. Regardless of what system you’re working with, when discussing quality, you must
be specific about the quality attribute being investigated or discussed. Quality attributes or
nonfunctional requirements must be measurable and testable. The best or easiest example
is security. You can never say that a system is “secure,” but you can say that the system is
secure under a set of conditions, inputs, or stimuli. The same applies to availability. You must
specify the failure type before you can judge the system to be available.
The message we want to deliver here is that quality attributes and functional requirements
are connected. You cannot consider one without the other. You cannot assess performance
or availability if you’re not observing them against the system performing a functional or
business requirement task. In other words, you cannot measure nonfunctional quality attri-
butes without considering the functions the system is performing.
Another message that shines in the preceding paragraph is measure. For the most part, qual-
ity attributes should be measurable and testable. As mentioned, you can quickly define qual-
ity attributes as measurable indicators of how well a system responds to certain stimuli or
how it performs functional requirements. Quality attributes need to be precise and specific.
When discussing performance, you have to use specific parameters such as time or CPU
cycles or time-out values.
In addition, you have to keep in mind that trade-offs exist among the various attributes.
A high-level example would be the inverse relationship between performance and scalability.
As scalability or capacity requirements increase, there is a high probability that performance
would suffer.
The following are quality attributes by which you can judge the quality of a given system:
■ Security: This attribute is possibly the simplest quality attribute to describe yet the hardest to fulfill. It can be summarized as the capability of the system (and its subcomponents) to protect data from unauthorized access.
■ Modifiability: Simply put, this attribute addresses how the system handles change.
There are many types of change. Change can come as an enhancement, a fix, or the
inclusion of a new technology or business process. Can you integrate the new changes
without having to rewrite the system or subsystems?
■ Reliability: This attribute asks how long a system can perform specific functions under
specific load or stress. For example, a system designed to handle 10,000 new connec-
tions per second is hit with a failure scenario that causes the 12,000 existing users to
connect at once. How does the system handle this unforeseen load and for how long?
■ Usability: This attribute determines the ease or difficulty with which a system’s user
can accomplish a task and how much support (help) is available. One organization we
worked with called this attribute serviceability. Both terms are concerned with the
same outcome.
■ Testability: This attribute determines how a system handles normal or erroneous input at execution time. The system is testable when it can handle various inputs and conditions representing a variety of test cases. For example, the system asks users to input a birthdate in the form mm/dd/yyyy. How does that system react when a user enters March 10, 2021? That test case can be used to test how the system handles abnormal input patterns (see the short sketch after this list).
■ Interoperability: This attribute addresses how the system interacts with other systems.
How does a system exchange information and using what interfaces? Interoperability
relies on understanding the system interface, sending a request to it, and seeing how
the request is handled.
■ Serviceability: This attribute addresses the ability of users to install, configure, and
monitor the software. Serviceability is also referred to as supportability, where users
(or test teams) can identify and debug exceptions and faults.
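Returning to the birthdate example in the testability attribute above, the short sketch below shows one way such an input check might be exercised by a test case; validate_birthdate is a hypothetical stand-in, not code from a specific system.

from datetime import datetime

def validate_birthdate(value):
    """Return True only for input that parses in mm/dd/yyyy form."""
    try:
        datetime.strptime(value, "%m/%d/%Y")
        return True
    except (ValueError, TypeError):
        return False

# Abnormal input such as "March 10, 2021" is rejected instead of crashing the system.
assert validate_birthdate("03/10/2021") is True
assert validate_birthdate("March 10, 2021") is False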
NOTE The ISO/IEC 25010 standard publication presents a more generic, or standard, defi-
nition of a more comprehensive list of quality attributes. Notice how it creates subcategories
of quality attributes. The ISO/IEC 25010 is good to know if you’re helping your organiza-
tion build a standard list of attributes for internal software projects. See System and software
engineering—System and software Quality Requirements and Evaluation (SQuaRE)—System 2
and software quality models.
Table 2-2 shows how ISO/IEC 25010 describes each of the quality characteristics or attri-
butes. The highlighted categories are essential to DEVCOR development and are handled in
some detail.
NOTE The highlighted quality attributes are necessary for the DEVCOR exam and are
given detailed attention. We consider them independently and not necessarily as main or
subcategories as shown in the table. For example, modularity and modifiability are dis-
cussed in independent sections and not under the main header “Maintainability.” Similarly
for availability and reliability.
It’s clear from Table 2-2 how important it is to have a clearly defined list of nonfunctional
or quality attributes. For example, as a developer, what more do you need than to have your
code highly available, modular, and secure? Therefore, it becomes clear that a focus on these
attributes achieves what every developer is looking to deliver on the code they’re working
on:
■ Quality: Good and standardized coding practices create easy-to-understand and well-
documented code.
■ Fewer unplanned outages: Availability and resiliency at the component level can increase uptime of the overall system even when failures occur at the component level.
■ Low maintenance: Modularity and modifiability lower maintenance costs and future-
proof your system.
As you build software applications, it is important to have the end goal in mind. There are
several important points that, as a developer, you need to consider every time:
■ Document your work. You will save yourself and others valuable time.
■ Try to understand the full system architecture, not just your component.
■ Keep it simple.
■ Ask questions. If requirements are not clear, don’t make assumptions that may affect
the whole system or affect those who are on the receiving end from you.
Figure 2-1 A High Availability Scenario Used to Demonstrate Quality Attributes Stages
We use the devices in Figure 2-1 to demonstrate the stages described. For these two redun-
dant switches, one is Active (StackWise-A), and the other is Standby (StackWise-S). The
StackWise Virtual Link (SVL) is used to synchronize state between the Active and Standby
and for transmitting the heartbeat informing both switches of the state of each switch.
Let’s assume that the failure event here is loss of power at Chassis-1 (meaning StackWise-A);
then the processor is no longer available and no longer transmitting heartbeats to Chassis-2.
Then Chassis-2 becomes the Active switch within 50 ms. Table 2-3 demonstrates each of the
stages and the event associated with it.
More advanced high availability concepts are discussed later in the chapter.
Benefits of Modularity
The most important benefits of modularity are as follows:
■ Maintaining and modifying the individual modules and the overall system are easier.
Updating or enhancing functionality of the system may easily be done through simple
modification to one or multiple modules without having to touch the entire code base.
■ Reading, understanding, and following the code logic are easier. Modularity also
makes code review and debugging easier.
■ Security is improved. It’s easier to focus on secure coding practices when you’re deal-
ing with smaller chunks.
■ Think black-box design: Modules should have predictable or clear functions with
consistent outputs. Modules that maintain local data (through memory) and that do
not reset variables will have inconsistent outputs. With black-box design, you expect to
have the same output generated for the same input introduced, as shown in Figure 2-6.
Figure 2-6 Input and Output Are Consistent in a Module; No Data Executed by a Module
Should Impact Subsequent Ones
NOTE Try to make your application components as stateless as you can, especially for
modules of generic functions being used multiple times within your application.
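As a small, hedged illustration of the stateless black-box idea, compare the two functions below; the names are illustrative and not taken from a particular application.

# Stateful: module-level data retained between calls makes the output depend on history.
_running_total = 0

def add_to_total(value):
    global _running_total
    _running_total += value      # hidden state accumulates between calls
    return _running_total        # the same input can produce different outputs

# Stateless (black box): the same input always produces the same output.
def add(values):
    return sum(values)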
■ Pay attention to interfaces: Typically, every module has a function and an interface.
The function is easy to understand because it is the body or purpose of why the mod-
ule exists and how it is implemented. The interface, on the other hand, is what data,
properties, or executable actions, specific to the module, you’re willing to share with
anyone or anything that interacts with the module.
As you look at the leading practices for building efficient modules, you also have to
pay close attention to best practices for building and assigning interfaces. Refer to
Table 2-4 for a list of best practices and how to use them.
Complex and redundant interfaces defeat the purpose of modularity or make it alto-
gether confusing. You need to control and restrict interfaces to the surface or edge
of the module, as shown in Figure 2-7. Reaching into the middle of the module is not
clean and may create issues of coupling. In addition, reducing the number of interfaces
is key to consistency and reduction of coupling.
Figure 2-7 Control the Number of Interfaces and Where They’re Provided
■ Each microservice has a unique purpose within the context of the overall system.
■ Each microservice has a specific interface.
■ Microservices are reusable across multiple systems delivering the same outcomes
across various systems.
■ Microservices can interact through the interfaces to deliver a higher level of meaning-
ful outcome.
■ You can easily develop and maintain microservices independently of other microser-
vices. In return, you can support DevOps processes with rapid and frequent deploy-
ments.
■ Administrative or user capacity: Support more users or tenants locally or at the cloud
level.
Horizontal Scalability
Horizontal scaling (a.k.a. scaling out) is concerned with adding resources or processing power
to an application or logical/virtual system. For example, this might mean adding additional
servers to handle additional load to a single system or application. An even simpler definition
is that multiple physical servers in a cluster support a single application. Adding additional
servers or machines to a single application should also lead to load distribution. For example,
if a single machine is running at 80 percent capacity, after scaling out, you should end up
with two machines running as close as possible to 40 percent each (ideally but not always the
case). The most common tool for distributing the load over multiple servers is called load
balancing or server load balancing or clustering (some exceptions may apply).
The left side of Figure 2-9 clearly demonstrates that a single server is handling the entire load
and is a single point of failure. With load balancing, the right side of the figure clearly dem-
onstrates that the load balancer creates a virtual server that responds to client requests and
passes them to one of the physical servers representing the virtual server.
Vertical Scalability
In contrast to horizontal scaling, vertical scaling (a.k.a. scaling up) is concerned with add-
ing resources to a single node or physical system. For example, this means adding memory
or storage to a single server to allow it to support an application. A single node is upgraded
with additional power to do more. One of the most common ways to do this in application
development is multithreading. As you scale up the number of available processors (or cores,
as commonly used nowadays), the application automatically utilizes the additional proces-
sors. It is also not uncommon to see multiple single-threaded modules or instances utilizing
the additional processors.
In Figure 2-10, the two scalability scenarios of scaling up and scaling out are clearly
demonstrated.
Figure 2-10 Horizontal vs. Vertical Scalability
Some publications refer to hybrid scaling or autoscaling, which means using both
approaches to keep the system or application healthy. This means that the system (through
automation and prediction) scales up or out depending on the need of the application.
It’s worth noting here that the term elasticity, which you commonly hear in relation to
cloud resource management, is also a type of scaling up/down or in/out based on predeter-
mined policies or based on system demands. For example, you might have seen systems add
resources to a web application based on time-of-day policy. More resources can be added
during known or predicted high-demand hours, and subsequently, the resources can be
released outside the high-demand hours.
1. Understand your application’s memory and CPU requirements and enable your system
to utilize the addition of resources effectively. For example, how many bytes or kilo-
bytes of RAM do you need per 100 concurrent users? How many more users can you
handle by increasing memory? It's most probably not a linear relationship.
How does an application react to the loss of connectivity to its subsystems or databases?
What happens when host systems fail?
How does a system handle the hardware failure of one of the nodes in a cluster?
You can measure the availability of a system as the time it is fully operational in a given period. Although the following formula was initially used for hardware systems, it is an excellent example to illustrate this point:

Availability = MTBF / (MTBF + MTTR)

MTBF refers to the mean time between failures, and MTTR refers to the mean time to repair (or resolution, or recovery). As a software developer, when you're thinking about availability, you should think about what will make your system fail, the likelihood of it happening, and how much time will be required to repair it.
Every system has an availability goal that includes all planned and unplanned outages, and
we sometimes call this a service-level agreement (SLA). The SLA is a commitment made
by the system architect (owner or operator) that the system will be up for a specified period
(e.g., a billing cycle or a year). For example, an SLA of 99 percent means that the system will
be available 99 percent of the time it is needed. If you use one year as the time of operation, then the SLA specifies that the system will be available approximately 361 days and 8 hours. Also, 99 percent availability means that there is a 1 percent probability that the system will not be available when it is needed, and that translates to roughly 3 days and 16 hours of downtime.
You commonly see terms like three nines to describe 99.9 percent availability or five nines
to describe 99.999 percent availability. Table 2-5 lists yearly values of availability.
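The arithmetic behind these figures is straightforward. The following is a minimal sketch, assuming a 365-day year; the helper names are ours and not from any particular tool.

def availability(mtbf_hours, mttr_hours):
    """Availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def downtime_per_year(availability_pct, hours_per_year=365 * 24):
    """Hours of allowed downtime per year for a given availability percentage."""
    return hours_per_year * (1 - availability_pct / 100)

print(round(availability(1000, 10.1), 4))    # roughly 0.99, or 99 percent
print(round(downtime_per_year(99.0), 1))     # about 87.6 hours, roughly 3 days and 16 hours
print(round(downtime_per_year(99.999), 2))   # about 0.09 hours, roughly 5 minutes per year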
NOTE The availability rating of a system is constrained by the lowest availability number
of any of its components. For example, in an application that uses three interdependent
systems where two of them are considered to have 99.999 percent availability and a third has
99.9 percent availability, the system’s availability is 99.9 percent.
Recently, the focus has been shifting to high availability of five or six nines or even higher
(eight nines), especially as some of the software applications discussed in this book are con-
cerned with running enterprise or service provider networks or business services at a global
level. Certain instances of businesses require being as close to “always on” as possible.
There is a great deal that we could discuss about availability, but we limit the discussion to
these main points:
■ Detection
■ Recovery
■ Prevention
Regardless of how you want to look at it, the following are common ways of detecting a
fault and alerting you to a failure that is about to happen or has already happened:
■ Heartbeat or “hello packets”: Cisco loyalists or experts commonly see this process
used in the majority of network operating systems. This is probably one of the most
important components of achieving high availability. A heartbeat is usually exchanged
between two or more redundant system components or between a software/hard-
ware system and a monitoring system. Sending, receiving, and processing heartbeats
requires system resources. There are a few things that you can do to prioritize heart-
beats between components using special tags or quality of service (QoS) marking on
a network link. This way, you can guarantee (to a certain extent) that heartbeat mes-
sages are not dropped or discarded during high system utilization periods. Figure 2-11
demonstrates the concept of heartbeats informing the server load balancer of their
availability:
■ Simple ping or ICMP echo/echo-reply: We use the term simple in describing this
method because it only informs you that the system or subsystem component is up or
down and nothing else about the overall behavior. However, you can read something
into the latency numbers generated by the echo/echo-reply. If the latency (or round-trip
time) is high, then this could tell you something about the state of the network or the
underlying hardware architecture.
■ Sanity checks: These processes check the validity of important output values.
Understanding certain operations with high dependencies and checking the validity of
the operations are very important. For example, if you perform an operation and store the
result in a variable that gets used in a subsequent operation, and all you check is that the
stored value exists without validating the “operation,” then you have a fault that is very
hard to find. Sanity checks are simple and should not stress the system resources.
■ Redundancy: Redundancy is the most common way to deal with failure. In essence,
redundancy strives to remove single points of failure and ensure fast recovery from
faults. Having additional resources capable of performing the task of the failing device
is essential. There are various ways to configure redundant resources. The decision of
which one to use depends on the availability requirements of the system or business
process. Redundancy types include the following examples:
■ Active (hot standby): Redundant systems or nodes receive and process input at
the same time; however, only one system replies back to the request. The standby
system maintains the same state (or data) as the active system and will be avail-
able to take over in a very short time. With the recent advancements in processors
and software development practices, you can expect to see failover or switchover
times in the single or low double-digit milliseconds (e.g., 10–90 ms). Consider,
for example, the high availability feature of the Cisco Catalyst switching platform
called StackWise. Without going into too much detail on how it works, Figure 2-12
demonstrates the hot standby concept mentioned here. The Active switch has both
the control and data planes active, which means that the Active switch will receive
requests (network packets), perform the destination lookup, make a forwarding
decision, and update its lookup tables (databases). The Standby switch, on the other
hand, being in hot standby mode, synchronizes state and lookup tables with the
Active processor but does not perform any lookup or make forwarding decisions
because its control plane is in standby. However, it does forward packets on its
physical ports using decisions made by the Active processor. At failover and after
a number of heartbeats (hello packets) are not received from the Active switch over
the StackWise virtual link, the Standby switch declares itself as Active and contin-
ues the operation using the lookup tables it has already synchronized with the pre-
viously active switch without having to rebuild the lookup tables.
Figure 2-13 Warm Standby Using Spanning Tree Protocol (STP)
■ Spare (cold standby): The spare system may or may not be operational; it possibly
is not even powered up and, in some cases, may require a manual process to
become effective. In Figure 2-14, the Cisco Catalyst 9600 switch supports the
presence of a second supervisor module leading to what’s known as the redundant
supervisor module (RSM). If you deploy two redundant switches with two redun-
dant processors installed in each of the chassis, then you end up with an active
processor in one chassis and a standby in the other chassis. The two remaining ones
(one in each chassis) are called in-chassis standby (ICS). The two supervisor mod-
ules use route processor redundancy (RPR) technology to establish an RPR cold-
standby relationship within each local chassis and stateful switchover (SSO) tech-
nology to establish an active/hot-standby redundancy relationship across the chas-
sis. In the unlikely event that the active supervisor should fail within the chassis, the
standby cold supervisor will transition to the active role. This transition occurs by
fully booting up the ICS supervisor; it remains nonoperational until the SSO redun-
dancy is reestablished with the new StackWise virtual active supervisor across the
chassis. In a way, Figure 2-14 illustrates a good example of cold and warm standby
configurations. The StackWise Virtual Link (SVL) is used to synchronize state
between the active and standby switches and for transmitting the heartbeat inform-
ing both switches of the state of each switch, and the Dual-Active Detection (DAD)
link ensures that you do not end up with two active switches.
■ Timeouts: Timeouts are a close relative of heartbeats and retries. Timeout values are given to processes that wait for responses to retries or retransmissions. Every time a retry or a heartbeat is sent, a timer is set and decrements until either a response is received or it times out. When it times out, a failure is declared (see the short sketch at the end of this list).
■ System upgrade: Using this method, you upgrade the system software during opera-
tions and without significantly affecting the performance of the system. Of course,
there is a reason for the upgrade, and that is a failure. The failure may have occurred
or may have only been known to affect similar systems, so this is why an in-service
software upgrade (ISSU) or hit-less upgrade is needed. Cisco routers and switches
have been supporting this type of upgrade for quite some time, especially on systems
where multiple processors or supervisor cards exist.
■ Rollback: This method has proven to work well especially after changes in configura-
tion or after upgrades. The system maintains a copy of a previously working state or
dataset and reverts back to it in case the new configuration experiences complications
or does not pass certain checks.
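As noted in the Timeouts item above, heartbeats and timeouts work together for fault detection. The following minimal sketch declares a peer failed when no heartbeat has arrived within the timeout; the function name and the three-second value are illustrative assumptions.

import time

def check_peer(last_heartbeat_time, timeout_seconds=3.0):
    """Return 'healthy' while heartbeats keep arriving, 'failed' once the timeout expires."""
    elapsed = time.monotonic() - last_heartbeat_time
    if elapsed > timeout_seconds:
        return "failed"          # failure is declared; failover or recovery can begin
    return "healthy"

# Each time a heartbeat is received, the caller records time.monotonic() and passes
# that value to check_peer() on the next polling cycle.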
Prevention
A great deal of time and effort has been dedicated to failure prevention, and that effort
seems to be paying off. There is a technology element to prevention, which we discuss here,
but there is also an operational process level that should not be ignored. Following a stan-
dard procedure of maintenance and upgrades is essential to preventing unplanned outages.
■ Isolation: At the detection of a fault and to prevent a potential failure, you can remove
the system from service to allow for further analysis or mitigation. This type of pre-
vention is also used for periodic maintenance cycles.
■ Predictive analytics: A great deal of historical data is available for analysis and to train
statistical models to identify patterns or trends that led to a failure. Using telemetry,
logs, and monitoring, you can understand the system’s normal behavior and also how
it behaved before, during, or after a failure. By observing or learning these events, you
can predict failures before they occur.
■ Automation: Automation not only improves provisioning and configuration but also
improves consistency and eliminates human error.
■ High availability planning: These functions and processes harness the power of redun-
dancy at all levels such as servers, databases, storage, replication, routers, and switches.
In a way, this includes everything we’ve discussed in this chapter alongside the busi-
ness processes and the physical infrastructure hosting the business applications. An
example of something that you need to keep in mind as related to SLAs is the storage
and replication requirements.
■ Disaster recovery planning: DRP is concerned with the facilities (i.e., the data center)
and ensuring connectivity between redundant sites and ensuring adequate distance
between the active and standby data centers (a.k.a. disaster recovery data center).
Several standards bodies and government entities specify what are called disaster
zones and subsequently designate disaster radiuses. For example, if your active data
center is an area prone to earthquakes, then your DR site should be outside that zone
or outside the disaster radius.
1. Data backup and replication: This is probably the most important high availability
deployment concept. No redundancy or high availability design is complete (or even
useful) without a proper backup, retrieval, and replication system or process. Data
integrity is key.
2. Clustering: We briefly touched on clustering and server load balancing earlier. With clus-
tering, multiple servers share or access data through common storage or memory systems.
The addition or removal of a server (or virtual server) does not affect the cluster.
Figure 2-15 illustrates how the applications represented by the cluster will continue to
operate normally as long as one server is active. Traditionally, clustering does not nec-
essarily require a load balancer to be involved. The load balancer in Figure 2-15 could
easily be substituted by a switch, and clustering would not be affected.
Figure 2-15 Clustering Is an Essential Deployment Model for High Availability in Appli-
cation Design
The storage and servers, as shown in Figure 2-15, exchange heartbeats with the load
balancer or cluster manager in-band or out-of-band (on a dedicated network). With
the recent advancement in network and cloud technologies, clusters started expanding
beyond a single data center into other enterprise data centers or into the cloud.
The Cisco DNA Center, for example, supports a three-node cluster configuration,
which provides both software and hardware high availability. A software failure occurs
when a service on a node fails. Software high availability involves the ability of the ser-
vices on the node or nodes to be restarted. For example, if a service fails on one node
in a three-node cluster, that service is either restarted on the same node or on one of
the other two remaining nodes. A hardware failure occurs when the appliance itself
malfunctions or fails. Hardware high availability is enabled by the presence of multiple
appliances in a cluster, multiple disk drives within each appliance’s redundant disk
drive configuration, and multiple power supplies. As a result, a failure by one of these
components can be tolerated until the faulty component is restored or replaced.
NOTE As you start designing your HA solution, the decision to keep your application
completely on-premises (e.g., served from one or two enterprise data centers), move it to the
cloud, or even use a hybrid of both, depends highly on trade-offs among latency, performance, efficiency, and quality of experience.
Table 2-6 lists a few advantages and disadvantages of the models described previously:
cohesion, coupling, cold standby, content delivery networks (CDNs), clustering, load
balancing, mean time to repair or mean time to recovery (MTTR), mean time between
failures (MTBF), software-defined networking (SDN), service-level agreement (SLA)
References
https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-9000/nb-06-cat-9k-stack-wp-cte-en.html
https://www.cisco.com/c/en/us/td/docs/cloud-systems-management/network-automation-and-management/dna-center/2-1-2/ha_guide/b_cisco_dna_center_ha_guide_2_1_2.html
■ Latency and Rate Limiting in Application Design and Performance: This section
covers various performance enhancements or degradation parameters and best archi-
tectural practices.
■ Design and Implementation for Observability: This section describes best prac-
tices for application performance monitoring and user experience management using
observability concepts.
■ Database Selection Criteria: This section describes how application performance and
flexibility depend on proper database selection criteria. We also discuss architectural
decisions for database selection.
This chapter maps to the Developing Applications Using Cisco Core Platforms and APIs v1.0 (350-901) Exam Blueprint Section 1.0, “Software Development and Design,” specifically subsections 1.4, 1.5, 1.6, and 1.8.
In this chapter, we continue to discuss a few other nonfunctional requirements and how they
affect the application’s quality and performance. As you learned in Chapter 2, “Software
Quality Attributes,” quality attributes intersect in many ways, either in a complementary
fashion or as trade-offs. You saw how modularity positively contributes to scalability. In this
chapter, you get exposed to application performance and what trade-offs may have to
be made to improve it. For example, will increasing system capacity impact the performance
of the system?
This chapter also includes topics related to observability and maintainability of applications
and what software development principles you need to be aware of to improve your vis-
ibility into your overall environments and application-related parameters through full stack
observability.
The concepts discussed throughout Chapter 2 enable you to build scalable and highly avail-
able software. In this chapter, you learn how to optimize for performance and detectability
through visibility.
5. Which SOLID principle states that higher-level modules should not depend on
lower-level modules?
a. Single responsibility principle (SRP)
b. Open-closed principle (OCP)
c. Liskov’s substitution principle (LSP)
d. Interface segregation principle (ISP)
e. Dependency inversion principle (DIP)
6. Which of the following is used as an indicator of software performance?
a. Latency
b. Round-trip time (RTT)
c. Throughput
d. All of these answers are correct.
7. Network congestion or packet drops are side effects of what network condition?
a. Wireless systems
b. Network overload or oversubscription
c. Slow software
d. All of these answers are correct.
8. Which performance enhancements are commonly used in application design? (Choose
two.)
a. Caching
b. Rate limiting
c. Echo replies
d. Open Shortest Path First
9. What is observability, in simple terms?
a. A way to monitor the performance of a system through its users’ experience
b. A way to detect system issues through the observation of its output
c. A way to manage and monitor distributed applications
d. All of these answers are correct
10. Observability is said to have which three pillars?
a. Logging, balancing, and multiprocessing
b. Tracing, routing, and switching
c. Logging, metrics, and tracing
d. Routing, switching, and multithreading
11. What are the differences between relational and nonrelational databases? (Choose two.)
a. Relational databases can be key-value stores where each key is associated with
one value.
b. Relational databases store data in rows and columns like a spreadsheet.
c. Nonrelational databases are also known as NoSQL, where one type can be a
document-oriented store.
d. There are hardly any differences between the two except that relational databases
are commercial and nonrelational are mostly open source.
Foundation Topics
Maintainable Design and Implementation
Change is constant. You’ve surely heard this phrase before, and we’re sure that it holds true
for all software or code. If you’re sure that, indeed, change is constant, then you can prepare
for it and design your applications to be modifiable and maintainable. By doing so, you will
save yourself and your customers a lot of time. Research has repeatedly shown that a larger
part of the cost of an application is incurred after an application is released. That cost is
spent on maintenance, new functionality, and bug fixes, among other things. Designing for
maintainability is where you should invest your thinking early in the design or development process. Focus on questions such as “What can possibly change?” “How likely
are these changes?” and “What will these changes impact?” Future-proof your design by
anticipating change.
As we mentioned in Chapter 2, proper modular design allows you to make changes within
a small module or subcomponent without having to redesign or rewrite the entire system.
Modularity is one of the many leading practices that allows for easier and more effective
maintainability of your application. The following factors contribute to that goal:
■ Modular design: Modularity allows for updates, modifications, and fixes without hav-
ing to redesign the application.
■ Coding standards: The use of coding standards simplifies both initial development
and subsequent maintenance. It also keeps the developer’s thought processes within a
predefined set of rules and limitations.
Software developers change and subsequently introduce additional cost to the lifecycle of a
project for various reasons or factors:
■ Bug fixes: Whether bugs or errors are detected during a test cycle, code review, or the
production use of an application, at the end of the day, changes have to be introduced
to the code. In some cases, such changes may require a major redesign of modules,
objects, or classes. Modular and well-documented code speeds up this cycle.
■ New features and upgrades: If your code or application is being used and continuous-
ly exercised, it is a matter of time before new features and upgrades are needed so that
you can stay competitive. New features address customer efficiency or performance
requirements that extend the life, and thus the shelf life, of your application. If
the guidelines detailed in the previous section are followed, then introducing new fea-
tures is simple and fast.
■ Code refactoring: Refactoring is a development practice that strives to improve the structure and quality of code without changing its functionality. It is often performed after the initial deployment, most often to improve efficiency and maintainability.
■ Code adaptation to new environments: This is closely related to the previous point,
and it is mostly related to adapting your code to new computing or hosting environ-
ments to improve performance, reduce cost, or gain more market share.
■ Problem prevention or future-proofing: As you optimize your code for new environ-
ments or for performance, you also should lean on lessons learned to future-proof it
for anticipated changes to the runtime environment or the release of new hardware
capabilities.
class NetworkSwitch:
    def __init__(self, type, model, config):
        self.type = type
        self.model = model
        self.config = config

class SwitchDeployerService:
    def deploy_switch(self, service):
        if service.type == 'MoR':  # Middle of Row Switch
            datacenter.deploy(service.config, '192.168.100.0/24')
Let’s say you want to add support for an out-of-band management network. Is that easily
done? Not really! Adding such support requires rewriting or modifying the original code.
However, if you had already written the code using the single responsibility principle, then you would be able to add support for as many network types as needed by adding classes, eventually conforming to OCP as well. A possible solution would be to build a common NetworkSwitch and SwitchDeployerService and then build additional classes or features on top of them. In Example 3-2, the segment parameter is added and the if statement is replaced by extended classes, one for the user network and one for the management network: NetworkSwitchMoR and NetworkSwitchOOB. No ifs or buts.
Example 3-2 Example Illustrating the Open-Closed Principle
class NetworkSwitch:
    def __init__(self, type, model, config):
        self.type = type
        self.model = model
        self.config = config
If, in the future, another network type requires the deployment of the same services, all you
need to do is create and extend the class. For example, if you have a top-of-rack (ToR) data
center switch, all you have to do is create the class NetworkSwitchToR and specify the seg-
ment (in this case, you use the IP subnet for definition).
NOTE The super() function is used to invoke the __init__ method of NetworkSwitch, which can be thought of as the superclass. It initializes the inherited instance attributes (self.type, self.model, and self.config).
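The extended classes referenced for Example 3-2 are not reproduced above, so the following is a hedged sketch of what they might look like; the segment values, the OOB subnet, and the revised SwitchDeployerService are illustrative assumptions rather than the book's exact code. Each network type carries its own segment, so the deployer no longer needs an if statement on the switch type.

class NetworkSwitchMoR(NetworkSwitch):
    def __init__(self, model, config):
        # Invoke NetworkSwitch.__init__ through super() and add a segment.
        super().__init__('MoR', model, config)
        self.segment = '192.168.100.0/24'   # user (in-band) network; illustrative value

class NetworkSwitchOOB(NetworkSwitch):
    def __init__(self, model, config):
        super().__init__('OOB', model, config)
        self.segment = '10.10.10.0/24'      # out-of-band management network; illustrative value

class SwitchDeployerService:
    def deploy_switch(self, switch):
        # Open for extension, closed for modification: new network types only add classes.
        # datacenter is the same (assumed) deployment service used earlier.
        datacenter.deploy(switch.config, switch.segment)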
class NetworkSwitch:
    def get_model(self):
        return self.model

class NetworkSwitchMoR(NetworkSwitch):
    def get_model(self, switch_name):
        model = db.get_switch(switch_name)
        print(model)
        return None
class NetworkSwitchMoR(NetworkSwitch):
    def get_MoRmodel(self):
        return db.get_MoRmodel(self)

    def get_ToRmodel(self):
        raise NotAvailableError

class NetworkSwitchToR(NetworkSwitch):
    def get_MoRmodel(self):
        raise NotAvailableError

    def get_ToRmodel(self):
        return db.get_ToRmodel(self)
Example 3-5 is simpler, easier to read, and easier to maintain, and it gives you the freedom to implement methods on an as-needed basis. There's no need to generate errors without reason; it's simpler not to expose an unnecessary interface.
class NetworkSwitch:
    def get_model(self):
        return self.model

class NetworkSwitchMoR(NetworkSwitch):
    def get_model(self):
        return db.get_MoRmodel(self)

class NetworkSwitchToR(NetworkSwitch):
    def get_model(self):
        return db.get_ToRmodel(self)
NOTE The interface segregation principle and Liskov substitution principle are closely
related. You can think of ISP as the client’s perspective, where you want the client interaction
to be simple and efficient. LSP is the developer’s perspective, where unnecessary types gen-
erate errors like the one seen previously: NotAvailableError.
Figure 3-3 Dependency of Higher-Level Components on Lower-Level Ones
Figure 3-4 Programming to an Interface, Not to the Implementation of the Component
As a best practice, allow higher-level classes to use an abstraction of lower-level classes.
High-level classes should manage decisions and orchestrate lower-level classes but not do any
system-specific functions. However, lower-level classes should contain little decision-making
but implement the actual interaction functions, such as API calls, database writes, view
updates, and data manipulations.
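As a hedged illustration of this practice, the following minimal Python sketch has the high-level class depend on an abstraction rather than on the concrete lower-level class; the class and method names are illustrative assumptions, not code from this book.

from abc import ABC, abstractmethod

class DeviceRepository(ABC):
    """Abstraction the high-level class depends on."""
    @abstractmethod
    def save(self, device: dict) -> None:
        ...

class PostgresDeviceRepository(DeviceRepository):
    """Low-level detail: how the data is actually written."""
    def save(self, device: dict) -> None:
        print(f"INSERT INTO devices ... {device}")  # placeholder for a real database write

class InventoryService:
    """High-level class: orchestrates work but knows nothing about the database."""
    def __init__(self, repository: DeviceRepository):
        self.repository = repository

    def add_device(self, hostname: str) -> None:
        self.repository.save({"hostname": hostname})

InventoryService(PostgresDeviceRepository()).add_device("edge-sw01")

Swapping PostgresDeviceRepository for any other DeviceRepository implementation requires no change to InventoryService.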
NOTE To get more in-depth knowledge of the SOLID principles, we highly recom-
mend that you read Robert Martin’s white paper about the subject: “Design Principles
and Design Patterns by Robert C. Martin,” https://www.academia.edu/40543946/
Design_Principles_and_Design_Patterns.
In computer systems and application design, latency may also be referred to as response
time (RT), but we stick to the term latency to avoid confusing you with the difference
between RTT and RT. This book is mainly concerned with building software that automates
networks, so it is fitting to understand what parameters affect performance for both the
network and the software or computer systems. Table 3-2 provides a simple definition from
both points of view.
Figure 3-5 is a simple attempt to put all of the definitions found in Table 3-2 in one diagram.
(Figure 3-5 shows a client and a server connected through a switch, with latency and throughput/rate labeled along the path and bandwidth shown as the capacity of the medium.)
It’s worth mentioning that the overall system is never free of imperfections, and therefore, other factors play a huge role in performance. For the purpose of this discussion,
if you think of the overall system as communication between a client (or an application) and
a server over some communication path that most probably has routers, switches, firewalls,
and an intrusion prevention system (IPS), then you need to consider a few other factors:
■ Network congestion and packet drops: This is a side effect of network overload or
oversubscription.
■ Hardware malfunctions: Hardware failures are a fact of life, but when they come, they
don’t always lead to complete failure. Sometimes failure comes intermittently or spo-
radically. That sometimes puts the system in a fuzzy state.
■ Software issues: Software issues can negatively affect performance. Here are a few
examples:
■ Suboptimal traffic routing that forces traffic to high latency or congested paths,
causing packet drops and frequent retransmissions
■ Operator error where wrong values for resources are manually and erroneously
entered
■ Geographically distributed users and data centers: For many business, mobility, and disaster recovery reasons, applications may be geographically dispersed, eventually creating significant distances between them and their users. Even with the most advanced fiber-optic networks and a minimum number of hops, the laws of physics contribute enough latency to affect performance.
With the deployment of software-defined networking (SDN) and the use of automation
and orchestration, user or operator errors are significantly reduced. Successful SDN deploy-
ments highly depend on programmability and automation of various network functions and
policies.
Different types of traffic react differently to latency or packet loss. For example, with TCP
traffic, where traffic acknowledgment is part of the process, packet loss causes lower per-
formance, but all lost traffic is retransmitted. UDP traffic, used for video streaming and IP
telephony, has quality of service or quality of experience consequences. For UDP streams,
dropped packets cause low-resolution videos or choppy voice. The following are side effects
of high-latency, low-throughput systems:
■ Application performance issues; slow response times or slow loading of web pages
■ Frequent retransmission and processing of packets that increase load and resource uti-
lization and possibly starve other processes
Architecture Trade-offs
As mentioned in Chapter 2, nonfunctional requirements are closely related, and you almost
always have to decide or prioritize one over the other based on the business requirements.
This is also true for performance. Consider the following points:
■ Scalability and performance: This point is a clear one. Scaling your system or network
up or out affects resource utilization and may impact performance. As you distribute
resources among multiple servers or networks, you introduce the need for “communi-
cation” among the resources, and that in itself introduces additional latency.
■ Cost: This one is also obvious. Having to use cost-effective networks and resources
may not always provide you with the highest-performance systems.
Improving Performance
As mentioned earlier, low-latency, high-performance systems are highly dependent on con-
figurations that optimize the utilization of resources and create a balance between resource
demands and resource supplies. Various technologies and methodologies for improving
performance can be built into the software or operating systems for prioritizing requests or
increasing available resources. For this discussion, we focus on caching and rate limiting as
possibly the most frequently used ways for improving performance and enhancing the user
experience.
Caching
Caching enables you to store or position frequently accessed data as close as possible
to the user (or server, depending on the performance problem you’re trying to solve). In
Figure 3-6, caching is used to offload the servers by caching responses and replying on
behalf of the server.
Figure 3-6 A Cache Device Capturing Responses and Replying on Behalf of the Server to
Subsequent Queries
Immediate benefits of caching are improved response time, user experience, and the poten-
tial saving of network bandwidth and processing power. We’ve commonly used content
delivery networks (CDNs) for caching static content like web pages or various media types
like videos or training material. Using CDNs is an effective way of caching or storing content
close to the user base.
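Caching can also be applied inside the application itself. The sketch below uses Python's functools.lru_cache to keep frequently requested responses in memory; fetch_device_config is a hypothetical, slow back-end call used only for illustration.

from functools import lru_cache
import time

@lru_cache(maxsize=128)
def fetch_device_config(hostname):
    time.sleep(2)                      # stands in for an expensive API call or database read
    return f"config for {hostname}"

fetch_device_config("edge-sw01")       # slow: cache miss goes to the back end
fetch_device_config("edge-sw01")       # fast: served from the in-memory cache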
Caching strategies differ depending on the design decision. The following are a few
examples:
■ Lazy loading: The name tells the story. The resources or data are cached only when needed, not at initialization time. This strategy saves resources and bandwidth and also reduces initial loading time. Example 3-6 demonstrates the difference
between loading data when needed using lazy loading and loading the data at initial-
ization time. The latter form is called eager loading or forced loading of resources.
In Example 3-6, resources are loaded when needed for GetRecords, rather than at initializa-
tion, as seen in Example 3-7.
Example 3-6 Lazy Loading Is a Type of Caching That Loads Data Only When
Necessary
class LazyLoading:
    def __init__(self):
        self.resource = None          # nothing is loaded at initialization

    def GetRecords(self, resource):
        self.resource = resource      # the resource is loaded only when needed
        execute(self.resource)
Example 3-7 Eager or Forced Loading Forces the Preloading of Resources at Initialization Time
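The code for Example 3-7 is not reproduced here, but an eager-loading counterpart to Example 3-6 might look like the following sketch, where the resource is loaded in __init__ whether or not it is ever used; execute() is defined as a stub only so the sketch is self-contained.

def execute(resource):
    print(f"processing {resource}")    # stand-in for the execute() call in Example 3-6

class EagerLoading:
    def __init__(self, resource):
        self.resource = resource       # loaded up front, at initialization time
        execute(self.resource)         # resources are consumed even if GetRecords is never called

    def GetRecords(self):
        return self.resource           # no loading needed here; it already happened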
As with any other design decision, trade-offs exist, and the cost-effectiveness of caching
should be analyzed against its benefits, especially if the dynamic retrieval requirements out-
weigh the placement of static content. For example:
1. What’s required for executing the query or operation at the back end? Network utili-
zation limitations? Server resource utilization limitations?
2. What is the cost-effectiveness of the caching system or service?
3. What is the type of data and the effectiveness of caching? What are the hits versus
misses? If the cache misses are a higher percentage and requests are having to pass
through to the back-end server, maybe the system needs further adjustment or maybe
it is not needed altogether.
Rate Limiting
Another frequently used strategy for improving performance by managing the load on the
system is called rate limiting. Figure 3-7 demonstrates how rate limiting enables you to
control the rate at which requests or data is passed to the system (or processor) to avoid
overloading it.
Figure 3-7 Rate Limiting External to the Server That Limits the Number of Requests
Reaching the Server and Possibly Overloading
The purpose of Figure 3-7 is only to demonstrate a point about how rate limiters work. The
rate limiters are not always external to the system receiving the requests. The network con-
necting the client (the source) to the server (destination) could possibly have rate limiters to
control the rate at which the server is receiving requests. The server itself may also have rate
limiters implemented to control how many of the requests it should process.
Rate limiting can be applied in many places and for various purposes:
■ Number of requests: This can mean requests for an echo/echo reply as in the ICMP
(ping of death) or for requests against an API. API metering or billing systems may pro-
vide this type of rate limiting as an external mechanism to the system (cloud level) or
locally and internally to the system.
■ User actions: The purpose is to limit how many actions a user can take per web
exchange or experience. A user may be allowed to enter their password three times to
avoid overutilizing network or systems resources (but most importantly for security
reasons).
■ Server-bound traffic: This type of rate limiting is similar to the number of requests,
but you can also use other parameters to control traffic directed toward a specific
server, like the geography from which the request is coming or time of day.
■ Concurrent connections: In various scenarios, you want to control how many concur-
rent sessions or connections a system can have. This type of rate limiting is used to
prevent overload of the system where user experience or security may be at stake.
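To make the idea concrete, the following is a minimal fixed-window rate limiter sketch that caps how many requests are accepted per time window; it is an illustrative assumption, not a production implementation or a specific product's API.

import time

class FixedWindowRateLimiter:
    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.window_start = time.monotonic()
        self.count = 0

    def allow(self):
        now = time.monotonic()
        if now - self.window_start >= self.window_seconds:
            self.window_start = now    # start a new window and reset the counter
            self.count = 0
        if self.count < self.max_requests:
            self.count += 1
            return True
        return False                   # the request should be rejected, queued, or delayed

limiter = FixedWindowRateLimiter(max_requests=100, window_seconds=1)
if not limiter.allow():
    print("429 Too Many Requests")     # illustrative response for a rejected API call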
Parallel Processing
Serial or sequential processing systems process data in sequence, mostly one task at a time. With parallel processing, multiple tasks are executed concurrently and in
parallel, reducing latency and improving performance. There are two main types of parallel
processing:
■ Multithreading: You can design your software in a way that allows for dividing your
tasks or requests into threads that can be processed in parallel. Multithreading is com-
plex and must be decided at development time and most of the time adheres to spe-
cific system requirements.
■ Multiprocessing: You can add additional processors to execute the tasks in parallel.
This approach can also be looked at as processing multiple user requests as they arrive
concurrently and processing them independently. This means that every session or
concurrent user is provided independent memory space and processing timeslot.
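A brief multithreading sketch using Python's concurrent.futures module follows; poll_device is a hypothetical I/O-bound task, the kind of work that benefits most from threads. For CPU-bound work, ProcessPoolExecutor (multiprocessing) would be the closer fit.

from concurrent.futures import ThreadPoolExecutor
import time

def poll_device(hostname):
    time.sleep(1)                      # stands in for a slow network call
    return f"{hostname}: ok"

devices = ["sw1", "sw2", "sw3", "sw4"]

# The four polls run concurrently instead of taking roughly four seconds in sequence.
with ThreadPoolExecutor(max_workers=4) as pool:
    for result in pool.map(poll_device, devices):
        print(result)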
Exponential Backoff
Backoff techniques rely on system feedback for rate limiting. When requests are not processed in a timely manner and retries pile up, they overload the system and may further increase latency. In Internet of Things (IoT) implementations, for example, if
you have an application monitoring temperature and you’re unable to get readings, instead of
continuing to poll the sensors for data and overload them, you can use exponential backoffs
to limit the number of requests sent. This way, you allow the sensor various random time
periods between requests.
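A hedged sketch of exponential backoff with random jitter for the sensor-polling scenario follows; read_sensor, the retry limit, and the delay values are illustrative assumptions.

import random
import time

def poll_with_backoff(read_sensor, max_retries=5, base_delay=1.0, max_delay=60.0):
    """Retry a failing read, roughly doubling the wait (plus jitter) after each attempt."""
    for attempt in range(max_retries):
        reading = read_sensor()
        if reading is not None:
            return reading
        delay = min(max_delay, base_delay * (2 ** attempt))
        time.sleep(delay + random.uniform(0, 1))   # jitter spreads out competing retries
    raise TimeoutError("sensor did not respond after backing off")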
Observability has three pillars or telemetry types that you, as application developer or
designer, need to take into consideration:
■ Logging: Logging tracks events and their timestamps. Logs and log types are also
important.
■ Metrics or time-series metrics: These metrics can simply be defined as system performance parameters (or application health measures) and are usually measured per unit of time. As mentioned in “Latency and Rate Limiting in Application Design and
Performance,” response time, sessions per second, transactions per second—all are
examples of metrics.
■ Tracing: Tracing is the ability to track multiple events or a series of distributed events
through a system.
When designing and building an application, you need to keep in mind that failures or faults
will happen. The failures may be related to a software bug, a hardware issue, the network,
misuse, or many other reasons. Having the right type of telemetry capabilities and the right
instrumentation to collect is important and must be considered in the design phase. In addi-
tion, planning for telemetry or observability could influence your choice of hardware, oper-
ating system, or even your choice of programming language. Always design with the end
goal in mind and coordinate your workflows to include the three types of telemetry.
Logging
Logging has been used for decades and, when used correctly, has proven to be one of
the most useful telemetry tools. Logging is easy and straightforward, and it provides a great
deal of information about the state or health of an application (a process).
Logging could be as simple as using a print() statement in various locations of your code to
display some variable’s value. However, using print() everywhere you want to log an event or
a value may complicate your code and render it inefficient or “unclean.” For that reason,
you should consider (and understand) the logging capabilities of your programming language.
Doing that provides a clean standard for logging with consistent logging levels and message
formats.
Python, for example, provides a logging module that is enabled like this:
import logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logging.info('Welcome to DEVCOR')
When displayed, it looks like this, with day and timestamps:
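The output is a single line per event, similar to the following illustrative example (the actual timestamp depends on when the script runs):
2022-05-19 17:51:00,123 - INFO - Welcome to DEVCOR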
NOTE Logging is just like any other function or process; it requires careful design and
implementation consideration. For example, is it used as a development aid or for exception
logging? Will logging of the various levels and message formats tax the CPU? How granular
do you want your debugging to be?
Having a logging framework (or standard) is important for consistency, efficiency, and docu-
mentation across multiple development teams. On the other hand, if you’ve been working
with Cisco devices and you need to capture system messages, then it is good to familiarize
yourself with Cisco’s System Message Logging, which uses the IETF RFC 5424 (The Syslog
Protocol) definitions shown in Table 3-4.
Table 3-4 Syslog Message Severities as Defined by the IETF RFC 5424
NOTE Logging is an important part of good programming practices that require efficient,
readable, and well-documented code. Be sure to make your messages informative, clear, spe-
cific, and timestamped. You also need to think about where to capture and store the logs and
how long they should be retained.
Metrics
Metrics are system performance parameters (or application health measures) and are usually
measured per unit of time. Simple and common examples are application response time,
requests per second, sessions per second, and transactions per second.
Defining the right metrics to be measured or monitored for your project should be based on
the goal, function, or problem you’re trying to solve with your application. Whenever you’re
executing software, you must measure CPU usage, memory usage, swap space, and a few
other things related to the environment.
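As a small illustration, the following sketch samples a few of those environment metrics with the third-party psutil library (installable with pip); the one-second interval and print format are arbitrary choices for this example:

import psutil

# Sample a few basic environment metrics roughly once per second
while True:
    cpu = psutil.cpu_percent(interval=1)        # blocks for one second while measuring
    mem = psutil.virtual_memory().percent       # percentage of RAM in use
    swap = psutil.swap_memory().percent         # percentage of swap space in use
    print(f"cpu={cpu}% mem={mem}% swap={swap}%")

In practice, these samples would be exported to a time series database or monitoring system rather than printed.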
Network and application performance monitoring has seen major improvements and advance-
ments in the last two decades, and it is still improving as computer and network hardware
includes advanced diagnostics and telemetry embedded into the system. But there are a few
limitations that arise as you start looking at distributed and cloud-based environments where
physical and virtual boundaries restrict metric monitoring.
Application-level monitoring has become an absolute necessity, allowing you to monitor all
parameters affecting application performance throughout the path from the client to the
application, wherever it may be. Another important function of observability should be the
ability to correlate events captured across all services, all paths, hardware, and software.
Figure 3-8 shows a small subset of services and components affecting application perfor-
mance and subsequently affecting business performance.
Monitoring various parameters at a granular level has been the focus of a new generation of
tools called application performance monitoring (APM) applications. AppDynamics is
an example of an APM. AppDynamics utilizes agents (plug-ins or extensions) sitting across the
entire application ecosystem that monitor the performance of application code, runtime environ-
ments, and interactions. The agents send real-time data to the controllers for visualization and
further instructions. The data is used for mapping dependencies, business transaction monitoring,
anomaly detection, root cause diagnostics, and analytics. Figure 3-8 shows the TeaStore applica-
tion with AppDynamics monitoring various aspects of the application, including the business
context, thereby allowing the business to prioritize what’s important.
AppDynamics exposes various APIs for customizing and extending the feature set on the
platform side, which are served by the Controller and Events Service, and on the agent side.
Tracing
Traces are an important part of observability because they provide information about the
structure of a transaction and the path or route taken (requests and responses). In the previ-
ous section, we discussed APMs and how they may be able to provide you with applica-
tion dependency mapping. With tracing, you can understand the various services used for
a transaction or a request. In the world of networking, ICMP echo and echo reply as well as
Layer 2 traces enable you to discover the path a trace takes and which network nodes or
devices deliver it. Application tracing is similar.
With cloud or distributed environments, you see the term distributed tracing more often
than just tracing; however, conceptually they are the same. With tracing, you can determine,
among other things (a brief instrumentation sketch follows this list)
■ What routes or paths are affected by failing services (useful for what-if modeling)
■ How individual queries or requests are traced through the system
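As one hedged example of what application-level trace instrumentation can look like, the following sketch uses the open-source OpenTelemetry Python SDK (pip install opentelemetry-sdk); the span names are invented, and the console exporter is used only so that the finished spans are visible:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Export finished spans to the console for demonstration purposes
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("checkout-request"):        # parent span
    with tracer.start_as_current_span("inventory-lookup"):    # child span
        pass   # call the inventory service here
    with tracer.start_as_current_span("payment-service"):
        pass   # call the payment service here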
Figure 3-9 is a generic example of what devices, networks, servers, or services can possibly
be in the path between the users and their applications. Many factors can affect the perfor-
mance of an application and subsequently the user experience.
Figure 3-9 Internet and Network Path That Connects Users and User Services
You will probably not see deep-level tracing emphasized during your DEVCOR studies, but
it is a good practice when developing distributed applications and microservices.
AppDynamics (explained earlier), combined with digital experience monitoring systems like
ThousandEyes, can give you the best view of the application and business parameters with
the least amount of work. Figure 3-10 shows an application analysis display from the
ThousandEyes platform.
NOTE Because modules are a part of the overall system, all modules should describe
how they participate in the overall system and what part or service they provide and how
it relates to other parts. Sound familiar? This sounds like a combination of monitoring and
tracing but in text format.
■ Interfaces
■ Implementation information
■ Test information
■ Implementation constraints
■ Revision history
The list, of course, is at the module level, but you cannot forget code comments or low-level
documentation.
NOTE Code comments and low-level documentation are as important as all higher-level
documentation, if not the most important. They give you a window into the developers’
thought processes at the time of writing. They also simplify troubleshooting and make
future modifications faster.
no real purpose. In addition to proper storage, the data needs to be accessed quickly, survive
potential failures, and scale robustly as the amount of data collected grows larger and larger.
Looking through these points of consideration, you can easily see that the selection of a
proper database for the application and potential dataset can make or break the success of
an application.
■ Relational: Relational databases are typically referred to as SQL databases, for their
use of Structured Query Language (SQL) to gather information from the store.
Relational databases are constructed of one or more tables of information, with each
table containing rows of information defined by a key, or unique identifying value. To
gather information from the database, you might be required to join multiple tables
together, based on relational attributes shared between the tables (such as a database
that needs to query the transaction logs for an item [one table] along with the inven-
tory records of that same item [another table] for audit purposes). These types of data-
bases are best used with structured data, due to the way the data must be split to align
within a given table.
Examples: PostgreSQL, MySQL (a small join sketch follows this list)
■ Time series: On the surface, time series databases (TSDB) can almost be considered
relational databases because they provide unique keys (driven by the timestamps of
the data being ingested into the database) and the data being stored is very uniformly
structured in nature. However, time series databases are unique in that there is rarely
a relational component or specialized linked tables within a TSDB. The uniformity of
the data (composed of a timestamp and some payload of data) allows for specialized
compression and storage algorithms not possible in other databases. TSDBs also gener-
ally hold onto data only for a predetermined time, rather than forever, allowing them
to be deployed in smaller footprints than traditional databases.
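To make the relational join described above concrete, here is a small, self-contained sketch using Python's built-in sqlite3 module; the table names, columns, and data are invented purely to illustrate joining a transaction log with an inventory record for audit purposes:

import sqlite3

conn = sqlite3.connect(":memory:")           # throwaway in-memory database
conn.executescript("""
    CREATE TABLE inventory (item_id INTEGER PRIMARY KEY, name TEXT, on_hand INTEGER);
    CREATE TABLE transactions (tx_id INTEGER PRIMARY KEY, item_id INTEGER, qty INTEGER);
    INSERT INTO inventory VALUES (1, 'optical transceiver', 40);
    INSERT INTO transactions VALUES (100, 1, -5), (101, 1, -3);
""")

# Join the transaction log with the inventory record for the same item
rows = conn.execute("""
    SELECT i.name, i.on_hand, t.tx_id, t.qty
    FROM inventory AS i
    JOIN transactions AS t ON t.item_id = i.item_id
""").fetchall()
print(rows)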
Data Volume
Data volume is self-explanatory on the surface; it’s the amount of data required to be held
within a given database. Certain databases can perform up until a given threshold of size,
at which point they either become unstable (at best) or unusable (at worst). This is due to
the structure in which the data is stored and the ability of the data to be queried after it
resides inside the database. If the data cannot be extracted after it has been placed within
a database, it’s no longer a database, but more like a vault. This can be a factor for the raw
amount of data being stored (especially for NoSQL databases, particularly ones focused on
documents) or the amount of RAM being dedicated to the store (in the case of Redis, which
holds its working data set in memory).
Data Velocity
Much like data volume, data velocity is also self-explanatory, and refers to the speed at
which the database must be able to ingest data from the various sources of the application.
While this seems like a simple metric (faster is always better, right?), velocity isn’t just a
product of how fast records can be processed but must also consider the patterns in which
the data is sent. In some instances, like those that use TSDBs, data is sent at relatively consis-
tent intervals, and performance is scaled through optimizing the database for the data being
ingested and the underlying subsystem supporting the database (CPU, RAM, and disk).
However, if a database is supporting an application with a bursty traffic profile, it may be
impossible to find a database that is performant enough, depending on the overall load
to the system. In this case, some sort of front-end cache system to the database may be
required to ensure that records are not dropped or lost before they are committed to the
database. In other cases, the choice of database may allow for distribution/sharding of the
overall database to ensure scale, with the idea that the database peers will update themselves
or find quorum on the data written at some point in the future (in the case of NoSQL-based
databases that support horizontal scale-out). Anticipating the baseline and peak traffic pro-
files for the database, along with ways to mitigate growth in velocity requirements, will help
ensure that a selected database will have the ability to scale with growth.
Data Variety
While the preceding two considerations are applicable to all databases, variety has a limited
applicability to most databases, outside of those meant for document archival or big data.
Variety refers to how much the incoming data will differ from the previous set. In most
instances, the data received from the application is of a normalized set: time-series data from
an IoT sensor, for example, or a standard webform entry. Although these sets of data may
have varying levels of structure and payload, they are generally uniform in presentation and
can be easily handled within a semi-rigid database structure, indexed, and queried properly.
Because different input sources to a database can be kept separate, normalization needs to
happen only per source; adding a new IoT sensor to a TSDB, for example, might require only
a new bucket within InfluxDB rather than a new database or a complete pre-filtering of the
sensor data to align it with what already exists.
In the case of a document database, the types and structures of the inputs could vary dras-
tically. In the cases of some of the largest databases supporting social media platforms,
indexing of images, video, and audio must occur within the same database. With corpo-
rate document databases, the file types could be endless, including PDFs, word processor
documents, spreadsheets, presentations, and even archived emails of several different for-
mats. Understanding the potential variety of data (if there needs to be any) can serve as a
very powerful filter for database selection.
■ Git Workflow: This section discusses the basic Git workflow agreement: how to manage
access to the code, who contributes, and who is trusted to manage the workflow.
The basics of branching and forking are also discussed.
■ Git Branching Strategy: This section discusses a strategy for managing the code devel-
opment and stabilization teams and processes. This is a strategy that all developers
must understand and agree upon.
This chapter maps to the first part of the Developing Applications Using Cisco Core
Platforms and APIs v1.0 (350-901) Exam Blueprint Section 1.0, “Infrastructure and Automa-
tion,” specifically subsections 1.10 and 1.11.
This chapter describes version control, version control systems (VCSs) in general, and then
goes on to describe Git as a version control system. The concept of version control is simple
and used in every aspect of our lives: documents, applications, operating systems, webpages,
standards, frameworks—you get the idea. The code you build for automating and orchestrat-
ing your network is no different. We’re confident that you’re not building the code by your-
self, and we’re confident you’re not building it in one sitting. You’re collaborating with others,
most probably in an agile development process with multiple releases and sprints under
way. This chapter teaches you how to keep track of the development process.
Foundation Topics
Version Control and Git
When we discussed maintainability and modifiability in the preceding chapter, we started
with a statement that fits just as well here: “Change is constant.” Therefore, a system
or methodology for tracking changes to code, who made them, and for what purpose
becomes a very important component of a project. Version control, or simply version-
ing, can be at all levels of the development process: a module, function, feature, system, or
application.
A number of popular systems are in use today, and most of the time the one you use will
be the one that your company supports and that your development team is currently using.
Git is such a system; it has gained a lot of popularity recently for its ease of use (at least the
latest versions) and simplicity, and because it’s open source. In the next few sections, we
describe Git and look at some of its basic and advanced features.
Git differentiates itself in being fast and efficient, and also by having the following
interesting features:
■ A typical VCS stores deltas, or file changes, over a period of time (delta-based version
control), whereas Git stores snapshots of files and the filesystem.
■ Almost all operations are local to your system or computer. All related history is stored
locally and can be accessed regardless of whether you’re online or offline.
■ Data in Git is verified for integrity through a checksum algorithm. A checksum is pro-
duced before storing a file, and the checksum is used to refer to the file or data.
For more information about Git and how it works, we highly recommend you refer to the
free book written by Scott Chacon and Ben Straub and contributed to by the Git developer
community: Pro Git, Version 2.1.240. It was used as a reference for this book. You can refer
to the book and to online documentation for installing and learning Git.
Git Workflow
When multiple people need to collaborate and contribute to the same Git repository for a
project, there needs to be a working agreement on how that work will be done. A team must
agree on an operating model for coordinating their contributions to a shared, source-of-truth
code repository.
The Git Workflow that a team selects depends on the answers to questions such as these:
■ Are all contributors known up front and trusted with repository read-write access, or
are some contributors untrusted?
■ Will users follow the more complex workflow that is used by the vast majority of
open-source projects?
■ Or will they follow a simpler workflow that novice users of Git may more easily
understand?
The following sections detail the two primary workflows that teams use: the Branch and Pull
Workflow and the Fork and Pull Workflow.
Branch and Pull Workflow
The Branch and Pull Workflow assumes the following:
■ Shared repo access policies: All contributors have read-write access to the shared
repository.
■ Trusted contributors: All contributors are known up front and trusted with repository
read-write access.
■ Novice versus intermediate Git users: The team requires a simpler workflow that nov-
ice users of Git may easily understand.
The Branch and Pull Workflow has the following pros and cons.
Pros
■ This model requires only simple knowledge of Git, only incrementally different from
working solo on a Git repository because only a single Git repo is involved.
■ Because there are no forks, the user does not need to understand how to manage
working with distributed Git repos (that is, synchronizing the same source code among
many repos).
Cons
■ This model diverges from the way that the vast majority of open-source software
works (most open-source software uses the Fork and Pull Workflow).
■ All code contributors must have read-write access to the shared, source-of-truth
repository.
■ This workflow is less safe because all team members have read-write access and are
pushing to the same source of truth, allowing the possibility of overwriting each
other’s branches, especially shared branches such as the main or master.
■ This model cannot work for untrusted contributors who do not have write access to
the shared repository.
Sample Setup
Next, let’s look at a sample setup, followed by the actual Branch and Pull example.
1. Assume that you have configured your github.com user profile with an SSH key.
■ This allows you to execute remote Git operations such as clone and push using
SSH-key authentication.
■ See https://github.com/bluecodemonks.
■ See https://github.com/bluecodemonks/go-hello-world.
4. Assume that you have already been added as a read-write contributor to the project.
Figure 4-1 shows the go-hello-world project.
NOTE You cannot follow along with this exercise using this specific repository because
you don’t have write access to the repo at https://github.com/bluecodemonks/go-hello-
world. This illustrates the limitations of the Branch and Pull Workflow. The Fork and
Pull Workflow, by contrast, does not have this limitation, as you will see in a subsequent
exercise.
Step 4. Notice that a Git remote repository configuration has already been set up
because you cloned from a remote repository.
■ This repository is named origin (by default).
■ It refers to the original URI from which this repo was cloned.
$ git remote -v
origin [email protected]:bluecodemonks/go-hello-world.git
(fetch)
origin [email protected]:bluecodemonks/go-hello-world.git
(push)
Step 5. Look at what branch you are currently on and what branches
are available.
■ You are on the main branch, denoted by the asterisk (*) character.
■ Your local main branch is at the same commit as the remote origin/main
branch. It is set to track the origin/main remote upstream by default (see
Figure 4-3). Thus, if you execute commands such as git push without
fully qualifying where to push, it assumes the target is the default ori-
gin/main remote upstream. To push to a specific git remote and spe-
cific branch on that remote, you may fully qualify the target git push
<remote>/<branch>.
$ git status
On branch update-readme
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: README.md
no changes added to commit (use "git add" and/or "git commit -a")
$ git diff
diff --git a/README.md b/README.md
index f272b4d..8e86322 100644
--- a/README.md
+++ b/README.md
@@ -11,4 +11,10 @@ Run
```
$ go-hello-world
Hello World
-```
\ No newline at end of file
+```
+
+Uninstall
+
+```
+$ rm $GOPATH/bin/go-hello-world
+```
$ git remote -v
origin [email protected]:bluecodemonks/go-hello-world.git
(fetch)
origin [email protected]:bluecodemonks/go-hello-world.git
(push)
■ Notice that you are currently on the update-readme local branch and that
the origin git remote has no such branch:
Example 4-4 Branch and Pull: Pushing a Branch to the Origin Repo
■ Now, verify that the remote origin has a new branch named update-readme:
Step 13. You are taken to the Open a Pull Request page (as shown in Figure 4-7).
■ Notice that you are asking to merge the update-readme branch into the main
branch.
■ This is your opportunity to provide a name for the pull request, as well as
add additional comments if necessary.
Step 14. Scroll down and inspect the proposed code changes. In Figure 4-8, notice the
changes in the lower window, shown as “showing 1 changed file with 7 addi-
tions and 1 deletion.”
Figure 4-8 Branch and Pull: Reviewing Pull Request Changes Before Submitting
■ If the code changes are not what you expected, you may make additional
commits in your local update-readme branch and then push the additional
commits to the origin/update-readme branch, which will then automatically
update the pull request.
Step 15. Click the Create Pull Request button. Figure 4-9 clearly illustrates the pull
request, and Figure 4-10 shows that the pull request is complete and ready for
review.
■ If the code reviewer requests changes, the reviewer can comment on the
code, and you may fix the identified issues by making additional commits
in your local update-readme branch and then pushing the additional commits
to the origin/update-readme branch, which then automatically updates
the PR.
Step 17. Now that your pull request has been code reviewed and approved, you may
merge your PR by clicking the Merge Pull Request button. Figure 4-12 shows
the approval.
Step 18. The PR will be fully merged after the Confirm button is clicked. For the pur-
poses of maintaining this sample repository, do not merge this PR; instead,
leave it in the open state for people to inspect.
Step 19. Because you pushed your branch update-readme to a shared repository, com-
mon courtesy is to delete your branch from the shared repository.
■ Note that this step is unnecessary in the Fork and Pull Workflow.
Fork and Pull Workflow
The Fork and Pull Workflow assumes the following:
■ Shared repo access policies: Contributors have, at minimum, read-only access to the
shared repository.
■ Novice versus intermediate Git users: The team is capable of following a more com-
plex workflow, especially if they are already familiar with the open-source model.
This Fork and Pull Workflow has the following pros and cons.
Pros
■ This model matches the workflow of the vast majority of open-source software.
■ This workflow is safer because, by default, team members push changes to their own
forks before submitting a PR (instead of pushing to a shared repository).
■ This model can work for untrusted contributors who do not have write access to the
shared repository.
Cons
■ This model requires a higher competence of Git, beyond working in a solo repository.
Sample Setup
Let’s look at a sample setup, followed immediately by the actual Fork and Pull Workflow.
Step 1. Assume that you have configured your github.com user profile with an SSH
key.
■ This allows you to execute remote Git operations such as clone and push
using SSH-key authentication.
Step 2. Assume that bluecodemonks is a GitHub organization that has a code reposi-
tory that you would like to contribute to.
■ Assume also that the organization already exists, regardless of whether or
not you are a member.
■ See https://github.com/bluecodemonks.
Step 3. Assume that you would like to contribute to the go-hello-world repository.
■ The community has agreed to use this repository as the shared, source of
truth for the go-hello-world project.
■ See https://github.com/bluecodemonks/go-hello-world.
Step 4. Assume that you have read-only access to this project, which has been made
publicly readable by the project maintainers so that anyone on the Internet may
view the source code. Figure 4-14 shows the bluecodemonks organization’s
project called go-hello-world.
■ This organization might not know you, but you are still able to contribute!
■ After clicking the Fork button, you might be prompted to specify the target
location for the fork. If this is the case, choose your personal GitHub organi-
zation: https://github.com/<your-github-id>.
Step 3. Your fork is now tracking the shared, source-of-truth team repo.
■ A fork is essentially a clone of the original GitHub repo, but it is stored in a
different GitHub organization rather than on your local disk.
GitHub tracks the parent of the fork, to set defaults such as default targets
for pull request initiation. Notice the difference between the original and the
clone (fork) in the logical representation shown in Figure 4-17.
Figure 4-18 Fork and Pull: Locally Cloned from Fork of Origin Repo
Step 6. Look around the newly cloned repository:
$ cd ~/Dev/go-hello-world
$ tree
.
├── LICENSE
├── README.md
├── go.mod
└── main.go
0 directories, 4 files
Step 7. Notice that a Git remote repository configuration has already been set up
because you cloned from a remote repository.
■ This repository is named origin (by default).
■ It refers to the original URI from which this repo was cloned, which is
the personal repository fork, instead of the shared, source-of-truth
bluecodemonks.
$ git remote -v
origin [email protected]:dcwangmit01/go-hello-world.git
(fetch)
origin [email protected]:dcwangmit01/go-hello-world.git
(push)
Step 8. Add an additional Git remote repository configuration for the original shared,
source-of-truth repository.
■ Explicitly name the remote repository upstream, as in the original repo that
has been forked.
$ git remote -v
origin [email protected]:bluecodemonks/go-hello-world.git
(fetch)
origin [email protected]:bluecodemonks/go-hello-world.git
(push)
jack [email protected]:jack/go-hello-world.git (fetch)
jack [email protected]:jack/go-hello-world.git (push)
jill [email protected]:jill/go-hello-world.git (fetch)
Step 9. Look at what branch you are currently on and what branches are available.
■ You are on the main branch, denoted by the asterisk (*) character.
■ Your local main branch is at the same commit as the remote origin/main
branch. It is set to track the origin/main remote upstream by default. Thus,
if you execute commands such as git push without fully qualifying where
to push, it assumes the target is the default origin/main remote upstream. To
push to a specific Git remote and specific branch on that remote, you may
fully qualify the target git push <remote>/<branch>. Note the logical rep-
resentation in Figure 4-19.
Step 10. Now it’s time to make a contribution. First, create a local branch to store your
local changes. Because you are currently on the main branch, the new branch is
created from main. The steps are illustrated in Example 4-5 and logically repre-
sented in Figure 4-20.
Example 4-5 Fork and Pull: Creating a Branch
# Check what branch we are on
$ git branch
* main
# Create a new local branch from main and switch to it
$ git checkout -b update-readme-2
# View your branch status
$ git branch -avv
main ad0bbc5 [origin/main] Added install and run instructions
* update-readme-2 ad0bbc5 Added install and run instructions
remotes/origin/HEAD -> origin/main
remotes/origin/main ad0bbc5 Added install and run instructions
remotes/origin/update-readme bb63877 Update README.md with uninstall instructions
$ git status
On branch update-readme-2
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: README.md
no changes added to commit (use "git add" and/or "git commit -a")
$ git diff
diff --git a/README.md b/README.md
index f272b4d..567f23b 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,7 @@
# go-hello-world
```
# go-hello-world
```
$ git remote -v
origin [email protected]:dcwangmit01/go-hello-world.git
(fetch)
origin [email protected]:dcwangmit01/go-hello-world.git
(push)
upstream [email protected]:bluecodemonks/go-hello-world.git
(fetch)
upstream [email protected]:bluecodemonks/go-hello-world.git
(push)
■ Notice that you are currently on the update-readme-2 local branch and that
the origin Git remote has no such branch:
■ Now verify that the remote origin has a new branch named update-readme-2:
Step 17. Now initiate a pull request from your personal fork to the shared, source-of-
truth repo on the GitHub UI by clicking the Compare & Pull Request button.
You may initiate this pull request from either your personal fork or the shared,
source-of-truth repo. In the example, shown here in Figure 4-22, the latter is
chosen.
Step 18. You are taken to the Open a Pull Request page.
■ Notice that you are asking to merge the update-readme-2 branch into the
main branch.
■ This is your opportunity to provide a name for the pull request, as well as
add additional comments if necessary.
■ Figure 4-23 shows the Open a Pull Request page and the text box for adding
your comments.
Step 19. Scroll down and inspect the proposed code changes. Figure 4-24, which is
a continuation of Figure 4-23, shows the proposed code changes for review
before submitting.
■ If the code changes are not what you expected, you may make additional
commits in your local update-readme-2 branch and then push the additional
commits to the origin/update-readme-2 branch, which then automatically
updates the PR.
Step 20. Click the Create Pull Request button. Figure 4-25 shows the Create pull
request step, and it is immediately followed by Figure 4-26 showing the pull
request complete.
Figure 4-24 Fork and Pull: Reviewing Pull Request Changes Before Submitting
■ If the code reviewer requests changes, the reviewer can comment on the
code, and you may fix the identified issues by making additional commits in
your local update-readme-2 branch and then pushing the additional commits
to the origin/update-readme-2 branch, which then automatically updates
the PR.
Step 22. Now that your pull request has been code reviewed and approved, you may
merge your PR by clicking the Merge Pull Request button. Figure 4-28 shows
that the approval has been obtained (for example, 1 approval), and now it is
ready for Merge.
Step 23. Now the pull request will be fully merged after the Confirm button is clicked.
For the purposes of maintaining this sample repository, do not merge this PR;
instead, leave it in the open state for people to inspect.
■ Stabilizing code: Stabilizes code by reducing the rate of code changes
The need for code development for new features versus stabilizing code for release is a con-
stant push and pull. Thus, a team needs a working agreement on managing this process.
You can adopt the GitHub Flow branching strategy when the following conditions are true:
1. You are able to keep a stable main branch because tests can block the PR merge.
2. You are able to release to production several times a day.
3. You can tag versions of your release directly from the main branch.
4. You have no need for release branches to stabilize your product.
5. You have no need for code freezes to stabilize your product.
6. You have no dependencies on manual quality assurance (QA) to stabilize your product.
Here’s a visualization of the GitHub Flow model. Notice that a single, linear main branch
exists with only feature branches and without the existence of release branches. This exam-
ple (shown in Figure 4-30) is the simplest of the Git branching strategies; it keeps a single
main branch stable.
If you’re not able to trust your automated tests because they are not comprehensive, then
GitHub Flow does not work. Instead, use a different model called Git Flow.
Most of the open-source world uses GitHub Flow.
A corresponding Git Flow visualization (not reproduced here) shows a “develop” branch accepting changes from feature branches (not shown), with merges producing tagged releases such as 1.0.0 and 2.0.0.
From the main page shown in the figure, an administrator may add a branch protection rule.
A common practice is to add a branch protection rule for the main branch.
The main shared branch is often a branch that needs protection, affecting important practices
such as
References
Pro Git: Everything You Need to Know About Git: https://github.com/progit/progit2
GitHub: https://github.com
Network APIs
■ Calling an API: This section covers how to call an API using Cisco platforms or others
by using its Uniform Resource Identifier (URI).
■ Selecting an API Style: This section discusses using either Open API (Swagger) or
JSON (JavaScript Object Notation) when making design decisions and doing so in a
machine-parsable format.
■ Network API Styles: This section covers both REST and YANG styles, addressing web
APIs and network device APIs and the differences in the main abstractions they build on.
This chapter maps to the Developing Applications Using Cisco Core Platforms and APIs
v1.0 (350-901) Exam Blueprint Section 3.0, “Cisco Platforms.”
Software developers use application programming interfaces (APIs) to communicate with
and configure networks. APIs are used to communicate with applications and other soft-
ware. They are also used to communicate with various components of a network through
software. This chapter focuses on what APIs are used for, how to use them, and the differ-
ent API styles including data types and REST/RPCs. You learn how to keep the consumer in
mind when you’re designing an API so that developers who are not familiar with your prod-
uct can easily understand your API. In addition, you see the importance of consuming and
visualizing RESTful web services.
Foundation Topics
What Are APIs?
Most of today’s automation workflows and integration use application programming inter-
faces for deployments, validation, and pipelines. All applications expose some sort of API
that governs how an application can be accessed by other applications. An API provides a
set of routines, protocols, tools, and documentation that can be leveraged for programmatic
interaction. The API represents a way in which elements or applications can be programmati-
cally controlled and described. It can also allow external applications to gain access to capa-
bilities and functions within another application.
APIs have the following primary components:
■ Methods
■ Objects
■ Formats
Methods
The most commonly used Hypertext Transfer Protocol verbs (or methods, as they are more
correctly termed) indicate the intent of the API call; the resources they act on are referred
to as nouns. A RESTful architecture API can describe the operations available, such as GET,
POST, PUT, PATCH, and DELETE. These operations are described in more detail here, and
a brief request sketch follows this discussion:
■ The GET request enables you to retrieve information only. The GET request does
not modify or update any information. Because the GET request does not update
or change the state of the devices, it is often referred to as a safe method. GET APIs
typically are idempotent. An idempotent operation is one in which a request can be
retransmitted or retried with no additional side effects; this means that making mul-
tiple identical requests must produce the same result every time until another API
(POST or PUT) has changed the state of the device on the server. For example, you
need an application to retrieve all the users who are listed as Admin on a device or
devices. The GET request would return the same data repeatedly until another request
adds, changes, or removes a user.
■ A POST request enables you or your application to create new information that does
not yet exist on your devices or in your application. A POST request is neither safe nor
idempotent. This means that if you create a new user called Bob on your application
and send this as a POST request, the new user Bob is created. If you then send the same
request four more times, you end up with five users, all called Bob. Using POST for
create operations is considered best practice.
■ The PUT request enables you to update a piece of information that is already present
on your device or application. The PUT method is idempotent: retrying the request
several times is equivalent to sending it once.
After you create Bob as a new user, for example, you want to update his address.
Sending a PUT with these details would add his address to his information. If you send
this request again, the PUT replaces Bob’s address in its entirety and overwrites the
current resource.
■ The PATCH request can be used when you require partial resources or need to modify
an existing resource. The PATCH request is neither safe nor idempotent. The main
difference between the PUT and PATCH requests is in the way the server processes
the enclosed entity that will modify the resource, identified by the Request-URI. For
example, Bob bought a new phone, and you need to update just his phone number
records with his new number. In this case, you would use PATCH to request updates
on part of the resource.
■ The DELETE request enables you to remove a resource. Sending a DELETE for a
resource that is not present would yield a Not Found message because there is noth-
ing to delete. Sadly, you learn that Bob has decided to leave the company and you are
asked to remove all of Bob’s information and details, so you complete the task and go
to lunch. Another member of the team picks up the request while you are at lunch and
runs an API call to delete Bob’s details. That team member is then met with a 404 Not
Found error message.
Table 5-2 summarizes the API methods and whether they are idempotent and considered safe
or not safe.
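The following hedged sketch exercises these methods with the Python Requests library against a hypothetical user-management endpoint; the base URL, paths, and payloads are invented purely for illustration:

import requests

BASE = "https://api.example.com/v1"          # hypothetical API endpoint

# GET: safe and idempotent; repeating it does not change server state
users = requests.get(f"{BASE}/users", params={"role": "admin"})

# POST: creates a new resource; repeating it would create duplicate "Bob" users
new_user = requests.post(f"{BASE}/users", json={"name": "Bob"})

# PUT: replaces the resource in its entirety; repeating it yields the same result
requests.put(f"{BASE}/users/bob", json={"name": "Bob", "address": "123 Main St"})

# PATCH: modifies part of the resource (for example, just the phone number)
requests.patch(f"{BASE}/users/bob", json={"phone": "+1-555-0100"})

# DELETE: removes the resource; a second DELETE returns 404 Not Found
requests.delete(f"{BASE}/users/bob")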
Objects
An object is a resource that a user is trying to access. In the RESTful architecture, this
object is often referred to as a noun, and it is typically a Uniform Resource Identifier (URI).
A URI can be described in the same way that an IP address is described. It is unique—just like
your home address or your office address, with a number or name; street; city; and depend-
ing on the country, a state, county, or province. The URI is a digital version of your address or
location. Details can be sent to this address, and the sender knows this is a valid address. Using
the URI is like going to a coffee shop and requesting a coffee, but in the URI’s case, this is a
resource you are asking for and not a steaming hot beverage—that would be a GET request.
Formats
A format is how data is represented, such as JavaScript Object Notation (JSON) or Exten-
sible Markup Language (XML).
Whether you are sending or receiving data, APIs use machine-readable formats such as JSON
or XML. These formats allow APIs to be represented as semistructured data. The main rea-
son for their use is that semistructured data is significantly easier to analyze than unstruc-
tured data. This is also why both are known as “self-describing” data structures.
Unlike JSON, which does not have a data binding contract, XML uses data binding and data
serialization. This distinction can be key because most APIs use structured data, not struc-
tured documents. The vast majority of APIs today use JSON, but some do use XML.
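As a small illustration of the two formats, the following sketch serializes the same record to JSON with the standard json module and to XML with xml.etree.ElementTree; the record and field names are invented:

import json
import xml.etree.ElementTree as ET

record = {"hostname": "edge-router-1", "status": "up"}

# JSON representation
print(json.dumps(record, indent=2))

# Equivalent XML representation
device = ET.Element("device")
for key, value in record.items():
    ET.SubElement(device, key).text = value
print(ET.tostring(device, encoding="unicode"))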
Figure 5-1 shows the workflow for an API request to a server and the API response.
APIs are also used to communicate with various components of a network through software.
In software-defined networking (SDN), northbound and southbound APIs are used to
describe how interfaces operate between the different planes—the data plane, control plane,
and application plane.
The northbound interface can provide such information as network topology and configura-
tion retrieval and provisioning services, and it facilitates the integration with the operations
support system (OSS) for fault management and flow-through provisioning. The southbound
interface defines the way the SDN controller should interact with the data plane (aka for-
warding plane) to modify the network so that it can better adapt to changing requirements.
Here are some examples of how the Meraki API can be leveraged through both
northbound and southbound APIs:
■ Building a dashboard for a store manager, field techs, or unique use cases
APIs vs. No API
All the Cisco controller-based platforms, such as Cisco SD-WAN vManage (software-defined
wide-area network), Cisco DNA Center, and Cisco Application Centric Infrastructure (ACI),
have a robust collection of APIs. This allows developer and engineering teams to build code
and integrate Cisco APIs into the workflow, CI/CD pipeline testing, and monitoring.
However, APIs are not just limited to Cisco controllers and devices. Cisco has APIs for gath-
ering data, such as the CCW Catalog API and Cisco PSIRT openVuln API. You can access
them at https://apiconsole.cisco.com/. They allow Cisco partners, customers, and internal
developers who need to access APIs to get Cisco data and services.
APIs can certainly be helpful for developers when they are maintained and supported, but
not every website or service has the ability to do so. Some websites and services do not pro-
vide an API, or some have had an API and have chosen to take down their API services from
their websites. Providers such as Twitter, Facebook, and Google are known for power-
fully built and well-documented APIs because they see the value for developers; this way, they
can provide an interface and keep their services secure at the same time. Whereas some pro-
viders get this right, numerous service providers have exposed an API to their services and
suffered a data breach, with company details being leaked along with their custom-
ers’ information (not good!). Even the big players of the tech world get this wrong
sometimes; some of the bigger companies have either significantly scaled down their APIs’
functionality or changed their API terms of service. Some companies have even shut down
their APIs altogether when there is serious risk of attack or data leaks, or they are not making
enough profit. But what if the service you want to consume does not provide an API?
Web Scraping
One of the most common gripes about APIs is that they are subject to change (as mentioned
previously for company policy, security, or an update). Such changes can leave developers
frustrated and with nonworking code or integrations. Another case is that there is no API,
the company has no desire to build one, or it just does not have the resources to do so.
When no API is present, or you simply wish to grab data another way, web scraping provides a
reliable alternative for assembling data. Here are some of the pros of web scraping (a brief
scraping sketch follows this list):
■ Rate limiting: A well-designed API has rate limiting; web scraping, in contrast,
does not. You can access the data swiftly, and as long as you are not generating vast
amounts of traffic to the website and setting off alarm bells in the provider’s DDoS ser-
vice, you will be fine.
■ Updates and changes: APIs are subject to changes over time. As much as a provider
does not wish to remove an API resource, it does happen. This is one reason that using
an undocumented API resource is not recommended because this capability can be
taken away at any time and without any notice! Web scraping can be done at any given
point and on any website or URL.
■ Data format and reliability: A well-designed and well-built API provides customers
access to the services they need and in a structured data format. However, some APIs
lack investment and might not provide the best format, so you might find yourself
spending more time cleaning up the data than you wished or receiving old data back
because the API has not been updated. In both cases, web scraping can be custom
driven, giving you good data back.
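A minimal, hedged sketch of web scraping with Requests and the third-party Beautiful Soup library (pip install beautifulsoup4) follows; the URL is a placeholder, and any real scraping should respect the site's terms of use and robots.txt:

import requests
from bs4 import BeautifulSoup

url = "https://example.com"                 # placeholder page to scrape
html = requests.get(url, timeout=10).text

soup = BeautifulSoup(html, "html.parser")
# Extract the page title and every hyperlink on the page
print(soup.title.string if soup.title else "no title")
for link in soup.find_all("a"):
    print(link.get("href"))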
Figure 5-2 shows web scraping, web harvesting, or web data extraction, which is data scrap-
ing used for extracting data from websites.
Jeff Bezos’s API Mandate: How the AWS API-Driven Cloud Was Born
Cloud computing and Amazon Web Services (AWS) changed the way that a lot of compa-
nies built their infrastructure. The huge surge in moving to the cloud prompted a change in
businesses and the way they updated their delivery model and their engineering team’s struc-
ture, and with this, the way they deployed, engineered, and designed their solutions.
In a cloud-first world, APIs were one of the key attributes at the forefront. In the early
2000s, Jeff Bezos’s API mandate was born. In his API mandate, he outlined the principles
and foundation as to how AWS could be more agile. He did not say what technologies his
team must use but defined only the outcomes of the systems. Here is the mandate he wrote
in the early 2000s.
All teams will henceforth expose their data and functionality through service interfaces.
By providing an external API so that customers can access data and connect their
systems, less guidance and handholding is needed for the customer. This is one of the
foundations for the huge success of Amazon and AWS; it helped build greater collaboration
and was a big money maker for the company.
Teams must communicate with each other through these interfaces.
There will be no other form of inter-process communication allowed: no direct linking, no
direct reads of another team’s data store, no shared-memory model, no backdoors what-
soever. The only communication allowed is via service interface calls over the network.
Imagine that the support and operations team would like customer data from the network
or the infrastructure because a customer has reported slow data loading and service page
timeouts. The customer wants to know the state of the load balancer—for example, how
many devices are in the VIP pool and which servers and VMs are taking the loads. For a lot
of companies, these teams do not have access to such devices (not even read-only access!),
which means they must find the owners or team for the devices and create a support request
ticket for this information. Creating a self-serve format can resolve many of these challenges.
Providing an API is only part of the puzzle; ensuring a standard and keeping to it can be
even harder. The expression “throwing it over the fence” is commonly heard when building
infrastructure; this means that after the team has built it, their work is done. Often, in this
method, getting support or help becomes almost impossible, so you or your customer is
left to figure out the details on your own. When you’re designing APIs, the format and the
schema become your own company’s mandate. Without this, how would you know whether
what you are designing and building works and delivers the correct level of details or when
an error is raised? If your company doesn’t form a standard methodology, the process can
get very complex very fast. When all teams adhere to the same process and take ownership
of documentation, this produces value for the whole company.
It doesn’t matter what technology you use.
Technology moves fast. The way the engineering team did things 10 years ago might not be
compatible with today’s technology. APIs should change, grow, and be adaptable, just as your
business does. As much as you can learn from other teams within an organization and as
much as you often try to use the same tools to get great consistency, this is often not possi-
ble for several reasons, such as team skill, different platform requirements, and even budgets.
All service interfaces, without exception, must be designed from the ground up to be
externalize-able. That is to say, the team must plan and design to be able to expose the
interface to developers in the outside world. No exceptions.
By reading this far, you likely have gathered that Bezos’s mandate is that of an “API First
World” or that APIs are “first-class citizens.” In an API First World, this assumes the design
and development of an API come before the implementation itself, not as an afterthought.
This method is built with developers in mind because the thinking is “How will developers
use this API?” Later, as feedback is provided from the consumers of the API, the functional-
ity of the API can be expanded and grown.
The mandate ended with this message:
“Anyone who doesn’t do this will be fired. Thank you; have a nice day!”
The lessons learned from the mandate are that changes do not always come from the top, but
when they do, they send a very powerful message to an organization that this is the direction
to take. Such lessons greatly help clarify situations and provide clear guardrails for engineer-
ing teams and developers alike. Jeff Bezos was clearly passionate about APIs and saw their
value for Amazon. Building an API-first culture allowed a singular stance and position in
which his API mandate could serve to structure Amazon’s API organizations effectively.
Calling an API
Now that we have established the principles behind APIs and where they are used in today’s
systems and organizations, we can investigate how to call an API and the steps required. In
this example, you can assume that the API call is being made by a user to an external API.
■ User: The person who makes a request. The user makes a call to the API via its URI.
■ Client: The computer that sends the request to the server, giving a request verb, head-
ers, and optionally, a request body in a Python script.
■ Server: The computer that replies to the request. The API gives the data to the initial
requesting program.
To construct an API request, you need to know the following information for the API you
are calling. Good API documentation provides all these steps and details to successfully use
and consume the API resources. One of the key elements of API documentation is the meth-
ods and which methods can be used with the API’s resources.
The URL is the endpoint you are intending to call; the endpoint is the API point of entry in
this most pivotal piece of an API.
This example uses a GET request to send an API call to Cisco DNA Center. Network health
is part of the intent API, and the information is returned by client type: wired and wireless.
The API call being made is ‘/dna/intent/api/v1/network-health’. The response details are
parsed to provide the information.
NETWORK_HEALTH = '/dna/intent/api/v1/network-health'
# requests is assumed to be imported, and BASE_URL and headers (including the
# authentication token) are assumed to be defined elsewhere in the script
response = requests.get(BASE_URL + NETWORK_HEALTH,
                        headers=headers, verify=False)
network_health = response.json()['response']
print('Good count: {}, bad count: {}, health score: {}'.format(
    network_health[0]['goodCount'],
    network_health[0]['badCount'],
    network_health[0]['healthScore']
))
If authentication is used when you’re making the request, you need to know the type to use;
basic and OAuth are two common types of authentication. This example uses basic authen-
tication. The format of the credentials could be ‘USERNAME:PASSWORD’, and it needs to
be Base64-encoded.
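For example, a short sketch of producing that header value with Python's standard base64 module (the credentials shown are placeholders) might look like this:

import base64

credentials = 'USERNAME:PASSWORD'                       # placeholder values
encoded = base64.b64encode(credentials.encode('utf-8')).decode('utf-8')
headers = {'Authorization': f'Basic {encoded}'}
print(headers)

The Requests library can also build this header for you when you pass auth=(username, password) to a request.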
If the API requires any HTTP headers to be sent, the headers represent the metadata associ-
ated with the API request and response. As an example, here are some of the most common
API headers:
■ Authorization: “Basic”, plus username and password (per RFC 2617), identifies the
authorized user making this request.
In this example, basic authentication is used along with a request to retrieve a token from the
API ‘/dna/system/api/v1/auth/token’. The device URL, API call, username, and password are
passed in through a YAML file (not shown). The Content-Type, application/json, indicates
that the request body format is JSON:
AUTH_URL = '/dna/system/api/v1/auth/token'
USERNAME = '<USERNAME>'
PASSWORD = '<PASSWORD>'
"templateId": "3f7c91b6-4a17-4544-af59-390e51f1de45",
"targetInfo": [
"id": "10.10.21.80",
"type": "MANAGED_DEVICE_IP",
}
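A minimal sketch of the token request described above might look like the following, assuming the Requests library, that BASE_URL comes from the same YAML file, and that the controller returns the token in a JSON field named Token:

import requests
from requests.auth import HTTPBasicAuth

BASE_URL = 'https://<dnac-address>'        # placeholder; normally read from the YAML file
# AUTH_URL, USERNAME, and PASSWORD are the variables defined just above
response = requests.post(
    BASE_URL + AUTH_URL,
    auth=HTTPBasicAuth(USERNAME, PASSWORD),
    headers={'Content-Type': 'application/json'},
    verify=False
)
token = response.json()['Token']          # assumed name of the token field in the response
headers = {'X-Auth-Token': token, 'Content-Type': 'application/json'}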
Several public APIs (also known as open APIs) are available. A public HTTP/JSON
API is an application programming interface made publicly available to developers. These APIs
are published on the Internet and often shared for free or with a limited free tier use. Lists and
GitHub repositories provide many links to public APIs; for example, see https://github.com/
public-apis/public-apis. Here, developers can browse and build recommendation systems, clas-
sifiers, and many other machine-learning algorithms and tools on top of these APIs.
The IMDb API has a free tier service that enables developers to query the database of mov-
ies and TV shows. The free tier allows 100 API calls per day (24 hours). Once you reach the
limit, you cannot make further calls unless you upgrade or wait until the next day. The IMDb
API provides access to more than three million data items, including cast and crew informa-
tion, plot summaries, release dates, and ratings. IMDb provides full API documen-
tation and online testing; the documentation is also available in Swagger format.
NOTE To use the IMDb API documentation, you need an API key for testing APIs.
Figure 5-3 shows the online documentation and online testing for the IMDb API.
Figure 5-3 IMDb Online Documentation Page for API Testing and Reference
IMDb also provides Swagger-based testing and documentation of its APIs, as shown in
Figure 5-4.
and video games). IMDb’s identifiers always take the form of two letters, which signify
the type of entity being identified, followed by a sequence of at least seven numbers that
uniquely identify a specific entity of that type.
In Example 5-1, “tt0075148” is the unique identifier for the movie Rocky, where tt signifies
that it’s a title entity and 0075148 uniquely indicates Rocky.
Example 5-1 Performing an API Call Based on an IMDb Unique Identifier for a Movie Title
{
"searchType": "Title",
"expression": "tt0075148",
"results": [
{
"id": "tt0075148",
"resultType": "Title",
"image": "https://imdb-
api.com/images/original/MV5BMTY5MDMzODUyOF5BMl5BanBnXkFtZTcwMTQ3NTMyNA@@._V1_
Ratio0.6762_AL_.jpg",
"title": "Rocky",
"description": "1976"
}
],
"errorMessage": ""
}
actorList: [
{
id: "nm0000230",
image: "https://imdb-
api.com/images/original/MV5BMTQwMTk3NDU2OV5BMl5BanBnXkFtZTcwNTA3MTI0Mw@@._V1_
Ratio0.7273_AL_.jpg",
name: "Sylvester Stallone",
asCharacter: "Rocky"
},
Using Python and the IMDb API, you can access resources with the Python Requests
library. To gather all details on the Top 250 movies of all time, use the URL https://imdb-
api.com/en/API/Top250Movies/{API_KEY}, as shown in Example 5-3. IMDb’s APIs data set
is provided in JSON Lines file format. The files are UTF-8 encoded text files, where each line
in the file is a valid JSON string. Each JSON document, one per line, relates to a single entity,
uniquely identified by an IMDb ID (see Example 5-4).
Example 5-3 Python Code Using the IMDb API Resource to Get the Top 250 Movies
import requests
import json
API_KEY = '[add_user_key]'
URL = "imdb-api.com"
url = f"https://{URL}/en/API/Top250Movies/{API_KEY}"
response = requests.request("GET", url)
pretty_json = json.loads(response.text)
print (json.dumps(pretty_json, indent=2))
Example 5-4 Python Output in JSON Format Showing APIs for the Top 250 Movies
{
"items": [
{
"id": "tt0111161",
"rank": "1", 5
"title": "The Shawshank Redemption",
"fullTitle": "The Shawshank Redemption (1994)",
"year": "1994",
"image": "https://m.media-amazon.com/images/M/MV5BMDFkYTc0MGEtZmNhMC00ZDIzLW-
FmNTEtODM1ZmRlYWMwMWFmXkEyXkFqcGdeQXVyMTMxODk2OTU@._V1_UX128_CR0,3,128,176_AL_.jpg",
"crew": "Frank Darabont (dir.), Tim Robbins, Morgan Freeman",
"imDbRating": "9.2",
"imDbRatingCount": "2435338"
},
{
"id": "tt0068646",
"rank": "2",
"title": "The Godfather",
"fullTitle": "The Godfather (1972)",
"year": "1972",
"image": "https://m.media-amazon.com/images/M/MV5BM2MyNjYxNmUtYTAwNi00MTYx-
LWJmNWYtYzZlODY3ZTk3OTFlXkEyXkFqcGdeQXVyNzkwMjQ5NzM@._V1_UX128_CR0,1,128,176_AL_.
jpg",
"crew": "Francis Ford Coppola (dir.), Marlon Brando, Al Pacino",
"imDbRating": "9.1",
"imDbRatingCount": "1685910"
},
[output shortened for brevity]
Because this output is not very easy to read, you can use a Python library called Tabulate to
put the information into a grid format (see Example 5-5 and Figure 5-6). Tabulate can be
installed with pip.
Example 5-5 Python Code Using the IMDb API Resource to Get the Top 250 Movies
Formatting Output in Table Format
import requests
import json
from tabulate import tabulate
API_KEY = '[add user key]'
URL = "imdb-api.com"
url = f"https://{URL}/en/API/Top250Movies/{API_KEY}"
response = requests.request("GET", url)
response = response.json()
headers = ["ID", "Rank", "Full Title", "Year", "Crew", "IMDb Rating", "Certificate"]
table = list()
for item in response['items']:
    info = [item['id'], item['rank'], item['fullTitle'],
            item['year'], item['crew'], item['imDbRating']]
    table.append(info)
print(tabulate(table, headers, tablefmt="fancy_grid"))
Figure 5-6 The Top 250 Movies Formatting Output in Table Format
■ Internal APIs: Internal APIs are closed from external access and are not accessible by any-
one outside the organization. These APIs often expose information that the company
wishes to keep secure, such as employee details and internal services. An example
of an internal API might be one that provides new services on the company infrastructure,
such as booking meeting rooms or accessing employee directory information. Internal APIs can
be event-driven APIs (or streaming APIs); they are also referred to as asynchronous
APIs or reactive APIs. These APIs do not wait for interaction; instead, they use a
“push architecture.” Like the push model in model-driven telemetry, event-driven API
clients subscribe and then receive event notifications when an event happens or something
changes.
■ External APIs: External APIs are created for external consumption outside the organi-
zation and use by third-party companies, software teams, and developers. An example
is the Spotify API, which is based on the REST architecture. In it, the API endpoints
can return JSON metadata about music artists, albums, and tracks directly from the
Spotify Data Catalogue. Like Internal APIs, external APIs can be asynchronous APIs
or reactive APIs. One example where you might see event-driven APIs is a stock track-
er displaying price changes in stock chat applications. A lot of social media sites use
event-driven APIs to push out the latest content from companies or people you follow
via their platforms.
■ Partner APIs: Partner APIs are a hybrid of internal and external APIs and are designed
for an organization’s partners. This means that the consumer must have some form of
relationship to the organization. Special permissions and security procedures, such as
onboarding registration, are often required to access partner API resources. Cisco’s
API Console is an easy-to-use resource for Cisco partners, customers, and internal
developers (see https://apiconsole.cisco.com/). External and internal users who are not
signed in can only access publicly available documentation, and partners and regis-
tered users can access Cisco data and services.
■ The Cisco API Console, shown in Figure 5-7, allows Cisco partners and customers to
access and consume Cisco data in the cloud.
Figure 5-7 Cisco API Console, Where Cisco Partners and Customers Can Access and
Consume Cisco Data in the Cloud
■ Performance: Providing a library of resources and reusable APIs helps speed up devel-
opment and ongoing production of services.
■ Security: Providing a secure connection and content sharing both inside and outside a
company enables the API to securely expose systems.
■ Engagement: Publishing APIs can lead to software companies and other teams writing
software and services to interact and consume API resources.
■ Monetization: If the company’s API is external facing, the API can be designed to
generate revenue, either directly or indirectly, via subscription or pay-per-service models.
Some common tools for building and developing an API include the following:
■ Postman: Postman is a collaboration platform for API development. Postman enables API-
first development, automated testing, and developer onboarding. It enables you to write,
edit, or import schema formats including RAML, WADL, OpenAPI, and GraphQL. Then
you can generate collections directly from the schema. A feature of Postman is mock serv-
ers, which allow you to generate mock APIs from collection servers to simulate your API
endpoints. Postman can also help create documentation for individual requests and col-
lections. Cisco DevNet provides a number of Postman collections, which you can access
through the Postman workspace (see https://www.postman.com/ciscodevnet).
The Cisco DevNet Postman Collection, shown in Figure 5-8, provides a prebuilt
collection of Cisco APIs.
Figure 5-8 Cisco DevNet Postman Collection’s Prebuilt Collection of Cisco APIs
■ Apigee: Apigee, from Google Cloud, is a cross-cloud API management and testing tool that enables
you to measure and test API performance and supports building APIs with other
editors, such as Swagger. Apigee has several features that enable you to design APIs, cre-
ate API proxies, and visually configure or code API policies as steps in the API flow.
Its security features can enforce security best practices and governance policies. Other
features, such as Publish APIs, Monitor APIs, and Monetize APIs, are also available.
■ Swagger: Swagger is a set of open-source tools for writing REST-based APIs. It enables
you to describe the structure of your APIs so that machines can read them. In Swagger,
you can build and design your APIs according to specification-based standards. You
also can build stable, reusable code for your APIs in almost any language and help
improve the developer experience with interactive API documentation. Swagger can
also perform simple functional tests on your APIs without overhead and provide gover-
nance settings and enforce API style guidelines across your API architecture.
■ Fiddler: The Fiddler Everywhere client can intercept both insecure traffic over HTTP
and secure traffic over HTTPS (administrative privileges are required to capture secure
traffic). The client acts as a man-in-the-middle to capture traffic and can ensure the
correct cookies, headers, and cache directives are conveyed between the client and
the server. Fiddler’s Composer allows you to compose requests to APIs, and you can
organize, group, and test your APIs, which is very helpful when creating and testing
API requests.
■ With remote-procedure calls (RPCs), blocks of code are executed on another server. XML-RPC
uses Extensible Markup Language (XML) to encode commands. In XML-RPC, a client
performs an RPC by sending an HTTP request to a server that implements XML-RPC
and receives the HTTP response. JSON-RPC uses the JSON format to transfer data.
RPCs underpin standard API technologies such as SOAP, GraphQL, and gRPC.
■ GraphQL is a query language. To make it easy to recall, remember that this is what the
QL in the name means. GraphQL prioritizes giving the client exactly the data it asks for
and nothing more. This helps to simplify data aggregation from multiple sources,
allowing the developer to use a single API call and limit the number of requests, as
illustrated in the brief example that follows this list. This capability is very useful in
mobile applications because it can help save battery life and CPU cycles consumed by
applications. GraphQL uses a type system to describe data; the type system defines the
various data types that can be used in a GraphQL application.
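To illustrate the single-call pattern, the following minimal sketch posts a GraphQL query with the Python Requests library. The endpoint URL and the film type and fields are hypothetical placeholders rather than part of any specific GraphQL service; substitute the schema of the service you are actually querying.

import requests

GRAPHQL_URL = "https://example.com/graphql"   # hypothetical endpoint

# One request names exactly the fields the client needs and nothing more.
query = """
query {
  film(id: "tt0075148") {
    title
    year
    rating
  }
}
"""

response = requests.post(GRAPHQL_URL, json={"query": query})
print(response.json())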
NOTE GraphQL was originally created by Facebook in 2012 for internal use and was
made external in 2015. Now GraphQL is an open-source tool with more than 13,000 GitHub
stars and 1,000 GitHub forks. More than 14,500 companies are said to use GraphQL in their
tech stacks to power their mobile apps, websites, and APIs; they include Twitter, Shopify,
Medium, and the New York Times.
A SOAP interface supports the atomicity, consistency, isolation, and durability (ACID)
properties:
■ Durability: If there is a system failure, the completed transaction will remain, and com-
mitted transactions must be fully recoverable in all but the most extreme circumstances.
HTTP/JSON
Where would we be today without the Internet? A high-performing Internet connection is at
the top of the priority list when someone is looking to purchase a new home or to relocate.
This is due to the huge adoption of remote working, hybrid working, social media, online
gaming, and media streaming, typically all running at the same time in the home. A lot
of this information is delivered using HTTP. When you’re making HTTP requests, this format
is used to structure requests and responses for effective communication between a client
and a named host located on a server. The goal is for the client request to access
resources located on the server. When requests are sent, clients can use various
methods for this process. The request process is documented in RFC 2616 as part of Hyper-
text Transfer Protocol, or HTTP/1.1.
Figure 5-9 shows an example of an HTTP GET request and the response.
Figure 5-9 An HTTP GET Request and Response
A request line starts with a method; the method is a one-word command that instructs the
server what it should do with the resource. This could be GET, POST, PUT, PATCH, or
DELETE. The method is followed by the path, which identifies the resource on the server, and
then by the scheme and protocol version (for example, the HTTP version number).
When you open a web browser to https://developer.cisco.com/codeexchange, Example 5-6
shows that the method is a GET request, the path is /codeexchange, and the schema is
https.
Example 5-6 Using https://developer.cisco.com/codeexchange
: method: GET
: path: /codeexchange
: schema: https
The HTTP headers allow the client and the server to exchange additional information and
determine how they communicate. Header names are not case sensitive, header fields separate
the name from the value with a colon, and the key-value pairs are clear-text strings. The HTTP
protocol specifications outline the standard set of HTTP headers and describe how to use them correctly.
The following headers show the accepted language and the amount of time a resource is con-
sidered fresh:
accept-language: en-US,en;q=0.9
cache-control: max-age=0
The Accept-Language request HTTP header advertises which languages the client can under-
stand. If none is found, a 406 error could be sent back. The max-age defines, in seconds, the
length of time it takes for a cached copy of this resource to expire. After the copy expires,
the browser must refresh its version of the resource by sending another request.
The message body is also known as the request body. The message body data is transmitted
in an HTTP transaction message immediately following the headers. The body can carry data
in formats such as JSON or XML and is sent with the request when it is needed to complete
the request. The message body is optional; it is appropriate for some request methods and
unsuitable for others.
NOTE The acronyms URL, for Uniform Resource Locator, and URI, for Uniform Resource
Identifier, are often interchangeable. However, they do have different meanings:
■ A URL is a type of identifier that informs you how you can access a resource—for
example, https, http, or ftp.
■ A URI is an identifier of a specific resource.
■ Simply stated: All URLs are URIs, but not all URIs are URLs. You often hear or
read the two terms used interchangeably, in the same way that some people say jacuzzi
when they mean hot tub.
REST/JSON
REST on its own is not a standard; however, RESTful implementations do make use of stan-
dards, including HTTP, URI, JSON, and XML. Generally, with REST APIs, JSON is the
most popular data format used for request payloads and responses. JSON is a data
representation format just like XML or YAML, which, as you may recall, means YAML Ain’t
Markup Language. JSON is small and lightweight; it can also be embedded directly in
JavaScript because JSON syntax is a subset of JavaScript, and anything written in JSON is
valid JavaScript. JSON is also language agnostic and is easily readable by humans and
machines. A big bonus with JSON is that it’s easy to parse: every programming language
has a library that can parse JSON objects or strings into native data structures or classes.
JSON can represent the following types: strings, numbers, Booleans, null (or nothing), arrays,
and objects. When you’re creating APIs, JSON objects are the most used format because an
object in JSON is built from key-value pairs. The values in those pairs can be any of the
other types, such as strings and/or numbers. Example 5-7 provides an example of a JSON object.
Example 5-7 JSON Object
{
"id": "L_62911817269526195",
"organizationId": "62911817269526195",
"name": "Site Number 1",
"productTypes": [
"appliance",
"switch",
"wireless"
],
"timeZone": "America/Los_Angeles",
"tags": [
"tag1",
"tag2"
],
"enrollmentString": null,
"url": "https://n22.meraki.com/Site-Number-1-ap/n/_VGdgdw/manage/usage/list",
"notes": null
}
A JSON object opens with a curly brace and closes with a curly brace. Within the curly braces are
the key-value pairs that make up the object. For the format to be valid JSON, the key must
be enclosed in double quotation marks and followed by a colon and then the value for that key. In
Example 5-7, there are multiple key-value pairs, so they must be separated by commas. In
this example, “id” is the key, and “L_62911817269526195” is the value; because another
key-value pair follows, the line ends with a comma. Here, the value is a string. The “name”
key also has a string as its value, and “productTypes” has an array as its value.
Finally, you can see that this example ends with a key of “notes” whose value is set to null.
An empty object is written with just an opening and closing brace ({}).
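The following minimal sketch shows how the object from Example 5-7 (shortened here) can be parsed in Python with the standard json library and how each value type maps to a native Python type.

import json

# The object from Example 5-7, shortened and stored as a Python string.
site = '''
{
  "id": "L_62911817269526195",
  "name": "Site Number 1",
  "productTypes": ["appliance", "switch", "wireless"],
  "enrollmentString": null
}
'''

data = json.loads(site)              # parse the JSON string into a Python dict
print(data["name"])                  # string value  -> Site Number 1
print(data["productTypes"][1])       # array value   -> switch
print(data["enrollmentString"])      # null          -> None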
Cache-Control
RESTful APIs are very efficient and can perform very quickly because caching is built into the
REST architectural style. As you saw previously, caching directives are carried in the Cache-Control
header of the HTTP response. With caching, data is returned from a local memory cache instead of
the database having to be queried every time. Caching helps with REST performance
by reducing the number of calls made to an API endpoint and lowering the latency of requests.
The drawback is that the data retrieved could be stale, and debugging stale data often
leads to many problems. The cache constraint requires that the data in a response to a request
be identified as cacheable or noncacheable. Other rules state that lowercase format-
ting is preferred over uppercase or mixed case and that, when there is more than one directive,
the directives should be separated by commas.
RFC 7234 defines the syntax and semantics of the cache-control standard. However, this
RFC does not cover the extended cache-control attributes.
The following snippet shows some of the standard cache-control directives that are used by
a client in an HTTP request:
Cache-Control: no-cache
Cache-Control: no-store
Cache-Control: no-transform
Cache-Control: max-age=<seconds>
Cache-Control: max-stale[=<seconds>]
Cache-Control: min-fresh=<seconds>
Cache-Control: only-if-cached
Cache-response directives, which are used by the server during an HTTP response, include
the following:
Cache-Control: max-age=<seconds>
Cache-Control: s-maxage=<seconds>
Cache-Control: must-revalidate
Cache-Control: no-cache
Cache-Control: no-store
Cache-Control: no-transform
Cache-Control: public
Cache-Control: private
Cache-Control: proxy-revalidate
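As a quick illustration, the following sketch uses the Python Requests library to read the Cache-Control header returned by a server and to send a cache-request directive of its own. Any reachable HTTP endpoint could be used; the URL shown is simply the Code Exchange page used earlier in this chapter.

import requests

# Read the cache-response directives the server returns.
response = requests.get("https://developer.cisco.com/codeexchange")
print(response.headers.get("Cache-Control"))

# Send a cache-request directive of our own with the request.
response = requests.get("https://developer.cisco.com/codeexchange",
                        headers={"Cache-Control": "no-cache"})
print(response.status_code)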
NOTE In JSON-RPC 1.0, id is the request ID. It can be of any type and is used to match the
response with the request that it is replying to.
In JSON-RPC 2.0, id is an identifier established by the client that, if included, must contain a
string, number, or null value. If it is not included, the request is assumed to be a notification.
The value should normally not be null, and numbers should not contain fractional parts.
The following example uses a Cisco Nexus 9000 Series device. (The NX-API CLI supports
show commands, configurations, and Linux Bash. It also supports JSON-RPC.)
cat show_version.json
"kern_uptm_mins": "3",
"kern_uptm_secs": "30",
"rr_reason": "Unknown",
"rr_sys_ver": "",
"rr_service": "",
"plugins": "Core Plugin, Ethernet Plugin",
"manufacturer": "Cisco Systems, Inc.",
"TABLE_package_list": {
"ROW_package_list": {
"package_id": "mtx-openconfig-all-1.0.0.0-9.3.3.lib32_n9000"
}
}
}
},
"id": 1
}
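The following is a minimal sketch of sending a JSON-RPC request to the NX-API interface of a Nexus switch with the Python Requests library. The device address and credentials are placeholders, and the sketch assumes NX-API has been enabled on the device (feature nxapi); adjust these details for your own lab.

import json
import requests

# Placeholders: replace with the address and credentials of a lab device that
# has NX-API enabled. verify=False accepts the self-signed certificate
# commonly found on lab switches.
NXAPI_URL = "https://<switch-ip>/ins"
USERNAME = "<USERNAME>"
PASSWORD = "<PASSWORD>"

headers = {"Content-Type": "application/json-rpc"}

# A JSON-RPC 2.0 request: the method, its params, and the id used to match
# the response to the request.
payload = [
    {
        "jsonrpc": "2.0",
        "method": "cli",
        "params": {"cmd": "show version", "version": 1},
        "id": 1,
    }
]

response = requests.post(NXAPI_URL, data=json.dumps(payload),
                         headers=headers, auth=(USERNAME, PASSWORD),
                         verify=False)
print(json.dumps(response.json(), indent=2))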
Like REST, RPC also supports additional formats, including XML-RPC and Protocol Buffers
(Protobuf). Protobuf is similar to the data serialization formats JSON and XML. Whereas
JSON and XML are quite friendly to human eyes, Protobuf messages are not because they are
encoded in binary.
gRPC
You are likely familiar with the acronym gRPC if you have used or investigated model-driven
telemetry. Originally designed by Google, today Google Remote Procedure Call is a free
open-source project with an open spec and roadmap. Many Cisco platforms, such as Cisco
Nexus switches, introduced telemetry over gRPC using a Cisco proprietary gRPC agent from
NX-OS Release 7.x.
gRPC can be used as a framework for working with remote-procedure calls. This allows you
to write code as if it was designed to run on your computer, even though what you may have
written will be executed elsewhere.
gRPC is built on HTTP/2 as a transport. The main goals for using HTTP/2 are to improve
performance, enable full request and response multiplexing so that multiple requests can be
initiated in parallel over a single TCP connection, minimize protocol overhead via efficient
compression of HTTP header fields, and add support for request prioritization and server push.
By default, gRPC uses the Protocol Buffers (Protobuf) Interface Definition Language (IDL)
for describing both the service interface and the structure of the payload messages. Protobuf
is an open-source mechanism for serializing structured data into an efficient binary
(machine-readable) encoding; it is used to exchange messages between services rather than
through a web browser, unlike REST. By using binary encoding rather than text, the payload is
kept compact and very efficient.
The latest version of Protocol Buffer is proto3, and the advantage of the Protobuf IDL is that
it enables gRPC to be completely language and platform agnostic, supporting code genera-
tion in Java, C++, Python, Java Lite, Ruby, JavaScript, Objective-C, and C#. Example 5-9
provides an example.
syntax = "proto3";
message House {
int32 id = 1;
string name = 2;
float cost = 3;
}
message Houses {
repeated House houses = 1;
}
To start a .proto file, you need to specify which version of the syntax you are using. In
Example 5-9, version three is defined. The House message definition specifies three fields—
one for each piece of data that you want to include in this type of message. Each
field has a name and a type. This example describes a single house. To represent a list of
houses, the Houses message uses the repeated keyword, which means the field can appear
any number of times (including zero) in a well-formed message.
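As a brief sketch of how the generated code is used, the following Python example assumes the definition in Example 5-9 was saved as house.proto and compiled with protoc --python_out=. house.proto, which produces a house_pb2 module (the protobuf package must also be installed with pip).

import house_pb2   # generated by: protoc --python_out=. house.proto

house = house_pb2.House(id=1, name="Cape Cod", cost=325000.0)

houses = house_pb2.Houses()
houses.houses.append(house)          # the repeated field behaves like a list

data = houses.SerializeToString()    # compact binary encoding, not text
print(len(data), "bytes on the wire")

decoded = house_pb2.Houses()
decoded.ParseFromString(data)        # rebuild the message from the bytes
print(decoded.houses[0].name)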
OpenAPI/Swagger
The OpenAPI Specification (OAS) was originally known as the Swagger Specification.
OpenAPI documents are both machine- and human-readable, which enables you to easily
determine how to consume and visualize RESTful web services. There is, however, a
difference between OpenAPI and Swagger.
The OAS is an API specification defined and maintained by the OpenAPI Initiative. The Swagger
Specification was donated by SmartBear Software to the Linux Foundation under the OpenAPI
Initiative in 2015.
OAS sets a standard, programming-language-agnostic interface description for HTTP APIs.
This allows for both machines and humans to discover and comprehend the capabilities of an
API service without the need to access the source code, any additional documentation, or an
inspection of network traffic (tools such as Fiddler and Developer Tools in Chrome are often
used to do this). One benefit of using OpenAPI is that it helps when you’re designing an API.
OAS can use either JSON or YAML file formats for an API to provide these functions:
■ Defining the API’s endpoints and available operations (GET, POST, PUT, PATCH,
DELETE).
■ Outlining the parameters required for the input and output of each operation.
OpenAPI 3.0 distinguishes between parameter types—path, query, header, and cookie.
■ Describing APIs protected using security schemes such as Basic and Bearer.
Because OpenAPI is language agnostic, you can define an API in generic terms so that feed-
back can be given before anything is implemented and added to the design. OpenAPI
tooling can generate server-side stubs to build the API and allows client SDKs to be generated
by clients or third-party systems. Several companies use OpenAPI, including Atlassian, Mule-
Soft, Netflix, and PayPal.
Swagger is built by SmartBear Software, the leader in quality software tools for teams. Here’s
a great way to think about Swagger: say you have built an API, but without documentation
and accessibility, your API might not be consumable. One recurring piece of feedback
from developers is that API documentation isn’t very good, and this is where Swagger enters
the picture. It helps you build, document, test, and consume RESTful web services. It can
do this from the source code by requesting that the API return a documentation file generated
from its annotations. In short, Swagger can take the code written for an API and generate the
documentation for you.
Swagger has several services and tools:
■ Swagger Editor: With the Swagger Editor, you can design, describe, and document
new and existing APIs in JSON or YAML formats within a browser and preview your
documentation in real time. The Swagger Editor is open source, and you can contrib-
ute to the project via GitHub.
■ Swagger UI: The user interface can use an existing JSON or YAML document and
make it fully interactive. The tool can arrange the RESTful methods (such as GET,
PUT, POST, and DELETE) categorizing each operation. Each of these methods is
expandable, and once it is expanded, a full list of parameters with their corresponding
examples can be provided.
■ Swagger Codegen: Swagger Codegen helps speed up the build process by generating
server stubs (stubs can be used for quick prototyping and mocking). For example, if you
want to provide an SDK along with your API, Swagger Codegen would be a great tool to
help implement it.
■ Client/server
■ Stateless
■ Cacheable
■ Uniform interface
■ Layered system
RESTful API communication between the client and the server is stateless, or nonpersis-
tent. This means that when the client sends data to the server, it must send all the data the
server needs to process the request. For example, if the required data is not sent or is incor-
rect, the server returns an error to the client. When the session is closed out, the server does
not retain the session or any information about the request that was made to it.
NOTE HTTP can also use a persistent connection, referred to as HTTP keep-alive. Persis-
tent connections improve network performance because a new connection does not have to
be established for each request.
To separate the client and server, RESTful APIs use a uniform interface. The uniform inter-
face provides a mapping to resources. For example, network API resources could be hostnames,
routing information, or interface details/state.
There are four main guidelines for uniform interfaces:
■ Hypermedia as the Engine of Application State (HATEOAS): Clients deliver the state
via the body content, query-string parameters, request headers, and the request URI.
Services deliver state, which also needs to include links for each response so that cli-
ents can discover other resources easily.
NETCONF APIs
For many years, network devices such as routers, switches, and firewalls were configured
using the CLI, but this model did not scale and was subject to human error. The NETCONF
protocol was developed and standardized by the Internet Engineering Task Force (IETF). It
was developed in the NETCONF working group and published in December 2006 as RFC
4741 and later revised in June 2011 and published as RFC 6241. Looking to make network
devices more programmable, NETCONF was created to address configuration management,
gathering configuration and operation details from network devices.
Operational data is read-only. In this use case, NETCONF could be used to gather the same
information that would be gathered by issuing show commands. For example, say an engi-
neer is looking to gather information from a number of devices as part of an audit of
the access lists to ensure the network devices meet security compliance requirements.
Configuration data, by contrast, requires a change to be made on the network devices. To
compare this to the CLI, this would be when the engineer enters configuration mode
(configure terminal) to make changes to the network devices. For example, after the engineer
has completed the audit of the access lists, several updates and new requirements are needed
on the network devices to prevent a breach in security.
The NETCONF API can use two types of YANG models—open and native. The NETCONF
protocol uses (typically) XML-encoded YANG-modeled data.
■ Open models allow commonality across devices and platforms. An open configuration
promotes a vendor-neutral model for network management that uses YANG, a data
modeling language for data sent over via the NETCONF protocol. Open YANG mod-
els are developed by vendors and standards bodies. Open is further broken down into
open standard and open source.
■ Native models are developed and maintained by each vendor. They are designed to
integrate to features or configuration relevant only to that vendor’s platform.
NETCONF works in a client/server model. A common example seen today is Python
or Ansible acting as the client and a router acting as the server. NETCONF encodes its
messages in XML (JSON encoding is used with RESTCONF rather than NETCONF). A NETCONF
message is based on RPC communication, which allows the XML message to be
transport independent. The most often used transport is Secure Shell version 2 (SSHv2).
Table 5-4 shows the main differences between RESTful and NETCONF APIs.
NETCONF also can use different device datastores: running, startup, and candidate.
If you utilize the candidate configuration datastore, you can load device changes into the
candidate datastore without impacting the running or startup configuration/datastore. Only
when the configuration is committed are the changes applied to the device
and written to the running configuration. This capability can also be very useful for rolling back
changes because the device keeps past candidate configurations. (Cisco IOS XE and IOS XR
keep 10 candidate configuration files by default; this number can be expanded if required and
is platform specific.)
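The following minimal sketch shows this candidate-and-commit workflow from Python using the ncclient NETCONF library. The device details are placeholders, the device must advertise the :candidate capability, and the interface configuration payload is only an illustration built on the open ietf-interfaces model.

from ncclient import manager

# Placeholders: the device must be reachable over NETCONF (usually port 830)
# and must support the candidate datastore for this workflow.
DEVICE = {
    "host": "<device-ip>",
    "port": 830,
    "username": "<USERNAME>",
    "password": "<PASSWORD>",
    "hostkey_verify": False,
}

# Illustrative XML payload based on the open ietf-interfaces YANG model.
CONFIG = """
<config>
  <interfaces xmlns="urn:ietf:params:xml:ns:yang:ietf-interfaces">
    <interface>
      <name>GigabitEthernet1</name>
      <description>Configured over NETCONF</description>
    </interface>
  </interfaces>
</config>
"""

with manager.connect(**DEVICE) as m:
    # Stage the change in the candidate datastore; running is untouched.
    m.edit_config(target="candidate", config=CONFIG)
    # Only the commit applies the change to the running configuration.
    m.commit()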
Figure 5-12 shows the operational workflows and datastore when using NETCONF.
Figure 5-12 The Operational Workflows and Datastore When Using NETCONF
References
URL
https://apiconsole.cisco.com/
https://github.com/public-apis/public-apis
https://imdb-api.com/Identity/Account/Register
https://www.postman.com/ciscodevnet
https://www.ics.uci.edu/~fielding/pubs/dissertation/top.html
API Development
■ API Design Considerations: This section identifies the challenges that need to be
addressed when designing an API and considers what functionality is required by con-
sumers or customers of the API.
This chapter maps to the Developing Applications Using Cisco Core Platforms and APIs
v1.0 (350-901) Exam Blueprint Section 2.0, “Using APIs.”
Application programming interface (API) development covers both API design and API
architecture. Whereas API design focuses on the API itself, API architecture is the entire
solution, which includes the back end or infrastructure that the API provides access to. APIs
are designed for developers; end users do not often see the API itself. When creating an API,
developers should ensure that the API is clean and reusable. Being consistent in both use for
consumption and in documentation ensures that developers will continue to use and con-
sume an API. Providing an API client/software development kit (SDK) assists developers
by speeding up adoption and integration into other applications and services, no matter the
language that the developers have chosen to use.
There are two main methods of building APIs: inside-out and outside-in. Both methods have
advantages and disadvantages. This design selection depends on what stage a company or
team is in its API development. All APIs should be built with security at the forefront of the
design. One of the first steps is API authentication; this process ensures the security of the
company’s API, prevents attacks and data breaches, and safeguards critical services by iden-
tifying and authorizing clients.
There are several API authorization methods, and as part of the API development process,
the first step in securing an API is to accept only queries sent over a secure channel, such
as Transport Layer Security (TLS), the successor to the now-deprecated Secure Sock-
ets Layer (SSL). Security methods on top of this include basic authentication, API keys or
tokens, and OAuth. Some APIs also use rate limiting and error handling. This additional
layer of protection helps by throttling API requests, using algorithm-based rate limiting,
and protecting an API from distributed denial-of-service (DDoS) attacks and other malicious
abuse. When an API is secure and performs well, developers stay happy, and their connections,
applications, and overall experience are better too.
Pagination and caching can help the performance of an API. Pagination can help break
down the data and resources into small chunks. Depending on the API being developed, the
1. What does a software development kit (SDK) help speed up and improve?
a. Asynchronous API performance
b. Adoption and developer experience of an API
c. Adoption and modification of an API
d. Adoption of API documentation
2. What is the OpenAPI Specification (OAS)?
a. OAS defines a standard, programming language–agnostic interface description for
all APIs.
b. OAS defines a standard, programming language–agnostic interface description for
GraphQL.
c. OAS defines a standard, programming language–agnostic interface description for
REST APIs.
d. OAS defines a standard, programming language–agnostic interface description for
REST and RPC API.
3. An API client allows the client to abstract which of the following? (Choose two.)
a. Authentication
b. Resources
c. Headers
d. Parameters
Foundation Topics
Creating API Clients
An API client or API software development kit (SDK) can help speed up and improve the
adoption and developer experience of an API. (The term devkit is also used when referring
to an API or SDK.) API clients are not limited to being created by API owners or companies.
Consumers who frequently use APIs as part of their daily workflow often create API clients
if there isn’t an official API client library available or the ones that are available do not meet
their requirements. Because most REST APIs use HTTP request methods, an API client
allows the details of the API, such as authentication, resources, and error handling, to be
abstracted from the end developer no matter which programming language is used. Providing
clients in multiple programming languages removes the need for developers to write a custom
implementation of the API in whichever language they happen to be using.
When creating an API client, teams should adhere to several best practices. Following best
practices ensures adoption of an API client if it is to be shared with other teams or outside
the team, perhaps to customers. The API client should be automatically generated from an
API definition such as the OpenAPI Specification (OAS). This specification helps define,
among other things, the API endpoints and authentication processes. It simplifies updates to
the API client and removes the burden of updating API endpoints manually. Automating
this process can also produce a changelog, providing a documentation trail when a new
API version is released. An out-of-date API client can lead to frustration, broken pipelines,
and in the worst case, outages or loss of service.
Code Generation Client API Libraries for IMDb
When you want to start building an SDK using OAS, Swagger provides a number of tools
implementing the OpenAPI specification. One of these tools is SwaggerHub. SwaggerHub
provides all the tools to be able to generate client SDKs for APIs in many languages. Once
built, the SDK contains wrapper classes that can used to call the API from code, such as
Python or an application, without having to use HTTP requests and responses.
To start, open a web browser and navigate to https://swagger.io/tools/swaggerhub/. There,
you can create a free 14-day trial account. (You can create a free personal account in Swag-
gerHub using an email address or a GitHub ID, as shown in Figure 6-1.) When creating an
account, enter the required developer name and password account details. You can skip the
organization sections. After you complete this step and set up the account, go to My Hub to
start using SwaggerHub.
Figure 6-2 Creating an API Client by Clicking the Create API Button
The Import API box then appears (see Figure 6-3). In this box, fill in the fields, starting
with the IMDb-API URL in the Path field; in this case, use https://imdb-api.com/swagger/
IMDb-API/swagger.json. Then enter the owner name (this is your account and should be
prepopulated; if it is not, select your owner name from the list). For this example, you do not
need to publish the API client. Leave the Visibility field set as Private. When these fields are
complete, click the Import button.
Figure 6-3 Inserting the IMDb-API URL to Begin Creating the SDK
SwaggerHub checks the imported URL file to ensure it is a valid JSON file that can be con-
verted into YAML format. The name and version can be left as the defaults; leaving them
ensures that any future versions of the IMDb-API are shown and displayed as separate ver-
sions. Providing a self-describing file such as JSON helps create a CHANGELOG for the API
client, which is easy to parse.
To import the new API, click the Import OpenAPI button, as shown in Figure 6-4.
Figure 6-6 SwaggerHub Features Used to Build API Content into Client SDKs for APIs in
Many Languages
After you open the zipped file, there are several options to use the new IMDb SDK. For the
sample API client, treat this as a published SDK for a test developer to be able to install and
use. Start by creating a new GitHub repository. In this example (see Figure 6-7 and Figure
6-8), a private repository called python-client-generated is created, but feel free to give the
repository a suitable name of your choice.
Figure 6-7 Creating a New Public or Private GitHub Repository for the IMDb SDK CLI
Figure 6-8 Creating a New Public or Private GitHub Repository for the IMDb SDK
No URL or endpoint is defined in the newly built SDK; therefore, the default base URL in
the SDK code needs to be updated to https://imdb-api.com/en before the SDK will work with
the IMDb API database. Because this SDK has been designed to work with the IMDb API,
you can hard-code the IMDB API URL into the back-end code. Not all SDKs have hard-
coded endpoints; some SDKs allow the use of environment files or the importing of environ-
ment details, so the SDK can be used on different systems. For example, the Cisco DNAC
SDK at https://github.com/CiscoDevNet/DNAC-Python-SDK allows you to specify the
IP/FQDN, username, and password at the command line.
To change the endpoint of the SDK, open the SDK folder that was downloaded. Within the
SDK folder, open the swagger_client folder and navigate to the configuration.py file. Next,
open the configuration.py file in your code editor (such as VSCode/Atom); then update line
49 with the URL https://imdb-api.com/en. When this is done, save and close this file.
Next, initialize the new repository. Then add, commit, and push the IMDb SDK files as
shown in Example 6-1.
Example 6-1 Initializing the New Repository, and Adding, Committing, and Pushing
the IMDb SDK Files
git init
git add *
git commit -m 'first push'
git remote add origin [email protected]:[username]/python-client-generated.git
git branch -M main
git push -u origin main
If you head back to the newly created GitHub repository (see Figure 6-9), you will find a
fully documented guide for using the new IMDb SDK with complete README, Installation,
and Getting Started documentation for API endpoints.
Figure 6-9 A Fully Documented Guide for Using the New IMDb SDK
Now that you’ve cloned the IMDb Python SDK from GitHub to a local machine, you can
test it out within the Python REPL. (REPL stands for Read Evaluate Print Loop, the name
given to the interactive Python prompt.)
At this point, you need to install the newly created IMDb SDK. The first step is installing the
requirements.txt file via pip (see Example 6-2). The requirements.txt file includes the follow-
ing Python libraries, which are used by the SDK:
■ Certifi provides Mozilla’s carefully curated collection of root certificates for validating
the trustworthiness of SSL certificates while verifying the identity of TLS hosts. (It has
been extracted from the Requests project.)
■ Six is a Python 2 and 3 compatibility library. It provides utility functions for smooth-
ing over the differences between the Python versions with the goal of writing Python
code that is compatible on both Python versions.
■ The dateutil module provides powerful extensions to the standard datetime module,
available in Python.
■ Setuptools is a fully featured, actively maintained, and stable library designed to facili-
tate packaging Python projects.
After you install the requirements file and run setup.py, you can import the swagger client
and run it. The dir() method returns a list of valid attributes of the object, as shown in
Example 6-3.
Example 6-3 Installing the IMDb SDK in a Python Environment
The SDK also provides examples in Python; they were prebuilt when the SDK was created
within SwaggerHub. You can use the example to test the SDK and provide results from the
IMDb API without having to write a lot of code. This capability is very useful and important
for an SDK; when this SDK is shared, the developer who wishes to use the SDK can look at
the examples and quickly start to build code to consume the API resources.
For example, to test this newly built SDK, in the GitHub README file for the SDK, navigate
to the API a_pi_top250_movies_api_key_get (see Figure 6-10).
Figure 6-10 How to Use the Python IMDb SDK in the IMDb SDK README
Documentation
The README has a complete list of all the API endpoints and how a developer can use this
SDK. By clicking the selected API, you can open and use a code example and use case of the
API endpoint (see Figure 6-11).
Figure 6-11 URL Links to Each API from the IMDb Swagger Documentation
You can copy the code for the API endpoint API/Top250Movies in Example 6-4 into the
Python REPL or into a code editor. The API client documentation shows that a developer's
IMDb API key is required in string format. If the SDK and code are run without an IMDb
API key, an error log is shown:
After the Python script is executed, it retrieves the API endpoint data and prints out the top
250 movies of all time from the IMDb endpoint in JSON format (see Example 6-5). When
the code runs, it checks for any errors being returned from the API. In this example, the
code completed without error. Because the output is quite verbose, you can save it to a file
locally to review it or look for errors. You can do this quickly using output redirection on the
command line: the > symbol creates a new file if one is not present, or it overwrites the file
if one already exists. The file can then be opened in a code or text editor.
You can use the redirect feature by providing the following path:
{'error_message': '',
'items': [{'crew': 'Frank Darabont (dir.), Tim Robbins, Morgan Freeman',
'full_title': 'The Shawshank Redemption (1994)',
'id': 'tt0111161',
'im_db_rating': '9.2',
'im_db_rating_count': '2465589',
'image': 'https://m.media-amazon.com/images/M/MV5BMDFkYTc0MGEtZmNhMC00Z-
DIzLWFmNTEtODM1ZmRlYWMwMWFmXkEyXkFqcGdeQXVyMTMxODk2OTU@._V1_UX128_CR0,3,128,176_AL_.
jpg',
'rank': '1',
'title': 'The Shawshank Redemption',
'year': '1994'},
Two of the most popular Python libraries for creating a CLI wrapper are argparse and Click.
Argparse has been part of the standard Python library since Python 2.7, where it replaced
optparse. Arguments can generate different actions; for example, add_argument() is used to
define an argument, and the default action is simply to store the argument value. Additional
supported actions include storing the argument as part of a list and storing a constant value
when the argument is encountered, such as handling true/false values for Boolean flags, as
illustrated in the brief sketch that follows.
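The following minimal argparse sketch illustrates these actions; the argument names are only examples and would map to whichever API calls your own client exposes.

import argparse

parser = argparse.ArgumentParser(description="IMDb SDK command-line helper")

# The default action, "store", keeps a single argument value.
parser.add_argument("--api-key", required=True, help="IMDb API key")

# store_true stores the constant True when the Boolean flag is present.
parser.add_argument("--top250", action="store_true",
                    help="fetch the Top 250 movies")

# append collects a repeatable argument into a list.
parser.add_argument("--title", action="append",
                    help="title ID to look up (repeatable)")

args = parser.parse_args()
print(args.api_key, args.top250, args.title)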
The Python Click module is used to create CLI wrappers also. Click is an alternative to the
standard optparse and argparse modules. Click allows arbitrary nesting of commands and
automatic help page generation; it can also support the loading of subcommands at run-
time. Click uses a slightly different approach than argparse because it uses the notion of
decorators. These commands need to be functions that can be wrapped using decorators.
The creator of the Click library wrote the “why of click” at https://click.palletsprojects.com/
en/8.0.x/why/.
In Example 6-6, you can add three IMDb endpoints from the SDK code: top250, box_
office, and in_theaters. Click provides a command-line parameter to call each of these API
endpoints using the IMDb SDK. The beginning of the code is the same as in the previous
top250 example. Here, you add the import click statement so that the module is available
to the code. Next, under the api_key, you create a click.group() on a function named cli
whose body is simply pass. Keeping the group function empty allows multiple subcommands
to be attached later in the code for additional functionality.
Example 6-6 Importing the Click Library and Creating a Click Group
import click

@click.group()
def cli():
    pass
Each API call is created as a function and adds the @click.command() decorator to it. In Click,
commands are the basic building blocks of command-line interfaces. Example 6-7 shows how
to create a new command and use the decorated function as a callback. The function, named
get_250, is passed to the command. In most examples of CLI helpers, the command is often
named after the function or API call to help describe the action being performed.
Example 6-7 Creating a New Command for the Function Using Click
@click.command()
def get_250():
    try:
        api_response = api_instance.a_pi_top250_movies_api_key_get(api_key)
        pprint(api_response)
    except ApiException as e:
        print("Exception when calling MoviesApiApi->a_pi_top250_movies_api_key_get: %s\n" % e)
To make this command callable from the CLI, you attach the command to another command of
type group; this is what enables Click’s arbitrary nesting of commands. For this example, the
function name get_250 is used:
cli.add_command(get_250)
The great part about Click is being able to add additional API endpoints or functions with-
out refactoring a lot of code. In this case, it is simple to add additional functions and Click
commands to extend the flexibility of code. To complete this and add additional features,
you add the next two API endpoints: box_office and in_theaters. Click automatically converts
any underscore (_) in a function name into a hyphen (-) in the command name presented at
the CLI (see Example 6-8).
Example 6-8 Underscores in Function Names Automatically Converted to Hyphens in Command Names
def in_theaters():
    """
    Show those movies still in theaters.
    """

in-theaters    Show those movies still in theaters
Before running the code, you can look at the optional arguments now built into the code.
Click has a great built-in feature that automatically builds a help function into the commands
(see Example 6-9). This feature provides details to which argument can be run at the com-
mand line. Additional information can also be added for each command as an overview of
what each argument does or the expected results (see Example 6-10).
Example 6-9 Click’s Built-In Help Function
@click.group()
def cli():
    pass


@click.command()
def get_250():
    try:
        api_response = api_instance.a_pi_top250_movies_api_key_get(api_key)
        pprint(api_response)
    except ApiException as e:
        print("Exception when calling MoviesApiApi->a_pi_top250_movies_api_key_get: %s\n" % e)


@click.command()
def get_box_office():
    try:
        api_response = api_instance.a_pi_box_office_api_key_get(api_key)
        pprint(api_response)
    except ApiException as e:
        print("Exception when calling MoviesApiApi->a_pi_box_office_api_key_get: %s\n" % e)


@click.command()
def get_theaters():
    try:
        api_response = api_instance.a_pi_in_theaters_api_key_get(api_key)
        pprint(api_response)
    except ApiException as e:
        print("Exception when calling MoviesApiApi->a_pi_in_theaters_api_key_get: %s\n" % e)


cli.add_command(get_250)
cli.add_command(get_box_office)
cli.add_command(get_theaters)

if __name__ == "__main__":
    cli()
When running the complete code, you can pass an argument after the filename from the
command list. Passing no argument shows the help page. Each argument runs only the API
endpoint call linked to the click command.
Adding the get-theaters argument to the command line calls the IMDb API endpoint for
in_theaters and returns the results in JSON format:
infrastructure your team owns and manages, these are still your API customers. Likewise, if
you build an API for external customers, your company is providing an API service that
shapes their experience. Often the best API design comes from asking the customers or
consumers what they want from the API, such as workflow integrations and alerting/monitoring,
and what applications will be accessing the API.
A lot of enterprise services are already up and running, and an API is often built on top of
this fully functioning infrastructure. This method is called API inside-out design. This
type of design has a number of advantages because it allows the design to be modeled on the
current infrastructure and services. The API resembles normal operations, but many of the
interactions are shifted so that they are performed via the API. This method allows API developers
to select the most common workflows and requests and work on them first, which can
expose problems such as security flaws or performance issues early. One downside to
this approach is that you do not have access to customer/developer feedback, which can
lead to wasted code cycles and to over-replicating the functionality of back-end systems.
Often, in an inside-out approach, if the API is designed or built during an ongoing project,
the API is thought of as being bolted on. In most cases, this results in a bad developer
experience and a low standard of API design. A good developer experience and a high
standard are the main rewards of API outside-in design, also known as the API-first
approach. This design considers what functionality is required by the consumers or
customers of the API and what they will be asking for. It also covers what is important to
the end developers—for example, which features they need and how easily the developers
or consumers can integrate their applications and workflows. An outside-in API design
allows for a lot more elasticity, covering more use cases in single API calls, thus leading to
fewer API requests and calls being made by consumers and less traffic on the wire and
back-end infrastructure for the provider. Outside-in API design leads to a more elegant API
and one that a developer or consumer genuinely wants to use.
Good REST API architecture and design follow the HTTP methods GET, PUT, PATCH,
POST, and DELETE, as described in Chapter 5, “Network APIs.” Most APIs operate over
HTTP, which provides a solid base on which to design these APIs. When you’re naming
resources, either singular or plural is acceptable as long as resource names are kept consis-
tent across all APIs. The same goes for the use of capitalization.
Singular API naming resource:
/data/device
Plural API naming resource:
/data/devices
HTTP methods such as GET and PUT should not be included in endpoints. The endpoint
should contain only nouns because the API call is transferring state, not processing instruc-
tions. For example, avoid an endpoint such as this:
/data/getDevice
For further reading on API design, consider Roy Fielding’s 2000 dissertation, “Architectural
Styles and the Design of Network-based Software Architectures,” which introduced REST
(https://www.ics.uci.edu/~fielding/pubs/dissertation/fielding_dissertation_2up.pdf). Also,
see RFC 3986, “Uniform Resource Identifier (URI): Generic Syntax,” by Tim Berners-Lee,
Roy T. Fielding, and L. Masinter (http://www.rfc-editor.org/rfc/rfc3986.txt).
headers = {
'Content-Type': 'application/yang-data+json',
'Accept': 'application/yang-data+json',
'Authorization': 'Basic YWRtaW46Q2lzY28xMjM='
}
If you were to copy and paste the Base64 string into a decoder tool, you would see that the
encoded developername:password is admin:Cisco123. Because Base64 is an encoding, not
encryption, it is highly recommended that you employ additional security mechanisms
such as HTTPS/TLS when using basic authentication.
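The following short sketch shows how that Base64 value can be produced and decoded with Python's standard base64 module.

import base64

credentials = "admin:Cisco123"

# Encode the username:password pair for the Authorization header...
encoded = base64.b64encode(credentials.encode()).decode()
print(encoded)                                    # YWRtaW46Q2lzY28xMjM=

# ...and decode the header value back into clear text.
print(base64.b64decode("YWRtaW46Q2lzY28xMjM=").decode())   # admin:Cisco123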
Bearer authentication (token authentication) depends on how the API is defined. For exam-
ple, basic authentication requires the request to send a developer name and password or
other identifying credentials to the server; again, the developer account is set up on the server.
This is often why bearer authentication is considered an extension of basic authentication.
After the initial request is made via the credentials (typically, an HTTP POST request), the
request asks for a token to be returned. Further API calls will leverage the token; there is
no further requirement to send the developer credentials for subsequent API calls, only the
token itself. Hence, the word bearer implies the developer has a ticket (token), and if it is
valid, access is permitted. Tokens can be short-lived and last a few hours, or they can be
long-lived. Some tokens can be refreshed and might last a week, or they can be nonrefreshable
and nonexpiring. There is no fixed design when it comes to a token’s lifetime or expiry when
designing an API. Companies choose the option or mixture of options that best meets their
API design requirements.
A bearer token is an opaque string that is generated by the server in response to a developer
login request and is not intended to have any meaning to clients using it. The client must send
this token in the authorization header when making requests to protected resources. The
bearer token can have a lifetime too; for example, Cisco DNA Center’s API uses token-based
authentication and HTTPS basic authentication to generate an authentication cookie and
security token that are used to authorize subsequent requests. By issuing an HTTP POST
request API call, Cisco DNAC returns a token in the response body:
{"Token":"<token>"}
As with the basic authentication method, additional security mechanisms such as HTTPS/TLS
should be employed.
As you saw with the IMDb API example, API keys are another method of API security.
Developers can request an API key. However, the API keys don’t identify developers; they
identify projects.
The key is sent in the query string, in the request header, or as a cookie. Like a password,
the key is meant to be a secret that is known only between the client and server. API keys
are not considered secure; they are accessible to clients, making it easy for someone to steal.
When an API key is stolen, it generally does not expire, making the API key reusable indefi-
nitely, unless the API key is revoked or a new key is regenerated.
Best practice suggests that API keys are least secure when sent in a query string; it is
strongly suggested that API keys be sent in the authorization header instead.
API keys should be considered for authentication if an API needs to block unidenti-
fied traffic or to limit or rate limit the number of calls made to the API; logging each key
lets the API filter and meter its callers. The IMDb API does this, limiting the free account
to 100 calls per day, although such a limit could just as easily be per hour or per minute. As
with the basic authentication method, additional security mechanisms such as HTTPS/TLS
should be employed when using API keys.
Cookie authentication uses HTTP cookies to authenticate client requests and maintain
session information on the server over the stateless HTTP protocol. The client sends a
developer name and password, as you saw with basic and bearer authentication. The devel-
oper account is set up on the server to establish a session. When it is successful, the server
response includes the Set-Cookie header that contains the cookie name, value, and expiry
time. With every subsequent client request to the server, the client sends back all previously
stored cookies to the server using the cookie header.
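The following minimal sketch shows this behavior with a Python Requests Session, which stores the cookie delivered in Set-Cookie and replays it automatically; the login URL and form fields are hypothetical.

import requests

# Hypothetical login URL and form fields; the Session object captures the
# cookie from the Set-Cookie response header.
session = requests.Session()
session.post("https://example.com/login",
             data={"username": "<USERNAME>", "password": "<PASSWORD>"})

print(session.cookies.get_dict())   # cookies captured from Set-Cookie

# Subsequent requests on the same Session send the stored cookies back
# to the server in the cookie header.
response = session.get("https://example.com/api/resource")
print(response.status_code)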
Cookie authentication is vulnerable to cross-site request forgery (CSRF) attacks, so using
CSRF tokens for protection is recommended. This feature adds protection against CSRFs
that occur when using REST APIs. This protection is provided by including a CSRF token
with API requests. You can put requests on an allowed list so that they do not require pro-
tection if needed.
The last authentication method to discuss is OAuth 2.0, which is the industry standard.
OAuth 2.0 provides limited, delegated access to a client. The OAuth processes for obtaining
an access token or refresh token are called flows (or grant types). The flows allow the resource
owner to share protected content from the resource server without having to share credentials.
Several popular flows are suitable for different types of API clients, and OAuth 2.0 defines
four roles that take part in them:
■ Client: The client is the application requesting access to a protected resource on behalf
of the resource owner. This access must be authorized by the developer, and the
authorization must be validated by the API.
■ Resource owner: Typically, the resource owner is the developer who authorizes an
application to access their account.
■ Resource server: The resource server hosts the protected resources. This is the API
you want to access.
■ Authorization server: The authorization server verifies, identifies, and authenticates
the resource owner and then issues access tokens to the application.
The two token artifacts used in OAuth 2.0 implementations are access tokens and refresh tokens. A designer might choose to use one or both:
■ Access token: The server issues a token to clients, much like an API key, and then it
allows the client to access resources. However, unlike an API key, an access token can
expire. After the token expires, the client has to request a new one.
■ Refresh token: OAuth 2.0 introduced an artifact called a refresh token, which is an optional part of an OAuth flow. When a new access token is required, an application can make a POST request back to the token endpoint using the grant type refresh_token. This request allows an application to obtain a new access token without having to prompt the developer, as sketched in the example that follows.
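A minimal sketch of the refresh_token grant follows; the authorization server URL, client ID, and client secret are hypothetical, and public clients may omit the secret entirely:

import requests

TOKEN_URL = "https://auth.example.com/oauth2/token"  # hypothetical authorization server

payload = {
    "grant_type": "refresh_token",
    "refresh_token": "stored-refresh-token",
    "client_id": "my-client-id",
    "client_secret": "my-client-secret",
}
resp = requests.post(TOKEN_URL, data=payload, timeout=10)
resp.raise_for_status()
tokens = resp.json()
access_token = tokens["access_token"]
# Some servers rotate refresh tokens; store the new one if it is returned.
refresh_token = tokens.get("refresh_token", payload["refresh_token"])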
Beyond authentication and token management, another important consideration in API design is pagination. The two schemes most commonly encountered are offset-based pagination and cursor-based pagination, both covered here.
If a developer is working with smaller subsets of data, pagination improves the API response time: rather than returning everything in a single response, the API returns the data across multiple replies. This also prevents huge volumes of data from traveling over the network, so pagination helps conserve bandwidth and memory and generally improves the end developer experience. When designing an API to support pagination, the most common method is to use a predetermined page size. Defining the right amount of data to return per page is normally based on several factors, such as usage, capacity, and end developer experience. By using a page size, a developer can limit how much data, or how many items per page, is returned. Default limits are often built in, but when dealing with unknown limits, the recommendation is to allow developers to set a limit themselves. If an API response does not contain a link to the next page of results, the developer has reached the final page. For example, a default request might return 20 pages that, due to their size, are slow to retrieve and hard to parse. With pagination, the developer can choose the number of pages, and the number of items on each page, to view. This could be a single page or a preset range of pages (say, pages 5 and 6). An API could also allow the developer to select a starting page and range (for example, page 10 through page 15); this is referred to as offset pagination.
The GitHub API supports page-based pagination. By using cURL against the GitHub stargazers endpoint, you can list the people who have starred the repository at ciscodevnet/yang-explorer (see Example 6-12). To fetch all the stargazers for ciscodevnet/yang-explorer, you would have to page through the results, following the next link until the last page is reached (page 13 in this output).
Example 6-12 GitHub Stargazers Endpoint Listing the People Who Have Starred the Repository at ciscodevnet/yang-explorer
curl -I https://api.github.com/repos/ciscodevnet/yang-explorer/stargazers
HTTP/2 200
server: GitHub.com
date: Mon, 23 Aug 2021 10:00:51 GMT
content-type: application/json; charset=utf-8
cache-control: public, max-age=60, s-maxage=60
vary: Accept, Accept-Encoding, Accept, X-Requested-With
etag: W/"8e407f3e6d12cb560d3e01b0f42804a29724a5a4894174e4e87301b99ec4532e"
x-github-media-type: github.v3; format=json
link: <https://api.github.com/repositories/42690240/stargazers?page=2>; rel="next",
<https://api.github.com/repositories/42690240/stargazers?page=13>; rel="last"
[output removed for brevity]
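Because the requests library parses the RFC 5988 Link header into response.links, a client can walk every page by simply following the next relation until it disappears. This sketch assumes unauthenticated access within GitHub's public rate limits:

import requests

url = "https://api.github.com/repos/CiscoDevNet/yang-explorer/stargazers?per_page=100"
stargazers = []

# Follow rel="next" until the API stops returning one (the last page).
while url:
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    stargazers.extend(user["login"] for user in resp.json())
    url = resp.links.get("next", {}).get("url")

print(f"Collected {len(stargazers)} stargazers")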
When making a REST API call to a Meraki Dashboard API, after you make a GET request
to an endpoint, the results that are returned could contain a huge amount of data. Using
pagination on the results ensures responses are easier to handle because only a subset of the
results is returned in the first response. If you need to get more data, you can execute subse-
quent requests to the endpoint with slightly different parameters to get the rest of the data.
The Meraki Dashboard API also allows developers to sort and scope the data before it is sent back. Timestamps are one example: developers can use the startingAfter and endingBefore values if they want to paginate based on time, which can be very effective with real-time data. Offset pagination, by contrast, does not care whether the data has been modified; it simply returns whatever data is at the requested offset (in this example, page 5). If real-time pagination is required, cursor-based pagination
should be used. APIs on social media sites such as Facebook and Twitter use cursor-based
pagination. Cursor-based pagination does not use the concept of pages; because the data
changes quickly, results are considered either “previous” or “next.” If there is no concept of
pages, this means developers cannot skip to a certain page, as you saw with offset pagina-
tion; cursor-based pagination works by returning a pointer to an exact item in the dataset.
Some APIs (Facebook being one) do support both methods of pagination. If page selection
is required, offset-based pagination tends to be used. If the API is to provide performance
and real-time data, cursor-based pagination is used.
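A generic cursor-based pagination loop might look like the following sketch; the endpoint and the next_cursor field name are assumptions, because each provider names its cursor fields differently:

import requests

BASE = "https://api.example.com"  # hypothetical cursor-paginated API
params = {"limit": 100}
items = []

# Each response returns an opaque cursor pointing at the next position in the
# dataset; there is no concept of a page number to skip to.
while True:
    resp = requests.get(f"{BASE}/v1/events", params=params, timeout=10)
    resp.raise_for_status()
    body = resp.json()
    items.extend(body["data"])
    cursor = body.get("next_cursor")
    if not cursor:
        break
    params["cursor"] = cursor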
In both offset-based and cursor-based pagination, developers poll or call the API. This is the most typical use of APIs: in the client/server architecture, a developer or application makes a request on a fixed interval, for example every 30 seconds. This means that new data could sit on the server for nearly the full polling interval before the client or application sees it.
Another drawback could be that the API is rate limiting calls per hour or per day. This is
similar to the concept of SNMP polling versus event-based model-driven telemetry, where
most SNMP polling was done every 5 to 10 minutes; this left enough of a gap for customers
to notice their router interfaces were down before their provider (the one they were paying to
monitor) noticed!
Figure 6-12 shows stateless versus stateful connection differences between RESTful APIs and
streaming APIs.
One common streaming-style approach is the webhook. An API provider (which could be an SDN controller such as Cisco SD-WAN, or a collaboration platform such as Webex Teams) can send an HTTP POST request to an external system in real time after an alarm or event is received. When you create a webhook for a particular event or alarm, the notification data is sent as an HTTP POST (often in JSON format). This method still operates over the same HTTP methods as a RESTful API, so each exchange remains unidirectional: even when a client opens the initial connection, HTTP and HTTPS are request/response protocols in which the client always initiates the request.
Streaming APIs commonly utilize the WebSocket protocol, which is standardized
by the IETF as RFC 6455 (https://datatracker.ietf.org/doc/html/rfc6455). A WebSocket uses a
bidirectional protocol, and there are no predefined message patterns such as request/response.
The client and the server can send messages to each other. Unlike the request/response method
seen with REST API, using a WebSocket allows the client or server to talk independently of one
another; this is referred to as full-duplex. A WebSocket does not require a new connection to be
set up for each message to be sent between the client and server. As soon as an initial connec-
tion is set up, messages can be sent and received continuously without interruption. The trade-off is that keeping the connection open all the time increases resource overhead, because the client and the server communicate over the same TCP connection for the duration of the WebSocket connection's lifetime.
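A small asyncio sketch using the third-party websockets package shows this full-duplex pattern; the endpoint URL and the subscription message are hypothetical:

import asyncio
import websockets  # pip install websockets

async def listen(url: str) -> None:
    # One TCP connection is opened and reused for the lifetime of the WebSocket;
    # either side can send a message at any time (full duplex).
    async with websockets.connect(url) as ws:
        await ws.send('{"action": "subscribe", "topic": "alerts"}')
        async for message in ws:
            print("received:", message)

asyncio.run(listen("wss://stream.example.com/events"))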
From this error information, you can see the details. The code field (401 in this case) is an integer identifying the error type. Next to it is a string carrying the unauthorized message. Both the code and the message are mandatory. The last line is the description, which is an optional field in the error response. When it is used in the error format, it can provide detailed information about which parameter is missing or what the acceptable values are. The Cisco DNA Center API does provide this information, and from the output, you can see that it is a string. In this case, the error denotes that incorrect credentials were provided in the request.
The only other optional information not shown in the example is the infoURL field. When
used in the error response, it would provide a URL linking to documentation.
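The exact field layout differs between APIs, but an error response carrying all four fields just described might look like the following illustrative structure (this is not actual Cisco DNA Center output, and the documentation URL is a placeholder):

{
  "error": {
    "code": 401,
    "message": "Unauthorized",
    "description": "Invalid username or password supplied in the request",
    "infoURL": "https://developer.example.com/docs/errors/401"
  }
}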
API timeouts are signaled by the following error codes: 408, 504, and 599. A 408 error code means that the client or application did not produce a complete request within the time the server was prepared to wait, so the server shut down the connection. For example, in the RESTful client/server architecture, a client consuming an API might not send the complete HTTP request within three minutes; the connection is then closed straightaway, and the server should return a 408 error code in the response to the client. In some cases, the client might have an outstanding request that has been delayed on the wire; the client can repeat the request, but only by creating a new connection.
The following output shows HTTP error code 408 received by the client:
HTTP/1.1 408 Request Timeout
Content-Length: 0
Connection: Close
The 504 status error is a timeout error. It denotes that a web server attempting to load a page
for you did not get a response within a given timeframe from another server from which
it requested the information. It’s called a 504 error because this is the HTTP status code
that the web server uses to define this type of error. Typically, the error can happen for a
few reasons, but the two most common are that the server is overloaded with requests or is
under some form of attack (DDoS), or some maintenance is being performed at the time the
request was sent. Other possible issues could be DNS or firewall rules. From an API design perspective, you should ensure that the API gateway always sends an HTTP response back to the client and set a maximum integration timeout, such as 30 seconds.
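On the client side, timeouts can be bounded explicitly so the application never waits indefinitely. This sketch uses the requests library's connect/read timeout tuple and treats 408 and 504 responses as retryable; the endpoint is a placeholder:

import requests

try:
    # Fail fast if the connection or the response takes too long,
    # mirroring the 30-second integration timeout suggested above.
    resp = requests.get("https://api.example.com/v1/reports", timeout=(5, 30))
    resp.raise_for_status()
except requests.exceptions.Timeout:
    print("No response within the allotted time; retry over a new connection")
except requests.exceptions.HTTPError as err:
    if err.response is not None and err.response.status_code in (408, 504):
        print("Server-side timeout (408/504); the request is safe to retry")
    else:
        raise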
Nginx (https://www.nginx.com/) is a good choice as an API gateway because it has the advanced HTTP processing capabilities needed for handling API traffic. It can also apply higher timeouts, such as connect_timeout, send_timeout, or read_timeout, if there are known or expected network latency issues on the path to the API destination (see Example 6-14).
Example 6-14 A 504 Error Code Generated by an Nginx API Gateway
<html>
<head><title>504 Gateway Time-out</title></head>
<body bgcolor="white">
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx/1.13.12</center>
</body>
</html>
Similar to error code 504 is error code 599. This status code is not specified in any RFC but is used by some HTTP proxies to signal a network connect timeout behind the proxy to a client in front of the proxy. This error code is issued when a server takes too long to respond (most people have encountered this kind of error when requesting information over the Internet). The server should generate this error code to prevent the developer or application from waiting endlessly for a reply. The error is typically triggered by the server being overloaded; the response back to the client or application signals that the server is taking too long to respond. When a client makes an API call, the server starts its countdown timer only when it receives the request and then measures how long it takes to respond. In most cases, this error is caused by network latency.
The common element of error codes 408, 504, and 599 is that they appear in situations tied to limits, whether the limit is server resources, the network, or time. Rate limiting is an important API design concept because it prevents an API from being overwhelmed by too many requests; consequently, API rate limiting is put in place as a method of defense. Whether requests are genuine or malicious, rate limiting should be implemented in an API design to ensure service availability and performance.
As you saw with the IMDb API, rate limiting is applied based on the developer’s account. As
you can see in the example shown in Figure 6-13, even the top tier account does not permit
unlimited use. This method used by IMDb and other API providers is referred to as a devel-
oper rate limiter in which the number of API requests is tied into the developer’s account
or API key. Another type of rate limiter is a concurrent rate limiter, which tracks or limits the number of parallel sessions. Sessions could be limited based on developer, IP address, API key, or any other type of identifier, even developer location, such as country. Concurrency limits help prevent intentional attacks such as DDoS attacks.
NOTE When limits are reached, the API returns a limiting signal, typically a 429 HTTP response.
A fixed window rate limiter is administered via a time interval and permits only a fixed number of requests in the time allowed. For example, if a client or application is allowed to make 1000 calls per day, it could make all of those calls within 30 minutes and saturate the API's resources.
The sliding window algorithm addresses the issues that can occur with the fixed window algorithm. Like the leaky bucket, it helps smooth out traffic. It does this by keeping a counter but also weighting the previous window when estimating the request rate for the current window, creating a rolling window that smooths out the traffic.
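From the consumer's side, a simple way to cooperate with any of these rate-limiting algorithms is to honor the 429 signal and back off. This sketch assumes a hypothetical endpoint and a server that may send a Retry-After header:

import time
import requests

def get_with_backoff(url: str, max_attempts: int = 5) -> requests.Response:
    """Retry politely when the API signals that a rate limit was hit (HTTP 429)."""
    for attempt in range(max_attempts):
        resp = requests.get(url, timeout=10)
        if resp.status_code != 429:
            return resp
        # Honor Retry-After if present; otherwise back off exponentially.
        wait = int(resp.headers.get("Retry-After", 2 ** attempt))
        time.sleep(wait)
    raise RuntimeError("Rate limit still in effect after retries")

resp = get_with_backoff("https://api.example.com/v1/devices")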
Caching
Caching can improve the performance of an API and lower the number of requests being
sent to the server. API caching achieves this result by using an expiration mechanism. On the
server, caching reduces the need to send a full response back to the client. In both cases, it
reduces network bandwidth requirements.
There are two main types of cache headers: Expires and Cache-Control. The Expires header sets a date/time in the future; after that time passes, the client needs to send a new request to the API. Cache-Control is the preferred expiration mechanism (with Expires often kept as a fallback for expiration caching). The Cache-Control header states how long the cached response may be used and supports several options, referred to as directives. The directives, described here, determine whether a response is cacheable and can be set to specifically determine how cache requests are handled; a short client-side example follows the list.
■ no-cache: The content can be cached, but it must be revalidated with the server before every use; the cache checks whether an updated version exists before serving the stored content. The no-cache directive works with the ETag header (a response header carrying a token that identifies the version or state of a resource; if the resource changes, so does the ETag).
■ no-store: This directive is normally used for sensitive data (such as credit card or bank details). It means the response cannot be stored in any cache.
■ public: This directive means the resource can be stored by any cache and by any inter-
mediate caches.
■ private: Unlike public, this directive allows the response to be cached only by the client, not by an intermediate cache.
■ max-age: This directive indicates the maximum time that a cached response should be
used (seconds). The maximum value is 1 year (31,536,000 seconds).
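As a short client-side sketch (with a placeholder endpoint), the directives can be observed in the response headers, and the ETag can be sent back for revalidation; a 304 response means the cached copy is still valid and no body is transferred:

import requests

URL = "https://api.example.com/v1/catalog"  # hypothetical endpoint

first = requests.get(URL, timeout=10)
print(first.headers.get("Cache-Control"))  # e.g. "public, max-age=60, s-maxage=60"
print(first.headers.get("ETag"))

# Conditional revalidation using the stored ETag.
revalidate = requests.get(
    URL,
    headers={"If-None-Match": first.headers.get("ETag", "")},
    timeout=10,
)
print(revalidate.status_code)  # 304 if the resource has not changed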
Figure 6-14 shows the three types of API cache models. The first is a client, or private, cache
model; it lives on a client machine. Typically, this type of cache is used by an individual
developer or application. A proxy cache resides between the client and the API server. Some
content delivery networks/content distribution networks (CDNs) and ISPs (Internet
service providers) use a proxy cache because the API is meant to be used by many develop-
ers or applications. The final example is an API server, or gateway, cache; it is configured on
the API server itself. An API gateway responds to a client or application request by looking
up the endpoint response from the cache instead of making a new request.
Transport Layer Security (TLS), software development kit (SDK), OpenAPI Specification
(OAS), content delivery network/content distribution network (CDN), SwaggerHub, API
inside-out design, API outside-in/user interface (API first approach), pagination
References
https://swagger.io/tools/swaggerhub/
https://imdb-api.com/swagger/IMDb-API/swagger.json
https://github.com/CiscoDevNet/DNAC-Python-SDK
https://en.wikipedia.org/wiki/Command-line_interface
https://click.palletsprojects.com/en/8.0.x/why/
http://www.rfc-editor.org/rfc/rfc3986.txt
https://datatracker.ietf.org/doc/html/rfc6455
https://www.nginx.com
Application Deployment
■ CI/CD Pipeline Implementation: This section helps you understand CI, CD, and GitOps
principles and functionality and how application deployment has changed over time.
■ Software Practices for Operability: The 12-Factor App: This section explores key
components that make for scalable, available, and observable applications in both on-
premises and cloud-based data centers.
This chapter maps to the Developing Applications Using Cisco Core Platforms and APIs
v1.0 (350-901) Exam Blueprint Section 4.0, “Application Deployment and Security,” specifi-
cally subsections 4.2, 4.5, and 4.6.
Traditionally, after an application was ideated, created, and developed, the responsibility of
the development team ended, and the resulting work product was given to the operations
teams to deploy and maintain for the end users. However, over the past decade, the demarca-
tion that once existed between the two teams has blurred, especially as software development methodologies became iterative and agile rather than centered on releases at specific points in time when features reached critical mass. The resulting change in approach has created not only
many new ways that applications can move from code to production but also changes in the
ongoing maintenance and improvement of the application, as well as where that application
resides in relation to both the application owner and the users.
To this end, several organizational structures and philosophies have been born that enable
organizations to make such a drastic shift, including the hybridization of development and
operations (DevOps) and teams focused on maintaining and ensuring the stability of the
application and its underlying infrastructure using automated processes and enhanced vis-
ibility enabled through new application design approaches. These teams work in tandem,
leveraging automated code test, build, and deployment methodologies for both infrastruc-
ture as well as applications to ensure that applications are delivered reliably and consistently
to everyone, regardless of whether those applications are hosted in a private or public cloud.
This chapter provides context around this foundational shift in application development
and delivery, starting with the early days of developers and system administrators, through
modern-day agile teams delivering microservices-based applications to a cloud provider with
the click of a mouse or a commit to the main branch of code. Although the focus of this
chapter is on applications, methodologies appropriate and required for the underlying infrastructure of the application are covered as well, because they are a foundational requirement for deploying applications reliably.
Foundation Topics
The Evolution of Application Responsibilities
Application development and delivery have gone through a drastic shift in the past several
decades, creating new roles, methodologies, and processes along the way. Prior to any dis-
cussion of how things have evolved, it is important to set a baseline of terminology and tools
because the evolution of these processes, methodologies, and mindsets has created a bit of
confusion and blurring of lines, even for the most experienced individuals. After the defini-
tions have been established, it becomes much easier to see where tooling, processes, or both
are required to achieve the end result of application deployment.
Historically, development teams wrote the applications that directly impact anyone who uses those applications or services. These
developers would work within their defined teams to prioritize the tasks that were to be
accomplished within the window of time in which the work was slated. When the work was
completed and tested, the resulting work product was shipped to the operations team to
deploy and run.
Where development left off, operations picked up. This team (or set of teams, depending
on the application and infrastructure) was responsible for keeping the application or sys-
tem functional and responsive for the end customers. This team was also thought of as the front-line support for any issues that manifested, either through the deployment or through operation of the application/system at a scale (both in network and in load) that wasn't feasible to exercise during testing in the development lifecycle.
You can easily see how these two teams could quickly come into conflict with one another: operations teams blaming developers for being "lazy" in their code, making assumptions, or not testing the full suite of features/options/functionality, and developers blaming operations for being resistant to change and always saying "no" to updates or new deployments to support the application/system. This conflict, which is very unproductive for a cohesive culture
within an organization, also creates problems for end users of the application or system
because features are slow to be rolled out and adopted against the codebase, resulting in a
less than satisfactory experience.
While the root causes for the distrust vary by organization and team, much of the friction is
caused by the seemingly opposing ideas of the two teams: development is supposed to add new
features, whereas operations is tasked with keeping the application or system running at all times.
New features or bug fixes merged into the production codebase could have unintended knock-on
effects that could require operations to troubleshoot an issue beyond their control, while devel-
opment must respond to the needs of the end users to ensure a quality experience, attract new
users, or prevent users from abandoning the service for something else.
You could additionally argue that some of the issue stems from a lack of trust between the teams. Neither team believes that the other is operating in good faith; each is seen as focused only on the metrics it is judged on, rather than working in tandem to ensure that the ultimate metric is achieved: supporting end users throughout their journey within the application or service.
In addition to the cross-functional nature of the teams, there are several generally accepted
practices for DevOps teams to implement within their environment.
■ A version control system (VCS) or some other source code management (SCM)
platform
■ A real-time communications system that can link human interactions with feedback
from observability instrumentation
A Cultural Shift
What often gets lost in any discussion of DevOps is that an organization does not simply
“buy DevOps.” It is not the tools, the process, or even a job role at its core; it’s a cultural
change and standard method of operation that needs to be adopted within an organization.
This requires an organization that embraces change and failure, is open and transparent in
communication (and systems, code, and metrics), and most importantly, creates and fos-
ters trust between organizations. Only through embracing this culture can an organization
decide on the actual implementation of how DevOps will look internally, from roles and responsibilities to tooling and operational processes. These processes work toward these
goals of embracing a culture of change and agility, ensuring sources of truth for all code that
all have visibility to, providing testing and validation of code prior to release, and enabling
automated infrastructure changes, allowing teams to reduce errors and focus on higher-level
problems associated with enabling the business. However, the tooling, team organizational
and reporting structures, and even the methodologies used to release the software vary dras-
tically from organization to organization (and sometimes between business entities within
an organization). This is the flexibility (and, some would say, the challenge) of implementing DevOps; there's no single way to perform these functions, and it can take many small
changes or iterations to settle on the exact tooling and procedures for an organization or
entity. As DevOps has roots in agile software development methodology, some would say
this is a feature rather than a bug.
Modern-day SREs generally have backgrounds in software design and engineering, systems
engineering, system administration, and network design and engineering (or some permuta-
tion of the listed competencies). These individuals act as a conduit, assisting in bridging
the gaps between operations and development, while also providing leadership and vision
beyond the sprint, anticipating and remediating potential issues before they appear to ensure
utmost reliability. In its purest form, SRE can be practiced and performed by anyone; it
needn’t be a specific job title.
The distinction here, while somewhat pedantic, is important when referring to the differ-
ences in organizational concepts and processes. DevOps principles work on breaking down
the walls that exist between development and operations through culture and a blending
of teams and responsibilities (though everyone has their own unique strengths and weak-
nesses), whereas SRE looks to solve the silo problem through adding additional people with
the responsibility of translation to the process. Neither solution is right or wrong, but is a
by-product of the organization and business unit to which these principles are applied, the
current talent of the contributing teams, and the ability to upskill and learn new processes.
To use the common answer within IT, “it depends.”
■ A rigid focus is placed on reliability, above and beyond what would be considered acceptable as defined by SLAs/SLOs. This could be achieved through systems,
network, or application design, or a combination of the three to achieve the desired
reliability, latency, or efficiency specs that exceed objectives.
Out of these practices, you could create principles that SRE should strive for that are tangen-
tial to these tenets. Again, there is no common standard for these principles, but by combing through SRE requirement docs posted to job boards, you can see the varying responsibilities these individuals can have and draw a parallel between those responsibilities and the core tenets. Some SREs specifically focus on things like capacity planning and
infrastructure design, which align with the second and third points, because planning and
design can’t occur without proper understanding of the current environment’s load, latency,
and growth in usage over time (which requires observability tooling). Other SREs may be
focused on the operational efficiencies gained through creation of CI/CD pipelines, which
focuses on the energy, time, and frustration spent in moving apps from development to pro-
duction in a secure manner (the “toil” mentioned previously) but doing so in a fashion that
ensures reliability and uptime of the application and system. This function touches on all three principles because automated infrastructure and deployments are beneficial only if uptime is not compromised and any resulting issue with the application/system is identified through detailed observability at all layers.
The choice of VCS and CI/CD provider is driven mostly by the organization, collaboration
and security requirements, and third-party integrations with other functions (such as secu-
rity and vulnerability scanning). Typical VCSs include git, svn, hg, or p4 (though git is most prevalent), any of which can be hosted on-premises or through a SaaS offering. After
the VCS system is selected, the desired CI/CD platform can be chosen. Some of them, such
as Jenkins, are external applications that require integration into the chosen repositories,
whereas others, such as GitLab-CI, integrate directly with GitLab to provide a single portal
for both source code and pipelines. For the scope of this section, the choice of VCS and CI
provider is largely inconsequential to the overall understanding of the functions they pro-
vide, and this section focuses on those functions.
As mentioned earlier, the full pipeline provided by a CI/CD system comprises two discrete functions, integration and delivery/deployment, which are interrelated but serve different purposes and divisions of labor. The following sections describe this division and the functions that occur
within the two subsystems in more detail.
Early pipeline tests may exercise only the code for a specific feature. However, once the code is merged from the feature branch into the main
code, it may undergo additional testing from vulnerability or security scans to ensure that
the entire package as written does not contain a known vulnerability, which may not have
been tested adequately in performing the same security analysis against the code for a single
feature. This level of customization is entirely possible within the pipeline.
A release that clears the pipeline can be promoted into production, but it is entirely within the realm of possibility that it may pass the required
tests but not function correctly in production. Because of this, strong adherence to the
branch/merge discipline, a focus on unit and functional testing, and a continuous improve-
ment cycle for the pipeline should all be implemented to adhere to the rigorous uptimes
required by production systems, applications, and services. The GitLab CI pipeline definition that follows illustrates such a pipeline: a validation (lint) stage, branch-scoped deployment jobs driven by Ansible playbooks, and verification jobs that run pyATS/Robot Framework tests against the test and production environments.
stages:
  - validate
  - deploy_to_prod
  - deploy_to_test
  - verify_deploy_to_prod
  - verify_deploy_to_test
  - verify_website_reachability

lint:
  stage: validate
  image: geerlingguy/docker-centos8-ansible:latest
  variables: {}
  script:
    - ansible-playbook --syntax-check -i inventory/prod.yaml site.yaml
    - ansible-playbook --syntax-check -i inventory/test.yaml site.yaml

deploy_to_prod:
  image: geerlingguy/docker-centos8-ansible:latest
  stage: deploy_to_prod
  script:
    - echo "Deploy to prod env"
    - ansible-playbook -i inventory/prod.yaml site.yaml
  environment:
    name: production
  only:
    - master

deploy_to_test:
  image: geerlingguy/docker-centos8-ansible:latest
  stage: deploy_to_test
  script:
    - echo "Deploy to test env"
    - ansible-playbook -i inventory/test.yaml site.yaml
  environment:
    name: test
  only:
    - test

verify_test_environment:
  image: ciscotestautomation/pyats:latest-robot
  stage: verify_deploy_to_test
  environment:
    name: test
  script:
    - pwd
    - cd tests
    # important: need to add our current directory to PYTHONPATH
    - export PYTHONPATH=$PYTHONPATH:$(pwd)
    - robot test.robot
  artifacts:
    name: "TEST_${CI_JOB_NAME}_${CI_COMMIT_REF_NAME}"
    when: always
    paths:
      - ./tests/log.html
      - ./tests/report.html
      - ./tests/output.xml
  only:
    - test

verify_prod_environment:
  image: ciscotestautomation/pyats:latest-robot
  stage: verify_deploy_to_prod
  environment:
    name: test
  script:
    - pwd
    - cd tests
    # important: need to add our current directory to PYTHONPATH
    - export PYTHONPATH=$PYTHONPATH:$(pwd)
    - robot prod.robot
  artifacts:
    name: "PROD_${CI_JOB_NAME}_${CI_COMMIT_REF_NAME}"
    when: always
    paths:
      - ./tests/log.html
      - ./tests/report.html
      - ./tests/output.xml
  only:
    - master

internet_sites:
  image: ciscotestautomation/pyats:latest-robot
  stage: verify_website_reachability
  environment:
    name: test
  script:
    - pwd
    - cd tests/websites/
    - make test
  only:
    - master
Pipeline Components
A CI/CD pipeline is composed of several stages, each of which has a defined action or set of actions within it. These stages are declared at the top of a CI/CD definition file
(usually written in YAML, but varies based on CI/CD provider) and define the process in
moving the code from source to finished product. Figure 7-1 shows the process for a simple
CI/CD pipeline in which the code progresses serially through each of the defined stages of
the process.
Build
The build stage is responsible for moving the code from its source to a verified and final
product. The resultant product differs depending on source code, because languages like C,
C++, Java, Python, and Go create binary files (or bytecode), whereas languages like Ruby do
not. However, the result is that the code has been linted (checked for syntactical errors/
omissions), compiled, and (optionally) built into a Docker container.
Test
After the image is built (regardless of binary file, Docker container, or otherwise), automated
testing helps validate that the code is functional at a base level. This may include performing
unit tests of individual components of the functions of the code, performing load or perfor-
mance testing, or validating interoperability with resources required by the application. By
automating these tests, developers ensure their apps deploy correctly to the environment and
behave as anticipated or expected. Thorough testing and validation are required to ensure
that the application is functional as expected; untested functions or integrations are not
caught in this stage and can be a blind spot for developers. Tools like Codecov exist to pro-
vide an understanding of how much of the written code is covered by tests within the pipe-
line, ensuring that as new features are incorporated into the application, testing and coverage
can evolve to ensure the automation doesn’t provide a false sense of security.
Testing also needn’t be a singular phase within the pipeline. Although things like unit tests
should be done at the front end of the pipeline, integration and interoperability tests may
not occur until the release is deployed within an environment (because there may be no way
to perform this test within the pipeline without a deployment). The idea, however, should
always be to catch any issues, exceptions, or errors as early as possible within the pipeline as
is feasible to prevent those problems from leaking into a production environment.
Release/Deliver
If everything succeeds during the testing phase, the software is moved to the release (or
delivery, if keeping with the continuous delivery convention) phase. In this stage, the
compiled and tested software is moved to an artifact repository so that it is ready to be
deployed, whether in manual or automatic fashion. This artifact repository is kept separate
from the source code repository because they serve different functions, and it is impractical
to include compiled binaries that developers do not need within the main repository.
The artifact repository varies depending on the tooling and privacy required for the finished
application. For internal applications leveraging an on-premises CI/CD pipeline, the artifact
could be internal file storage accessible via HTTP/FTP, JFrog Artifactory, or an internal
Docker registry. If the application is deployed via SaaS applications (such as GitHub), the
registry could simply be the “Releases” section of the repository or a container pushed to
Docker Hub. Regardless of the location of the repository, the CI/CD pipeline supports the
passing of credentials through environment variables or through a secret keystore (such as
Vault) to ensure that the credentials can be changed as needed and to ensure that they are
not exposed within the pipeline definition file (because it is committed as part of the source
code repository). The resulting release can signify the end of the pipeline if the goal is only to deliver the application (for example, delivering an application within a container to be consumed by other users within their environments), or if moving the application into production must remain a manual step because of business process requirements.
Deploy
After the application has been released and stored within the artifact repository, it can then
be deployed to one or more environments. To deploy the resultant application to the infra-
structure, the pipeline must be made aware of the systems on which it is to deploy the appli-
cation. While the deployment methodology varies based on the organization or development
team, it is common practice to use some sort of automated configuration management platform, such as Puppet, Ansible, or SaltStack. If the infrastructure supporting the application must remain in service while the new release is rolled out, one of several deployment strategies is typically used:
■ Rolling: The application is updated on all nodes in a serial fashion, resulting in users
potentially hitting different versions of the application, depending on how the ingress
traffic is directed, the number of nodes requiring the new release, and the time
required to move the node to the new release.
■ Blue-green: Two parallel environments are maintained; the new release is deployed to one side while the other continues to serve users, and traffic is then cut over to the updated side. When the deployment is deemed to be a success, the green side can be updated to the current version of the code/application and used as the environment to which the next release is deployed first.
■ Canary: This release is similar to the rolling release, in that nodes are gradually updat-
ed to the new version of the application. However, thresholds define the stages of the
rollout, limiting the scope of impact in a failure. For example, a rollout may start with
15 percent of the nodes being rolled out initially. When this segment’s rollout is suc-
cessful, the deployment would then deploy to another 35 percent (making 50 percent
of the nodes) and then finally deploy to the remaining 50 percent.
The canary rollout offers a failsafe above the rolling release without the need for paral-
lel infrastructure required for a blue-green deployment. However, ensuring a successful
canary rollout is nontrivial due to the instrumentation and feedback loops required to
determine “successful” rollouts. Some CI/CD providers offer integrations to enable a
canary rollout, but they require both nondefault configuration to be enabled and infra-
structure that supports the required metrics to be returned to determine the success
(such as K8s).
Terraform approaches infrastructure differently from traditional configuration management tools in a few ways. First, Terraform records the state of the managed infrastructure and computes the delta between the current and desired state. The previous state is saved such that, even after the configuration is applied, the previous state can be restored through a rollback action. Second, Terraform functions on the concept of immutable infrastructure: devices and targets are meant to be unchanged, and if any change to the infrastructure is required, Terraform generally destroys and re-creates the resource with the new configuration (rather than updating in place), though there are exceptions to this. Finally, Terraform is written in Go and does not include a mechanism for accessing devices over SSH; the only way that Terraform can interface with a device, controller, or cloud is via a REST API. However, Terraform, using its dependency graph, ensures that regardless of how the configuration is written within the HCL file, it is applied in the correct order against the REST APIs; stepwise or ordered configuration is not required for Terraform to interact with the target's APIs.
Because Terraform defines the end state of the infrastructure in a codified fashion
(Infrastructure as Code, IaC), it lends itself to being included as part of a CI/CD pipe-
line to effect change on the devices or cloud on which applications will be deployed. In
an infrastructure pipeline, the HCL configuration files are committed to a branch within a
VCS repository, and the pipeline runs the required init, plan, and apply functions against the
branch’s environment (for example, test). After the configuration is applied, automated test-
ing, such as reachability or verification of the infrastructure control plane, can be performed
to ensure the desired outcome is achieved through the configuration. When these tests are
completed, the branch can be merged into the main branch, kicking off a deployment to the
production infrastructure (and subsequent testing, if desired).
Using Terraform is not without its challenges, however. To assist with this effort, applications
such as Atlantis were created, providing an interactive experience between the VCS and the
application of the HCL configuration. Atlantis works by listening for webhooks from the
VCS, which cause it to perform a Terraform action, such as “plan.” When Atlantis performs
the action, it returns the results of the action to the VCS via API. These results are displayed
within the VCS pull request management menu. A workflow would look like the following:
1. An infrastructure developer creates a Terraform HCL file based on the desired end
state.
2. The HCL file is pushed to VCS with a branch.
3. A pull request is initiated within the VCS.
4. Atlantis is mentioned within the VCS PR comment tool to run a “plan” against the
HCL.
5. Atlantis runs the desired plan against the target infrastructure using Terraform. Results
are then returned to the VCS PR window for review.
6. If all checks out, Atlantis can be invoked again using the apply command to deploy
the change to the infrastructure, and the PR can be merged into the main code branch.
7. If someone else attempts to create a PR within the same directory before the previous
one has been applied, Atlantis prevents the second change from occurring until the
first change is either applied or discarded.
It is possible to test Atlantis using a local machine, a GitHub account, and a port-forwarding
tool such as ngrok. The host running the Atlantis server binary needs access to the VCS and
to the target infrastructure, though if using a developer workstation for testing, the port-
forwarding tool can easily expose a URL for the webhooks to be relayed to the workstation
without worrying about firewalls and NAT. The device can also utilize VPN connections to
the infrastructure, allowing Atlantis to be tested using sandboxes or other infrastructure in
remote locations.
Although the Atlantis documentation walks you through the required setup based on VCS,
the example uses a Terraform “null-resource.” Extending this concept, it is possible to provi-
sion some network infrastructure, assuming a Terraform provider exists for that infrastruc-
ture. In Examples 7-2 and 7-3, a set of Terraform configurations containing a “variable” file,
as well as the desired end state of the fabric, is provided.
Example 7-2 Atlantis variable.tf File
variable "user" {
  description = "Login information"
  type        = map
  default = {
    username = "admin"
    password = "C1sco12345"
    url      = "https://10.10.20.14"
  }
}
Example 7-3 Atlantis Terraform Configuration Defining the Desired End State
terraform {
  required_providers {
    aci = {
      source = "CiscoDevNet/aci"
    }
  }
}

# Configure the provider with your Cisco APIC credentials.
provider "aci" {
  username = var.user.username
  password = var.user.password
  url      = var.user.url
  insecure = true
}

# Define an ACI Tenant Resource
resource "aci_tenant" "atlantis-testing" {
  name        = "atlantis-testing"
  description = "This tenant is created by terraform using atlantis"
}
After this configuration is committed to a branch within a repository and a pull request (PR) is created, Atlantis receives the webhook and processes a workflow much like running Terraform on a local workstation (the specific Terraform commands are seen in Example 7-4).
Example 7-4 Atlantis Receiving the Webhook from GitHub and Performing Terraform
Actions
{"level":"info","ts":"2021-11-29T16:34:15.437-0700","caller":"events/events_control-
ler.go:318","msg":"identified event as type \"opened\"","json":{}}
{"level":"info","ts":"2021-11-29T16:34:15.437-0700","caller":"events/events_control-
ler.go:346","msg":"executing autoplan","json":{}}
{"level":"info","ts":"2021-11-29T16:34:20.841-0700","caller":"events/work-
ing_dir.go:202","msg":"creating dir \"/Users/qsnyder/.atlantis/repos/qsnyder/
devcor-atlantis/1/default\"","json":{"repo":"qsnyder/devcor-atlantis","pull":"1"}}
{"level":"info","ts":"2021-11-29T16:34:21.757-0700","caller":"events/project_com-
mand_builder.go:266","msg":"found no atlantis.yaml file","json":{"repo":"qsnyder/
devcor-atlantis","pull":"1"}}
{"level":"info","ts":"2021-11-29T16:34:21.757-0700","caller":"events/project_finder.
go:57","msg":"filtered modified files to 2 .tf or terragrunt.hcl files: [main.tf
variable.tf]","json":{"repo":"qsnyder/devcor-atlantis","pull":"1"}}
{"level":"info","ts":"2021-11-29T16:34:21.757-0700","caller":"events/
project_finder.go:78","msg":"there are 1 modified project(s) at path(s):
.","json":{"repo":"qsnyder/devcor-atlantis","pull":"1"}}
{"level":"info","ts":"2021-11-29T16:34:21.757-0700","caller":"events/project_
command_builder.go:271","msg":"automatically determined that there were 1 projects
modified in this pull request: [repofullname=qsnyder/devcor-atlantis path=.]","json"
:{"repo":"qsnyder/devcor-atlantis","pull":"1"}}
{"level":"info","ts":"2021-11-29T16:34:21.758-0700","caller":"events/project_
command_context_builder.go:240","msg":"cannot determine which version to use from
terraform configuration, detected 0 possibilities.","json":{"repo":"qsnyder/
devcor-atlantis","pull":"1"}}
{"level":"info","ts":"2021-11-29T16:34:22.059-0700","caller":"events/project_locker.
go:80","msg":"acquired lock with id \"qsnyder/devcor-atlantis/./default\"","json":
{"repo":"qsnyder/devcor-atlantis","pull":"1"}}
{"level":"info","ts":"2021-11-29T16:34:25.066-0700","caller":"terraform/terraform_
client.go:279","msg":"successfully ran \"/usr/local/bin/terraform init -input=false
-no-color -upgrade\" in \"/Users/qsnyder/.atlantis/repos/qsnyder/devcor-atlantis/1/
default\"","json":{"repo":"qsnyder/devcor-atlantis","pull":"1"}}
{"level":"info","ts":"2021-11-29T16:34:26.319-0700","caller":"terraform/terraform_
client.go:279","msg":"successfully ran \"/usr/local/bin/terraform workspace show\"
in \"/Users/qsnyder/.atlantis/repos/qsnyder/devcor-atlantis/1/default\"","json":
{"repo":"qsnyder/devcor-atlantis","pull":"1"}}
{"level":"info","ts":"2021-11-29T16:34:28.251-0700","caller":"terraform/terraform_
client.go:279","msg":"successfully ran \"/usr/local/bin/terraform plan -input=false
-refresh -no-color -out \\\"/Users/qsnyder/.atlantis/repos/qsnyder/devcor-atlantis/
1/default/default.tfplan\\\"\" in \"/Users/qsnyder/.atlantis/repos/qsnyder/
devcor-atlantis/1/default\"","json":{"repo":"qsnyder/devcor-atlantis","pull":"1"}}
This example creates a local artifact (the outfile from Terraform) that can then be applied via
communication in the pull request window, as shown in Figure 7-3.
Figure 7-4 Applying a Terraform Plan Within the PR Workflow Using Atlantis
Example 7-5 Atlantis Running the Terraform Apply Action Invoked from the PR
{"level":"info","ts":"2021-11-29T16:37:58.469-0700","caller":"events/events_control-
ler.go:417","msg":"parsed comment as command=\"apply\" verbose=false dir=\"\" work-
space=\"\" project=\"\" flags=\"\"","json":{}}
{"level":"info","ts":"2021-11-29T16:38:04.360-0700","caller":"events/apply_command_
runner.go:110","msg":"pull request mergeable status: true","json":{"repo":"qsnyder/
devcor-atlantis","pull":"1"}}
{"level":"info","ts":"2021-11-29T16:38:04.373-0700","caller":"events/project_com-
mand_context_builder.go:240","msg":"cannot determine which version to use from
terraform configuration, detected 0 possibilities.","json":{"repo":"qsnyder/
devcor-atlantis","pull":"1"}}
{"level":"info","ts":"2021-11-29T16:38:04.373-0700","caller":"runtime/
apply_step_runner.go:37","msg":"starting apply","json":{"repo":"qsnyder/
devcor-atlantis","pull":"1"}}
{"level":"info","ts":"2021-11-29T16:38:06.490-0700","caller":"terraform/terraform_
client.go:279","msg":"successfully ran \"/usr/local/bin/terraform apply -input=false
-no-color \\\"/Users/qsnyder/.atlantis/repos/qsnyder/devcor-atlantis/1/default/
default.tfplan\\\"\" in \"/Users/qsnyder/.atlantis/repos/qsnyder/devcor-atlantis/1/
default\"","json":{"repo":"qsnyder/devcor-atlantis","pull":"1"}}
{"level":"info","ts":"2021-11-29T16:38:06.490-0700","caller":"runtime/apply_step_
runner.go:56","msg":"apply successful, deleting planfile","json":{"repo":"qsnyder/
devcor-atlantis","pull":"1"}}
The result is a successful application within the PR window, as well as within the fabric, indi-
cating that a new tenant has been created per the Terraform configuration, with the resulting
creation similar to what is seen in Figures 7-5 and 7-6.
When the pull request is merged or closed, this lock is removed, and the directory can have additional changes made to it through the GitOps workflow. This is reflected in the final webhook output, shown in Example 7-6.
Example 7-6 Atlantis Removing the Configuration Lock Within the Repository
{"level":"info","ts":"2021-11-29T16:38:06.490-0700","caller":"runtime/apply_step_
runner.go:56","msg":"apply successful, deleting planfile","json":{"repo":"qsnyder/
devcor-atlantis","pull":"1"}}
{"level":"info","ts":"2021-11-29T16:43:17.865-0700","caller":"events/events_
controller.go:318","msg":"identified event as type \"closed\"","json":{}}
{"level":"info","ts":"2021-11-29T16:43:23.720-0700","caller":"events/events_
controller.go:360","msg":"deleted locks and workspace for repo qsnyder/devcor-
atlantis, pull 1","json":{}}
Flux works by watching a VCS repository for application deployment manifests to be cre-
ated, modified, or removed. When a manifest is added, Flux deploys the application to the
cluster based on the configuration within the manifest file. If any changes are made to the
manifest (or through the kubectl CLI), Flux replicates those changes within the deployment
of that application (and if a change is made through kubectl, Flux reflects that change by
modifying the manifest and storing it within the VCS). This workflow not only allows for a
source of truth for all applications deployed within a cluster but also ensures that the VCS
remains the source of truth even through changes outside of the repository.
Flux also handles two challenges that arise as a consequence of continuous deployment: han-
dling application version upgrades and providing a staged rollout of applications to produc-
tion. Flux can be configured to periodically scan container registries for updated versions of
applications deployed to the cluster. When a version is incremented in the registry, Flux can
automatically update the tracked deployment manifest within the VCS and then perform a
redeploy action of that app using the updated version. Flux can also interact with Flagger to
provide a staggered rollout of any new or updated applications to the cluster.
A typical workflow using Flux would be as follows (assuming Flux is installed and bootstrapped):
1. The cluster administrator configures Flux to watch a VCS repository. This becomes the
single source of truth repository for all applications deployed by Flux.
2. The cluster administrator creates a source manifest within the tracked repository. This
manifest provides information to Flux about the application’s source repository, which
branch the application should be pulled from, and some other K8s-specific informa-
tion (like namespaces).
3. The source manifest is then committed to the VCS repository.
4. Flux is then used to create a Kustomize deployment manifest, referencing the source
manifest created in the previous step.
5. The deployment manifest is then committed to the VCS repository.
6. Flux detects the changes within the VCS repository and initiates the installation of
the apps based on the deployment manifest, by pulling the source from the repository
indicated in the source manifest.
Flux has a guide to quickly get started deploying a simple application written in Go on top
of an existing Kubernetes cluster by using an application manifest (Kustomization). However,
this assumes that the user has an existing cluster. Additionally, if an application is deployed
using a Helm chart, some changes are required to recognize the alternative application
description.
It is possible to create a small Kubernetes cluster on a development workstation using KIND
(Kubernetes in Docker). Using KIND, you are able to test Flux without needing a production
Kubernetes cluster.
NOTE KIND is bound by the same limitations as Docker within the host workstation.
As such, for this example, the target system may have insufficient memory to fully deploy
the “sock shop” app. If there is a resource or other contention, the containers are staged for
deployment and the application is partially functional, but the deployment is functionally
the same as one in a production cluster.
After the binaries for KIND and Docker are installed on the target host, you can create a
KIND cluster. To expose the NodePort from the sock shop app, you need to use a custom
KIND configuration at runtime. Example 7-7 should be saved as config.yml.
Example 7-7 Custom KIND Configuration
apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
  - role: control-plane
    extraPortMappings:
      - containerPort: 30001
        hostPort: 30001
        listenAddress: "0.0.0.0"
        protocol: tcp
  - role: worker
  - role: worker
The cluster can then be created. This command downloads the KIND node containers and initializes them such that the cluster will be functional:
kind create cluster --config config.yml
With the cluster running, Flux can be bootstrapped against a GitHub repository (this assumes the flux CLI is installed and a GitHub personal access token is available in the environment):
flux bootstrap github \
  --owner=$GITHUB_USER \
  --repository=$GITHUB_REPO \
  --path=clusters/my-cluster \
  --personal
After Flux has been bootstrapped, clone the resulting directory to the local system. This
clone needn’t happen on the system in which Flux is running because the bootstrap process
adds an SSH key to the VCS account to ensure communication. After it is cloned, navigate
to the $GITHUB_REPO/clusters/my-cluster/ directory. Additionally, because the sock shop
app should be deployed within its own namespace, you need to create the namespace on the cluster by using kubectl:
kubectl create namespace sock-shop
Next, two manifests are created: a source manifest and a release manifest. These define the location of the repository as well as the path within the repository to find the required Helm chart to deploy the application. They can be given any arbitrary names, but, for ease of man-
agement, use the names sock-source.yml and sock-release.yml. Examples 7-8 and 7-9 pro-
vide examples for the source and release files, including the mapping of the sock shop app
Helm chart to a Flux release.
Example 7-8 sock-source.yml
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: sock-shop
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/microservices-demo/microservices-demo
  ref:
    branch: master
Example 7-9 sock-release.yml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: sock-shop
  namespace: sock-shop
spec:
  interval: 4m
  chart:
    spec:
      chart: ./deploy/kubernetes/helm-chart/
      sourceRef:
        kind: GitRepository
        name: sock-shop
        namespace: flux-system
      interval: 1m
  values:
    replicaCount: 1
After these files have been created, they can be committed and pushed to the repository
tracked by Flux. When you push them to the remote repository, Flux reads the configura-
tion and begins to reconcile the resources and then begins the deployment of the application
to the cluster:
git add -A && git commit -m "Pushing sock-shop config" && git push
After the files have been committed, the logs for Flux can be examined to ensure the appli-
cation will be deployed. By invoking flux logs, you can see the stages that Flux goes through
to deploy the application, the reconciliation of which is shown in Example 7-10.
Example 7-10 Flux Reconciling the Configuration Files Within the Repository
When the reconciliation is complete, the containers begin to deploy on the cluster. You can
examine the status using standard kubectl commands, and when the deployment is complete (or
as complete as possible given potential resource constraints on the end system), you can access
the sock shop app by opening a web browser to the KIND host on port 30001, seen in Figure 7-7.
Figure 7-7 The Sock Shop App Running on KIND Deployed via FluxCD
1. Clone source code from the VCS using Concurrent Versions System (CVS) or Subversion (SVN).
■ In the mid-2000s, options like Mercurial and Git appeared.
2. Through documentation or by working with developers, understand any required
headers, libraries, or packages for the software to be compiled.
■ Most applications were written in C or C++ and were required to be compiled
before execution.
■ Java code came with its own dependencies, often requiring both compilation as well
as a JVM to process the runtime.
3. Compile the code on the machine, either using the language toolchain or using a
Makefile within the project repository.
■ This process may be iterative, depending on how exacting the dependencies are.
4. Execute the resulting binary executable file, ensuring runtime errors do not occur.
5. Validate the application function through manual testing, especially if required ser-
vices/connections exist on other servers.
6. Repeat for all devices that require this application.
A sample Makefile is shown in Example 7-11. This example builds an application called edit,
which depends on several other source files to be built prior to the main application build.
When you invoke the top-level item, it subsequently builds all of the other dependent items
as required.
Example 7-11 GNU Makefile
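The Makefile listing itself is not reproduced in this extract. A minimal sketch in the spirit of the GNU Make manual's simple Makefile (see the "References" section at the end of this chapter) conveys the idea; the source and header file names are illustrative only, and recipe lines must begin with a tab character:

# Top-level target: link the edit binary from its dependent object files
edit : main.o kbd.o command.o
	cc -o edit main.o kbd.o command.o

# Each object file depends on its source file and shared headers
main.o : main.c defs.h
	cc -c main.c
kbd.o : kbd.c defs.h command.h
	cc -c kbd.c
command.o : command.c defs.h command.h
	cc -c command.c

# "Phony" target: not a file, just a shell action to remove build artifacts
clean :
	rm -f edit *.o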
You start the build process by invoking make. This step compiles the required output objects
from source using the C code and header files. After each of the dependent output files
has been compiled, the main application (edit) can be built. The final target (clean) is just a
“phony” target that, when invoked, performs an action defined by the bash script following
the target name.
You can easily see how fragile this process could be, especially given the differences in
installed headers and compilers, requiring system administrators to keep close track of each
system. When the sysadmin team grew beyond a few people, this logging and tracking of
information became nearly impossible, because the time and effort required to document
and share information were not available given the demands of keeping the servers and soft-
ware functional for the end users.
This process was made marginally easier with Linux packages (.deb, .rpm, etc.) and package
management front ends (such as Debian’s apt and Yellow Dog’s yum) for packaged software
and dependencies (though circular-dependency references can make for a nightmarish expe-
rience). Any software needing to be compiled on the system still went through the same
progression.
It’s not difficult to see the burden this process placed on operations teams, who were respon-
sible for moving code into production, ensuring the proper function of the overall system/
application stack, and ensuring uptime and reliability during patching/bugfix/maintenance
events. Often, operations and systems personnel would create shell scripts to help automate
the process. Although this effort helped speed the time to deployment, these scripts could
be fragile, depending on configuration standardization across server fleets, both in base
operating system and installed packages. This fragility eventually became less and less of a
problem because many enterprises moved to Linux on x86 platforms, ultimately enabling the
next leap in application deployment automation.
With a configuration management tool such as Ansible, the same deployment looks more like the following:
1. A flat file is written in the tooling's DSL, defining the end state of the device.
■ Includes dependencies and versions
■ Contains edits to any configuration files (either edited or pulled from remote loca-
tions into proper directories)
2. The tool is executed using the configuration from the flat file.
3. Results of individual steps are reported back to the control station.
4. When complete, the server and application end in a valid state.
While this process does not detail the time that is spent to create the DSL flat file, the pro-
cess is much simpler to execute (and to repeat across tens or even hundreds of servers). There
are several added benefits to this approach as well, some of which align to DevOps (or SRE)
principles:
A sample playbook that performs an installation of Python through the operating system’s
package management system (yum), clones code from several SCM repositories, creates a
Python virtual environment, and then installs the Python packages within the virtual envi-
ronment is shown in Example 7-12.
Example 7-12 Ansible Playbook Driving Device Automation
---
- hosts: localhost
  gather_facts: no
  vars:
    packages:
      - python36
      - python36-devel
      - vim
    repositories:
      - { org: 'qsnyder', repo: 'arya' }
      - { org: 'qsnyder', repo: 'webarya' }
      - { org: 'datacenter', repo: 'acitoolkit' }
      - { org: 'CiscoDevNet', repo: 'pydme' }
    cobra_eggs:
      - { url: 'https://d1nmyq4gcgsfi5.cloudfront.net/fileMedia/1f3d41ce-d154-44e3-74c1-d6cf3b525eaa/', file: 'acicobra-4.2_3h-py2.py3-none-any.whl' }
      - { url: 'https://d1nmyq4gcgsfi5.cloudfront.net/fileMedia/b3b69aa3-891b-41ff-46db-a73b4b215860/', file: 'acimodel-4.2_3h-py2.py3-none-any.whl' }
    virtualenv: py3venv
    dev_folder: ~/aci
  tasks:
    - name: Create ACI code repository
      file:
        path: "{{ dev_folder }}"
        state: directory
        mode: '0755'
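The remaining tasks of Example 7-12 are not reproduced in this extract. Assuming the standard Ansible yum, git, and pip modules act on the variables defined above, the rest of the play might look something like this sketch (indentation continues under tasks:):

    - name: Install OS packages via yum
      yum:
        name: "{{ packages }}"
        state: present

    - name: Clone each SCM repository into the development folder
      git:
        repo: "https://github.com/{{ item.org }}/{{ item.repo }}.git"
        dest: "{{ dev_folder }}/{{ item.repo }}"
      loop: "{{ repositories }}"

    - name: Install the Cobra SDK wheel files into a new virtual environment
      pip:
        name: "{{ item.url }}{{ item.file }}"
        virtualenv: "{{ dev_folder }}/{{ virtualenv }}"
        virtualenv_command: /usr/bin/python3.6 -m venv
      loop: "{{ cobra_eggs }}"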
Configuration management applications helped drive the speed and agility that the busi-
ness drivers demanded from the infrastructure, but there was a finite limit on how elastic the
infrastructure could be. Even with infrastructure provisioning and application rollout auto-
mated, if there were more applications required than servers that could host them, purchas-
ing cycles would remove whatever time and energy savings were realized.
As the latter half of the 2010s arrived, elasticity was no longer a “nice to have,” but a require-
ment for many organizations, both startups looking for capital-light ways to scale and large
enterprises finding ways to burst periods of high demand without having to have idle infra-
structure. The problem of elasticity was solved through cloud computing, but this meant that
as cloud resources spun up to meet demand (which could be only a short duration), provi-
sioning routines would need to be run, causing delays in ability to use the new resources.
Elasticity was not the only concept to experience a renaissance; the concept of ephemeral
infrastructure also experienced a significant shift. As ideas like containerization became
popular, along with management and orchestration platforms like Kubernetes, the idea of
deployments requiring specific requirements of the infrastructure started to disappear. With
containers, the entire app and its dependencies could be packaged in a single file that could be
shipped to any host, which could run the container if it had the runtime installed. This model
greatly simplified individual host management because all requirements would be packaged in
the container at build (inside of the CI/CD pipeline) automatically. However, this new model of
application required new models of deployment options for utmost speed and elasticity.
GKE provides two different levels of administration requirements to the end user, standard
and autopilot, providing an additional layer of abstraction to the end resources. In autopilot,
everything from the size and number of compute nodes to the underlying node operat-
ing system, all the way down to the security measures at boot time, is handled by Google.
This comes at a cost of support for things like GPUs, Calico network overlays, or Istio
service-mesh, and even support for some of GCP’s other products, but serves as a way for
administrators to focus strictly on the K8s API and the outcomes delivered through their
microservices applications.
Access to GKE (and GCP in general) is provided either through a web UI, a web-based con-
sole, or locally installed console tools. API access is also supported, allowing for the use
of automation and orchestration tools (like Ansible or Terraform) to provision and manage
cloud infrastructure. As an example of using GKE, you can deploy the microservice demo
sock shop app to GKE using the cloud console.
Deploying an application to GKE is similar to any Kubernetes application deployment. The
majority of the configuration steps required are GCP-specific, ensuring that the specific
region and zone are set, as well as scaling the cluster size. To deploy the sock-shop applica-
tion, you need to create the GKE cluster:
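The exact commands are not reproduced in this extract; a minimal sketch using the gcloud and kubectl CLIs follows. The cluster name, zone, node count, firewall rule name, and manifest path are illustrative assumptions:

# Create a three-node GKE cluster and fetch credentials for kubectl
gcloud container clusters create sock-shop --zone us-west1-a --num-nodes 3
gcloud container clusters get-credentials sock-shop --zone us-west1-a

# Allow the NodePort used by the sock shop front end through the GCP firewall
gcloud compute firewall-rules create sock-shop-nodeport --allow tcp:30001

# Deploy the sock shop manifests from the microservices-demo repository
kubectl apply -f deploy/kubernetes/complete-demo.yaml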
Example 7-13 Determining External IP Addresses of GKE Cluster Running the Sock-shop Application
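The output of Example 7-13 is not reproduced here; it is typically gathered with a command along these lines, which lists each node's external IP address in the EXTERNAL-IP column:

kubectl get nodes -o wide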
Navigating to one of the external IP addresses shown in Example 7-13 on port 30001 brings up the sock-shop sample application, shown in Figure 7-8.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ecs-tasks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
NOTE Be sure not to change the version value; this is not a date value but references a spe-
cific revision of IAM policy that can be applied within the AWS IAM universe.
You then can create the IAM policy and attach it to the account:
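The commands themselves are not reproduced in this extract. Assuming the trust policy above is saved as task-execution-assume-role.json (a hypothetical filename), the role can be created and the AWS-managed execution policy attached with the AWS CLI, roughly as follows:

# Create the task execution role using the trust policy shown above
aws iam create-role --role-name ecsTaskExecutionRole \
    --assume-role-policy-document file://task-execution-assume-role.json

# Attach the AWS-managed policy that grants ECS task execution permissions
aws iam attach-role-policy --role-name ecsTaskExecutionRole \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy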
NOTE You need to perform this task only once—the first time you bring up the ECS cluster.
After this is done the first time, the IAM policy is persistent and tied to the account in use.
NOTE The example references the use of us-west-2 as the compute region for the ECS
cluster. You might need to change this depending on the desired AWS compute region.
After the IAM role has been attached, the ECS parameters can be defined. You can do this
through the CloudShell using ecs-cli (as shown in the following output) but can also define it
within the ~/.ecs/ directory of the CloudShell instance using YAML definitions:
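The configuration commands are not reproduced in this extract; with ecs-cli they look roughly like the following sketch. The cluster and profile names are illustrative, keyed to the DEVCOR cluster name seen in later output:

# Define the cluster configuration (name, launch type, and region)
ecs-cli configure --cluster DEVCOR --default-launch-type FARGATE \
    --config-name DEVCOR --region us-west-2

# Store the credentials used by ecs-cli under a named profile
ecs-cli configure profile --profile-name DEVCOR-profile \
    --access-key $AWS_ACCESS_KEY_ID --secret-key $AWS_SECRET_ACCESS_KEY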
This code creates a configuration profile for the ECS cluster. ECS supports multiple pro-
files, which can be referenced at cluster launch time. However, because only a single profile
is defined, this profile is used by default. You can create the AWS_ACCESS_KEY_ID and
AWS_SECRET_ACCESS_KEY by navigating to the IAM settings window, clicking Users, the
desired username, and then the Security Credentials tab.
NOTE You need to apply this configuration only once. After it is applied, the ecs-cli
parameters are translated to a YAML file in ~/.ecs/. The credentials file is translated to
another YAML file in the same directory.
Now that the cluster parameters and keys are stored, the cluster can be created and an appli-
cation deployed to the cluster. In this example, a Jupyter notebook container is deployed
and exposed to the public Internet. The following output shows the result of bringing up the
ECS cluster using the ecs-cli command. As the command runs, information about the cre-
ated VPC and subnets is displayed on the CLI.
ecs-cli up
....
Example 7-15 Gathering the Security Group Information for the Deployed Cluster
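The output of Example 7-15 (abbreviated in this extract) is typically gathered with a describe-security-groups call; a sketch, filtering on the VPC ID reported earlier by ecs-cli up, follows:

aws ec2 describe-security-groups \
    --filters Name=vpc-id,Values=vpc-0f87e8258f2e01577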
"CidrIp": "0.0.0.0/0"
}
],
"Ipv6Ranges": [],
"PrefixListIds": [],
"UserIdGroupPairs": []
}
],
"VpcId": "vpc-0f87e8258f2e01577"
}
]
}
Now that the security group ID value is known, a rule allowing ingress traffic to the ECS
cluster can be added. By default, the Jupyter notebook container has TCP/8888 exposed,
which needs to be allowed through the ACL applied. Ensure that the value of GroupId is
placed in the appropriate location within the command illustrated in Example 7-16.
Example 7-16 Allowing Inbound Traffic to the ECS Container
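The command in Example 7-16 is not reproduced in this extract; a sketch using the AWS CLI (substitute the GroupId value gathered above) looks like this:

aws ec2 authorize-security-group-ingress \
    --group-id SECURITY_GROUP_ID \
    --protocol tcp --port 8888 --cidr 0.0.0.0/0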
The infrastructure has been prepared to allow connections and traffic to the cluster. All
that remains is to deploy the application. Deployment requires two files to be created: one
defines the parameters for the task execution on ECS, and the other is the docker-compose.
yml file (shown in Example 7-18), which defines the application to be deployed. You need
to create them using CloudShell. Additionally, within the ecs-params.yml file (an example of
which is shown in Example 7-17), the subnets and security groups need to be filled in with
the values obtained when the cluster is created.
NOTE Filenames should follow the exact naming of ecs-params.yml and docker-compose.
yml because ECS looks to these files by default when instantiating an application. It is pos-
sible to reference others using command-line switches, but they are omitted for simplicity.
Ensure the files end in .yml.
Within the docker-compose.yml file, note the volume mapping. As ECS abstracts the under-
lying infrastructure from the end user, a host mapping is not required. Defining the volume is
sufficient for ECS to map a data storage volume to the container.
Example 7-17 ecs-params.yml
version: 1
task_definition:
  task_execution_role: ecsTaskExecutionRole
  ecs_network_mode: awsvpc
  os_family: Linux
  task_size:
    mem_limit: 8192
    cpu_limit: 2048
run_params:
  network_configuration:
    awsvpc_configuration:
      subnets:
        - "SUBNET_1"
        - "SUBNET_2"
      security_groups:
        - "SECURITY_GROUP_ID"
      assign_public_ip: ENABLED
version: "3"
services:
minimal-notebook:
image: jupyter/minimal-notebook
environment:
- NB_USER=USERNAME
- PASSWORD=cisco12345
- JUPYTER_TOKEN=cisco12345
volumes:
- work
ports:
- "8888:8888"
container_name: minimal-notebook-container
After creating the two files, you can now deploy the application by using ecs-cli. The output
of running the ecs-cli command should look similar to the output shown in Example 7-19.
Example 7-19 Instantiating the Application on the ECS Cluster
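The command behind Example 7-19 is not reproduced here; with the two files above in the working directory, it is roughly the following sketch (the project name is an illustrative assumption):

ecs-cli compose --project-name jupyter service up \
    --create-log-groups --cluster-config DEVCOR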
After the service has been created in ECS, you are able to gather the exposed IP address of
the application on the cluster:
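A sketch of how this is typically done follows; the abbreviated output shown next is consistent with this command:

ecs-cli ps --cluster-config DEVCOR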
...
Name                                                       State    Ports                         TaskDefinition     Health
DEVCOR/60ab7a0ad7ee4a69a95d3caef79b409f/minimal-notebook   RUNNING  54.190.18.166:8888->8888/tcp  cloudshell-user:8  UNKNOWN
Finally, after you open a web browser and enter the IP address and port information, a window appears with the Jupyter notebook login; the password is the one defined within the docker-compose.yml file. A Jupyter notebook is now running on serverless containers within
AWS. Figure 7-10 shows the login for the notebook, password protected using credentials in
the docker-compose.yml file, and Figure 7-11 illustrates the mapped data folder for gener-
ated notebook files.
Example 7-20 Stopping the Service and Destroying the ECS Cluster
ecs-cli down
...
Are you sure you want to delete your cluster? [y/N]
y
INFO[0002] Waiting for your cluster resources to be deleted...
INFO[0002] Cloudformation stack status stackStatus=DELETE_IN_PROGRESS
INFO[0063] Cloudformation stack status stackStatus=DELETE_IN_PROGRESS
INFO[0093] Deleted cluster cluster=DEVCOR
With AWS Lambda, developers can supply code to the function using a variety of methods, including copy/paste, API, or some pipeline within AWS that adds the code from a VCS code repository. This code can then be invoked through a time-based trigger, through an API gateway provided by AWS, or through a larger chain of events triggered through a messaging bus or service (such as Kafka).
Both ECS on Fargate and Lambda are referred to as serverless. The common question “if it’s
serverless, how does it run?” refers to the idea that there must be a server underneath the
abstraction that is performing the desired action. This misses the point because serverless is
not a technology that is devoid of servers themselves but references the removal of the man-
agement aspect of the compute infrastructure and applications/packages required to support
running the desired application. These application operating models enable developers to
focus on the code and application (and the intended outcomes of it) rather than the opera-
tions and maintenance aspects that were required in decades past.
All the abstraction and removal of infrastructure comes at a cost, however. By removing
the requirement for management of the servers, operating systems, and supporting applica-
tions, cloud providers also remove the level of customization and flexibility afforded to the
end user. Although many common use cases are covered, if an application or enterprise has
requirements outside of what is supported, then it must be deployed on products that pro-
vide a similar look and feel to on-premises infrastructure, as well as similar overhead of man-
agement, patching, and dependencies. As a result, organizations sometimes standardize on
what is possible within their development process in order to ensure they comply with the
requirements of the serverless platform to which they deploy.
Deploying a function to Lambda is slightly different from the other methods discussed thus
far; while other methods rely on external packages, manifests, or charts to run the applica-
tion, Lambda is completely bespoke and requires the end user to build the environment that
can be run on the Lambda platform. This means that if a developer builds a Python function
(making Lambda a Function as a Service [FaaS] platform), the developer would build and
test their code locally and then deploy that exact environment as a layer within the Lambda
platform.
This effort can easily be accomplished by packaging up the Python virtualenv in use as a ZIP
file. This requires navigation to the site-packages folder of the virtualenv and invoking the
zip command:
sounding-data » cd venv/lib/python3.8/site-packages/
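The zip invocation is not shown in this extract; a common pattern (the archive name function.zip is an assumption) is to recursively zip the contents of site-packages into an archive at the project root:

site-packages » zip -r9 ../../../../function.zip .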
After the function is created, a new window appears with a code editor. First, the code needs
to be added to the function. You can do this by creating a ZIP archive of the code, which
replaces what is currently within the editor:
site-packages » cd ../../../../
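From the project root, the function code itself can then be appended to the same archive and uploaded, either through the console's upload option or, as sketched here, with the AWS CLI; the function name and handler file name are assumptions:

sounding-data » zip -g function.zip lambda_function.py
sounding-data » aws lambda update-function-code \
    --function-name sounding-data --zip-file fileb://function.zip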
If an application does not handle concerns such as configuration, state, and logging in a graceful manner, the ability for any team (no matter how DevOps-y) to operate, maintain, and upgrade the application under the demands of the app economy is significantly hampered.
To create a baseline of application development best practices, a team at Heroku (a Platform
as a Service, or PaaS, that supports application development and serverless hosting) created
the notion of a 12-factor application design and development methodology and released it
to the public around 2011. Although not entirely canonical across every development team
or platform, the 12 factors create a baseline to ensure an application is not the weakest link
in DevOps or SRE principles—namely, resilience, uptime, and an ability to run anywhere to
meet those needs. It is also important to realize that, although some of these factors are self-
evident today, in the era of development in which these factors were released, practices and
methodologies were much less rigid and prescriptive. For a deeper dive into the 12-factor
app, refer to The 12-Factor App (see the “References” section at the end of this chapter).
Factor 1: Codebase
Factor 1 is self-explanatory: everything must be in a VCS code repository (the original book
mentions others, but the de facto standard is Git-based). In addition, that codebase should be
the single source of truth for all deployments, whether in development, test, or production.
This is not to say that feature or bugfix branches within the codebase can’t occur or that
development, test, and production can’t be running at different points within the branches,
but that all branching, edits, and deployments should be done from the same repository.
Factor 2: Dependencies
Within the codebase, every application should explicitly define the required supporting
packages to be able to run. In fact, if you have downloaded a Python application recently,
you may find that it includes a requirements.txt file that can install all required dependencies
using pip (Ruby, Perl, and others also have similar mechanisms). In addition, these dependencies must be able to be installed in a self-contained environment, like a Python virtualenv, to
ensure that any required system dependencies do not interfere with what is required for the
application to run. Both aspects make up the second factor and should be considered basic
hygiene for all written applications.
Additionally, all required dependencies should be explicitly declared as part of the codebase
to the point where there are no requirements of any system-installed applications; the code
should be able to be self-contained and functional through the explicit dependencies and the
language isolation mechanism.
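As a brief sketch of this factor in practice, the common Python pattern combines a virtual environment with an explicit requirements.txt file (file and directory names here are illustrative):

# Create an isolated environment so system packages do not interfere
python3 -m venv venv
source venv/bin/activate

# Install only the explicitly declared dependencies from the codebase
pip install -r requirements.txt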
Factor 3: Config
Limits of duplicate infrastructure exist within every organization. Under ideal circumstances,
it would be very beneficial for app developers to have the identical sets of IP addresses, cre-
dentials, or resources existing in development, test, and production environments. In the real
world, this is highly impractical, especially because limitations on duplication are imposed on the app by externalities, like the network, compute resources, cloud applications, and the like.
To overcome this, external constants (also known as the app's "config") should be abstracted away from the code itself and either imported at runtime through environment variables or retrieved from a secret store with an external API, like HashiCorp Vault. This has a twofold side effect in that (a) the code can be released (either intentionally or not) to the public without
compromise of sensitive information and (b) the same code can be used across development,
test, and production environments with a simple change of the configuration, while the code
is still the same.
Factor 6: Processes
Applications designed within the 12-factor process are meant to be stateless in nature. This
(loosely) means that any transaction that occurs between one or more parts of the applica-
tion is completely independent from another transaction. More importantly, this transaction
is not shared between different components of the app unless there is a specific request
that is invoked between components, which in and of itself is stateless. If any transaction or
record is required, it must be stored in a database or object store, because any other compo-
nent is not designed to store this state.
If you are familiar with microservices, or even have read the example provided by nginx,
this concept should be familiar. Because you should be able to restart, upgrade, or scale
each component of the application independently of the other components, nothing should
Factor 8: Concurrency
In much the same way that port binding combines multiple aspects, concurrency is enabled
through the adherence of other previous factors. By ensuring that applications handle exter-
nalities as resources rather than built-in components, referenced by individual configuration
files, the applications become stateless in nature. This stateless nature enables the individual
components of a larger application (web, application, and database, to use a common three-tier app as an example) to be scaled independently according to need at the time. These different components should be able to be scaled in a nondaemonized fashion (that is, separate processes for each invocation, rather than using a global parent process with unique children), leveraging tools like systemd for process management, which is responsible for the lifecycle of the process itself (restarts, crashes, and so on). The concurrent management of disparate numbers of the application's processes should be transparent to both the operations team supporting the
application, as well as the end user.
Factor 9: Disposability
Disposability is a concept extended from the stateless nature of the application itself. If the
components that make up the application hold no data or state, it becomes trivial to scale those components upward (due to load) or remove them (due to a lull). The application should
gracefully handle the removal of the scaled services, through a standard SIGTERM on the
process managed by the process manager, assuming no work is destined for that scaled com-
ponent. Disposability also states that if some scaled worker process no longer responds or
times out, it should be removed from the worker queue (and ideally some queueing process
should requeue the work for a node that is alive).
The weight of production-grade backing services may cause the developer to look for lightweight applications to use in substitution. While
completely well intentioned, the use of a single system to mock entire environments not only
exposes production to issues due to different application behavior but also makes trouble-
shooting an issue harder because there is no replicated environment to understand the code’s
steady-state operation.
Parity should also exist in two other areas that will seem familiar to principles within
DevOps methodology. The first is that there should be as little a gap as possible between
code being committed to the main repository and it being moved into a deployed state
(which could be a test environment; the code needn’t move directly into production). The
complete code parity aligns directly with the automated integration, build, and deploy
process advocated by DevOps and SREs. The second parity involves the folks tasked with
supporting the application as it has moved into a deployment stage; namely, those who built
the application should be the same people tasked with supporting the deployment when it is
live. While the 12-factor app is neither DevOps nor SRE, the concepts in this factor strongly
overlap.
which the application runs. Finally, any external or required configuration should be explic-
itly defined and stored outside of the administrative process functions to ensure consistency
and function across environments. In short, you should treat any administration processes
with the same discipline and rigor that you would when designing and developing the main
application.
Summary
This chapter focuses on the guiding principles, cultures, and methodologies of modern
application development teams. It provides some insight into how applications move from
repositories of code to compiled applications and binaries to being deployed on infrastruc-
ture and examples of leveraging the capabilities of such pipelines for both infrastructure and
applications. These pipelines are contrasted with some of the challenges that operations staff
faced in the early days of application deployment and support as well as how modern opera-
tions staff can be removed from supporting the infrastructure. Finally, the chapter discusses
the design principles that can serve as a baseline for the development of a modern web-based
or SaaS application.
DevOps, site reliability engineering (SRE), continuous integration (CI), continuous delivery
(CD), continuous deployment (CD), version control system (VCS), source code manager
(SCM), Infrastructure as Code (IaC), serverless, Kubernetes
References
https://cloud.google.com/blog/products/devops-sre/how-sre-teams-are-organized-and-how-to-get-started
https://sre.google/sre-book/introduction/
https://www.runatlantis.io/guide/testing-locally.html
https://kind.sigs.k8s.io/docs/user/quick-start/
https://www.tiobe.com/tiobe-index/
https://www.gnu.org/software/make/manual/html_node/Simple-Makefile.html
https://microservices-demo.github.io/docs/
https://12factor.net
■ Storing IT Secrets: In this section, you learn about the various types of “secrets” or
credentials used to authenticate and authorize any communication or transaction.
Passwords are a common example, but others also are described in this section.
■ Public Key Infrastructure (PKI): This section covers PKI, which is built on asymmetric cryptography and requires the generation of two keys. One key is secure
and known only to its owner; it’s the private key. The other key, called the public key,
is available and known to anyone or anything that wishes to communicate with the pri-
vate key owner.
■ Securing Web and Mobile Applications: This section provides a detailed look at the
Open Web Application Security Project (OWASP) application security verification
system and discusses a few common attacks and how to prevent them.
■ OAuth Authorization Framework: This section discusses how OAuth 2.0 works and
the most common flows for authorization.
This chapter maps to the first part of the Developing Applications Using Cisco Core Plat-
forms and APIs v1.0 (350-901) Exam Blueprint Section 4.0, “Application Deployment and
Security,” specifically subsections 4.9, 4.10, and 4.11.
This chapter focuses on various security aspects in application design and execution. As
applications become more distributed, business data is forced to cross multiple boundar-
ies outside the control of enterprise security systems. Throughout the chapter, we cover
a few issues and scenarios related to designing and building applications with security in
mind. In addition to protecting assets and your business reputation, security has also become an important focus area for regulations and compliance. There has been a rise
in multiple regulations and frameworks related to data privacy and sovereignty, such as the
General Data Protection Regulation (GDPR) in the European Union and the California Con-
sumer Protection Act (CCPA), a US-based state regulation example.
In the following sections, we run through a number of examples about security design in
application development.
5. GDPR is a data protection law in the EU; what does it stand for?
a. General Data Public Relations
b. General Data Protection Regulation
c. Generic Decision Probability Router
d. None of these answers are correct.
6. IT secrets are all the following except _________.
a. Passwords
b. API keys
c. Account credentials
d. Vehicle identification number (VIN)
7. What is a certificate authority’s main function?
a. Certifying authorities in data communication
b. Issuing and signing certificates
c. Certifying account security
d. Storing personal passwords
8. Which of the following is considered to be an injection attack? (Choose all that apply.)
a. SQL injection
b. LDAP
c. Operating system commands
d. All of these answers are correct.
9. Which of the following is not a cryptographic attack?
a. Brute-force attack
b. Implementation attack
c. Statistical attack
d. PoH attack
10. What is the difference between a two-legged authorization flow and a four-legged
one?
a. The two-legged one utilizes an authorization server.
b. The four-legged one takes longer to execute.
c. There is no such thing as a four-legged authorization.
d. They are identical.
Foundation Topics
Information system security can be summed up in three fundamental components: confiden-
tiality, integrity, and availability. This is sometimes referred to as the CIA triad, as shown in
Figure 8-1.
Figure 8-1 Information System Security Triad
Simply put:
■ Authentication
■ Authorization
■ Auditing
■ Anomaly detection
Protecting Privacy
Privacy is one of the hottest issues in security today. The emergence of the digital economy,
e-commerce, and social media continues to put privacy issues at the forefront. Eventu-
ally, local (or global) governments and standard bodies had to step in and bring awareness
and regulations to protect individuals. But first they had to define what constitutes private
information.
■ Name(s)
■ Address
■ Photographs
■ The list goes on and could possibly depend on the context in which the data is being
used.
PII is everywhere, is used in almost every transaction, and must be protected. The applications you're writing must protect PII and must pass periodic audits.
Data States
To protect data, you must first know the state it is in. Data is either in motion, at rest, or in
use:
■ Data in motion: The data is in transit or traveling between two nodes or across the
network. Data needs to move between nodes and applications to create transactions.
For example, data needs to move between a point-of-sale (POS) system and a credit card processing service. Transport Layer Security (TLS) is the most common encryption protocol and is considered to be strong enough for web traffic. Commonly, web data is transmitted using TLS (the successor to the Secure Sockets Layer, or SSL), typically version 1.2 or later. TLS is used for ensuring
integrity and confidentiality over a network. It uses symmetric cryptography using a
shared key negotiated at the initialization of a session. It is also possible to use TLS
for authentication using public key–based authentication involving digital certificates.
■ Data at rest: When data is not being transmitted, then it is considered at rest. When
data is stored on a hard drive, tape, or any other media type, then it is at rest. A great
deal of sensitive data is at rest. Examples include password files, databases, and back-
up data. In today’s distributed system web applications, the definition is widened to
include data within personal hard drives, network-attached storage (NAS), storage-area
networks (SANs), and cloud-based storage. Similar to data in motion, encryption plays
a big role in protecting data at rest. Encryption can be applied to the entire disk or to
individual files. Encrypting the entire disk may tax performance, but it’s a small price
to pay to secure sensitive data.
■ Data in use: Some security and privacy standards define this third state. When data
is being processed, updated, or generated, then it is considered to be in use. You
could argue that when data is in the in-use state, it is actually sitting in memory or
“swap” space somewhere and could be considered “at rest.” You would not be wrong.
Whether data is at rest or in motion, protecting the data with access control, encryption, or
other means is what we discuss in the next few sections.
■ Data sovereignty: Identifies who has power over the data. If data resides in the US, it is subject to US laws, regardless of what entity or government owns it.
■ Data localization: Specifies where the data should be located. For example, all data
related to EU citizens must be located in the EU, even if the entity in possession of
the data is a US-based company.
The following are laws or regulations that you may see or hear about:
■ Health Insurance Portability and Accountability Act (HIPAA): This US federal law
creates national standards for protecting patient data. It controls the use and disclosure
of people’s health information. HIPAA not only is for all US health-care businesses
that have access to protected health information (that is, hospitals, private doctors,
health insurance companies) but also covers business associates that provide services
to those covered entities that process protected health information (PHI) on their
behalf (including cloud service providers hosting health-care applications).
■ Sarbanes-Oxley Act of 2002 (SOX): SOX might not seem to be a “privacy” law ini-
tially, but it does include some privacy clauses in it. It’s designed to protect investors
by improving the accuracy and reliability of corporate reporting.
■ Payment Card Industry Data Security Standard (PCI DSS): This standard aims to
secure credit card data and transactions against theft or fraud.
GDPR is the toughest of these examples and is much broader in scope than the other three;
it includes security, privacy, and data localization elements.
Storing IT Secrets
To move at the speed of business, today’s enterprises utilize commercial off-the-shelf
(COTS), home-grown, and open-source software to build applications that automate their
business and serve their customers. There are no magic applications; applications need to
interact with other applications and do so using credentials. These credentials are called
secrets; in an IT context, they’re called IT secrets, and are used as “keys” to unlock pro-
tected applications or application data.
The following examples are secrets exchanged between applications:
■ Passwords
■ API keys
■ Account credentials
■ Encryption keys
■ Securing CI/CD pipelines and tools (for example, Jenkins, Ansible, Puppet)
■ Securing containers
■ Improving portability
Managing and storing secrets can be easy or can be difficult. That ease or difficulty depends
on the scope of the applications and the boundaries that the application interaction must
cross. You can store secrets in the source code, and it can be seen by everyone who reviews
your code (this approach is not recommended). Secrets can also be stored directly in the
code but encrypted. You can also use an external secret management service.
When planning to protect application secrets, you should consider the following:
■ Make this a design decision. Take all applications (centralized or distributed), net-
works, and cloud interactions into consideration.
■ Update secrets before moving the application into production. Do not use the same
set of secrets for development, QA, and production.
■ Use multifactor authentication when possible and for critical information retrieval.
■ Using single sign-on is recommended whenever possible and whenever the capabilities
are available and simple to apply.
■ Embedded into the code: This strategy is not the best because anyone with access to
the code will have access to the secrets. You can change secrets after code reviews and
audits, if possible, but again, this is not a safe way. This approach is not recommended
and is shown in the following example:
# MyApp_example.py
MyApp_API_KEY = 'lkajdfuishglkjg'  # Hello World, look at my password
On the other hand, it can be used with encryption if you manage to store and use the
key safely.
■ In the environment: This strategy is safer than the method before it. It has more con-
trol as to who has access to secrets. However, it is also prone to user misconfigurations
because you have to update environments on every server. Automated orchestration of
passwords may not be a bad idea here:
# MyApp_example.py
MyApp_API_KEY = os.environ["MyApp_API_Key"]
# MyApp_example.py
db = MySQLdb.connect("192.168.100.2", "username", "mypassword", "MYDB")
■ At a syncing service: This strategy provides a centralized way for managing secrets.
You manage a single password to manage a service that manages all your secrets; how-
ever, all passwords and API keys need to be hosted with that service. Understanding
your application’s landscape and dependencies is essential at this stage.
The syncing service can also be hosted in the cloud as SaaS—for example, Amazon’s
AWS Key Management Service (KMS). Similarly, Microsoft Azure offers Key Vault as an alternative. The advantages are clear:
HashiCorp Vault is widely used; it is a free and open-source product that can also
come with an enterprise offer.
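As a brief illustration of the syncing-service approach, the following Python sketch reads a secret from HashiCorp Vault using the hvac client library; the Vault address, token, and secret path are assumptions, and in practice the token itself must also be protected:

import hvac

# Connect to Vault; the token should come from a secure source, not the code
client = hvac.Client(url="https://vault.example.com:8200", token="s.exampletoken")

# Read the KV version 2 secret stored at the (hypothetical) path "myapp"
secret = client.secrets.kv.v2.read_secret_version(path="myapp")

# Use the value at runtime instead of embedding it in the codebase
myapp_api_key = secret["data"]["data"]["MyApp_API_Key"]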
A simple search for the top certificate authorities (CAs) revealed multiple "top 10" results. The following names are common to the majority of the lists:
1. Symantec
2. GeoTrust
3. Comodo
4. DigiCert
5. Thawte
6. GoDaddy
7. Network Solutions
8. RapidSSLonline
9. SSL.com
10. Entrust Datacard
As you build your security architecture, it is important to know your CA landscape. In the
majority of cases, your organization has probably made the choice for you, so you don’t
have to know much above and beyond which CA you should direct your traffic to initially.
It is also worth mentioning that it is not uncommon to see private (organization-specific)
implementations of PKI.
Figure 8-2 demonstrates what’s referred to as the enrollment and verification process and
how a user requests and receives a CA’s identity certificate.
Figure 8-2 User Request and Verification
The steps are as follows:
1. Register with the CA or the registration authority (RA) (this can be a physical meeting
or other means of identity verification).
2. Request a certificate from the CA admin.
3. Independently verify the identity of the user and that the certificate planned for issue
is not on the certificate revocation list.
4. The CA admin issues and signs a certificate.
Digital certificates (or simply certificates) are identities verified and issued by the CA. The
format of the certificates is defined by the International Telecommunication Union (ITU) in
their X.509 standard, and it includes:
■ Version
■ Serial number
■ Signature (algorithm ID used to sign the certificate)
■ Name of issuer
■ Subject’s unique ID
Certificate Revocation
As you saw in the preceding section, certificates have a “validity” period or an expiration
date. When certificates expire, come into doubt, or are suspected of foul play, they get revoked. Revoked certificates and their serial numbers are added to a certificate revoca-
tion list (CRL). Figure 8-3 shows a high-level illustration of the revocation process and the
storing of the CRL at the CA server.
Certificate revocation is a centralized function, providing push and pull methods to obtain a
list of revoked certificates periodically or on demand.
1. The user attempts to connect to a security training site (see Figure 8-6). When the
connection to the server is first initialized, the server provides its PKI certificate to the
client. This contains the public key of the server and is signed with the private key of
the CA that the owner of the server has used.
2. The signature is subsequently verified to confirm that the PKI certificate is trustwor-
thy. If the signature can be traced back to a public key that already is known to the
client, the connection is considered trusted. Figures 8-7 and 8-8 show an example that
was executed on a macOS and may look a little different than one executed on a
Windows or Linux operating system.
3. Now that the connection is trusted, the client can send encrypted packets to the
server. Figure 8-9 shows that after verification, the connection is trusted, and
encrypted traffic can be exchanged with the server.
Figure 8-9 Highlighting the Validity, Trust, and Encryption (Highlighted and Boxed)
4. Because public/private key encryption is one-way encryption, only the server can
decrypt the traffic by using its private key. Figure 8-10 displays further verification
that only the shown server can decrypt the traffic.
In Figure 8-10, it is possible to view the path to the issuing CA. Before a certificate is
trusted, the system (the client OS) must verify that the certificate comes from a trusted
source. This process is called certificate path validation (CPV), which verifies that every certificate in the path from the root CA to the client (or host system) is legitimate and valid. CPV can
also help ensure that the session between two nodes remains trusted. If there is a problem
with one of the certificates in the path, or if the host OS cannot find a certificate, the path
is considered “untrusted.” Typically, the certification path includes a root certificate and one
or more intermediate certificates.
■ Hostname/identity mismatch: URLs specify a web server name. If the name specified
in the URL does not match the name specified in the server’s identity certificate, the
browser displays a security warning. Hence, DNS is critical to support the use of PKI
in web browsing.
■ Validity date range: X.509v3 certificates specify two dates: not before and not after.
If the current date is within those two values, there is no warning. If it is outside
the range, the web browser displays a message. The validity date range specifies the
amount of time that the PKI will provide certificate revocation information for the
certificate.
■ Signature validation error: If the browser cannot validate the signature on the certifi-
cate, there is no assurance that the public key in the certificate is authentic. Signature
validation will fail if the root certificate of the CA hierarchy is not available in the
browser’s certificate store. A common reason may be that the server uses a self-signed
certificate.
■ Cryptographic attacks: There are many types of cryptographic attacks; they may or
may not look any different than other security-type attacks, which include
■ Statistical attacks: These attacks exploit statistical weaknesses in the crypto sys-
tem or algorithm.
■ A01:2021-Broken Access Control: Access control enforces policy such that users
cannot act outside of their intended permissions. Failures typically lead to unauthor-
ized information disclosure, modification, or destruction of all data or performing a
business function outside the user’s limits.
■ A08:2021-Software and Data Integrity Failures: This new category for 2021 focuses
on making assumptions related to software updates, critical data, and CI/CD pipelines
without verifying integrity.
In the next few sections, we provide a practical approach to dealing with a few of the items
on the list.
Injection Attacks
Injection attacks affect a wide range of attack vectors. In the 2017 OWASP Top 10 list,
injection attacks were classified as the top web application security risk. On the 2021 list, it
dropped to third place, possibly because advanced tooling and monitoring make injections easier to detect or because the other top risks have been observed more frequently. Regardless of the ranking, injection remains a serious issue.
The most common injection attacks are related to SQL and common databases; however, the following list of other injection types will likely continue to grow:
■ LDAP injection
■ Code injection
■ CRLF injection
■ XPath injection
SQL injection attacks are the most common type, partly due to the nature of the language and the ability to build dynamic queries. They happen when an application is not able to differentiate trusted input from untrusted input. To defend against SQL injection, consider the following practices:
■ Use prepared statements with variable binding (parameterized queries). Because the SQL code is defined first and every user-supplied value is bound as a parameter, a classic injection such as ' OR 1=1 is treated as literal data rather than as part of the query. These statements are also simple to write and easier to read than dynamic queries; a short sketch follows this list.
■ Allow lists are the next best alternative when you’re unable to use variable binding
because it is not always allowed or legal in SQL. Creating allow lists increases the dif-
ficulty for an attacker to inject queries. Allow lists match every user input to a list of
“allowed” or “acceptable” characters.
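The following minimal sketch shows variable binding in Python, using the standard library's sqlite3 module (the table and column names are illustrative); the same pattern applies to other DB-API drivers such as the MySQLdb example shown earlier in this chapter:

import sqlite3

conn = sqlite3.connect("example.db")
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, username TEXT, email TEXT)")

user_supplied = "alice' OR '1'='1"  # hostile input is treated as data, not as SQL

# Prepared statement with variable binding: the driver binds the value safely
cur.execute("SELECT id, email FROM users WHERE username = ?", (user_supplied,))
print(cur.fetchall())  # returns no rows; the injection attempt fails

# Dynamic query built by concatenation (do NOT do this): the attacker controls the SQL
# cur.execute("SELECT id, email FROM users WHERE username = '" + user_supplied + "'")

conn.close()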
NOTE The OWASP website offers great information for understanding the attacks and
provides you with various ways to protect your code and applications. We found the “SQL
Injection Prevention Cheat Sheet” to be of great value:
https://cheatsheetseries.owasp.org/cheatsheets/SQL_Injection_Prevention_Cheat_Sheet.html
Cross-Site Scripting
Cross-site scripting (XSS) comes in various forms and outcomes and is also considered a
type of an injection attack. With XSS, attackers inject malicious scripts into a web applica-
tion to obtain information about the application or its users. The three main types of
XSS are
■ Stored XSS: In Figure 8-11, the attacker injects a script (a.k.a. a Payload) into a web
server, and the script is triggered every time a user visits the website.
■ Reflected XSS: This attack or injection, detailed in Figure 8-12, is delivered to the
“victim” using a trusted email or URL. The injection can also be reflected off a web-
site like an error message or query result. The victim is then tricked into clicking or
responding, leading to the execution of the code.
■ DOM-based XSS: The attacker injects a script or payload into the Document Object
Model (DOM), and the script is triggered through the console. The steps are clearly
detailed in Figure 8-13. Unlike the stored and reflected attacks, a DOM-based attack is
a client-based attack instead of a server-side attack.
NOTE RFC 6749 is a good read, and we highly recommend that you get familiar with the
main concepts if your work responsibilities include building application security compo-
nents: https://datatracker.ietf.org/doc/html/rfc6749.
■ Resource owner (end user or thing): Normally, the end user but can also be any compute entity.
■ Resource server: The host of the secured accounts. The server responds to the client.
■ Client: The application requesting access to protected resources on behalf of the resource owner and with its authorization.
■ Authorization server: The server issuing access tokens to the client after it verifies identity.
OAuth also defines four grant type flows. A grant is a credential representing the end user’s
authorization used by the client application to obtain an access token. The four grant types
are as follows:
■ Authorization code: The client obtains an authorization code from the authorization server and then exchanges that code for an access token.
■ Implicit flow: This simplified authorization code flow is optimized for client applications implemented in a browser using a scripting language. Instead of issuing the client an authorization code, the client is issued an access token directly.
■ Resource owner password credentials: Simply put, this one is more like a name and
password for the end user.
■ Client credentials: Client application credentials can be used when the client is acting
on its own behalf. This means when the client application is acting as a resource owner
(or end user) requesting access to protected resources with the authorization server.
Figure 8-14 details the exchange among the four defined roles.
Figure 8-14 OAuth 2.0 Protocol Flow
The flow steps illustrated in Figure 8-14 are as follows:
1. The client requests authorization from the resource owner (end user). The authoriza-
tion request can be made directly to the resource owner (as shown), or preferably indi-
rectly via the authorization server as an intermediary.
Figure 8-15 Two-Legged OAuth Flow
Initially, the client application authenticates with the authorization server and requests an
access token. This action is done by making the request with grant_type = client_creden-
tials and providing the client_id and client_secret. The following code example illustrates
the point:
Host: server.devcor.com
Content-Type: application/x-www-form-urlencoded
Accept: application/json
grant_type=client_credentials
client_id=pwmdisdz78fs9dujtn35wps67
client_secret=45kj86985utjfk98u389o658ogwti
The authorization server then authenticates the request and, if valid, issues an access token
including any additional parameters:
HTTP/1.1 200 OK
Content-Type: application/json

{
  "access_token":"2YotnFZFEjr1zCsicMWpAA",
  "token_type":"bearer",
  "expires_in":3600,
  "example_parameter":"example_value"
}
Figure 8-16 Conceptual Diagram of the Three-Legged Authorization
Figure 8-16 provides a general representation of the three-legged authorization. The first step
is to retrieve the authorization code. The second step is to exchange the authorization code
for an access token. The third exchange is to use the token. The flows or exchanges repre-
sented in Figures 8-16 and 8-17 assume that the authorization server uses two authentication
protocols: OAuth and OpenID. The two protocols work well together to ensure a seamless
authorization process.
Figure 8-17 Detailed Workflow of the Three-Legged Authorization
Figure 8-17 illustrates the detailed exchange of the three-legged OAuth 2.0 authorization:
1. The end user (resource owner) sends a request to the OAuth client application.
2. The client application sends the resource owner a “redirect” to the authorization server.
3. The resource owner connects directly with the authorization server and authenticates.
4. The authorization server presents a form to the resource owner to grant access.
5. The resource owner submits the form to allow or to deny access.
6. Based on the response from the resource owner, the following processing occurs:
■ If the resource owner allows access, the authorization server sends the OAuth client
a redirection with the authorization grant code or the access token.
■ If the resource owner denies access, then the request is redirected to the client
without a grant.
7. The OAuth client sends the authorization grant code, client ID, and the certificate to
the authorization server.
8. When the information is verified, the authorization server sends the client an access
token and a refresh token (refresh tokens are optional).
9. The client sends the access token to the resource server to request protected resources.
10. If the access token is valid for the requested protected resources, then the OAuth client
can access the protected resources.
Host: authorization-server.example.com
Content-Type: application/x-www-form-urlencoded
grant_type=client_credentials // - Required
&scope={Scopes} // - Optional
client_id= -kUr9VhvB_tWvv2orfPsKHGz
client_secret= LSidh2foCzJngCJqLSElXIl5TchjvL9_2l7OzbRpEFW6RlNf
Content-Type: application/json;charset=UTF-8
Cache-Control: no-store
Pragma: no-cache
// requested ones.
4. The client application retrieves the access token and requests the resource from the
resource server.
5. The resource server requests information about the access token from the
authorization server, for verification purposes.
6. The authorization server returns the requested information to the resource server.
7. The resource server verifies the access token, and if valid, it returns the resource to the
client application.
1. The client application requests access to a protected resource, and the resource owner
approves.
2. The client application prompts the resource owner to input credentials: ID and
password.
3. The resource owner inputs their ID and password.
4. The client application sends a token request to the authorization server token endpoint,
including the ID and password. The following request is made to the token endpoint:
POST /token HTTP/1.1
Host: authorization-server.example.com
Content-Type: application/x-www-form-urlencoded
grant_type=password
&username={User ID}
&password={Password}
&scope={Scopes}
5. The authorization server sends an access token to the client application. The following
is an example of a response from the token endpoint:
HTTP/1.1 200 OK
Content-Type: application/json;charset=UTF-8
Cache-Control: no-store
Pragma: no-cache{
6. The client application requests the resource from the resource server using the access
token.
7. The resource server requests information about the access token from the
authorization server, for verification purposes.
8. The authorization server returns the requested information to the resource server.
9. The resource server verifies the access token, and if valid, it returns the resource to the
client application.
1. The application requests the user to access protected resources, and the user approves.
2. Build the authorization URL and redirect the user to the authorization server:
https://authorization-server.example.com/authorize?
response_type=token
&client_id=-kUr9VhvB_tWvv2orfPsKHGz
&redirect_uri=https://example-app.com/implicit.html
&scope=photo
&state=RqQ-hXJYv3seO6Mp
access_token=V2tY2H9NSV3nlDIRUSN2CHUsbkH9Hn3Hx1jcvZ6N6v2
OYFWUoKhAqEHUxqc7r76FqVIDQHWE&token_type=Bearer&expires_
in=86400&scope=photos&state=RqQ-hXJYv3seO6Mp
NOTE The implicit flow is being deprecated in the Security BCP because there is no solu-
tion in OAuth for protecting the implicit flow, and it will not stop a malicious actor from
injecting an access token into your client.
access_token: c1CqK7fBYqg7XEno4H82wpw4P_
Pn3p7mUk4No05xOJ81x8kqgcq4B3HghETCBx3PizwZHLPF
token_type: Bearer
expires_in: 86400
scope: photos
1. The application requests the user to access protected resources, and the user approves.
2. Build the authorization URL and redirect the user to the authorization server:
https://authorization-server.example.com/authorize?
response_type=code
&client_id=-kUr9VhvB_tWvv2orfPsKHGz
&redirect_uri=https://www.example-app.com/authorization-code.html
&scope=photo+offline_access
&state=SzbChK9KrCGa8Gnp
3. The authorization server verifies the client_id and redirect_uri.
4. Display scope and prompt the user to log in if required.
5. The user authorizes access.
6. Redirect the user and issue a short-lived authorization code:
■ Verify the state parameter:
?state=SzbChK9KrCGa8Gnp&code=sescvu5DJd568Xso8z2RxgonSk9Ucl0MIe8JD5
sJ8Z_dojc
grant_type=authorization_code
&client_id=-kUr9VhvB_tWvv2orfPsKHGz
&client_secret=LSidh2foCzJngCJqLSElXIl5TchjvL9_2l7OzbRpEFW6RlNf
&redirect_uri=https://www.example-app.com/authorization-code.html
&code=sescvu5DJd56-8Xso8z2RxgonSk9Ucl0MIe8JD5sJ8Z_dojc
"token_type": "Bearer",
"expires_in": 86400,
"access_token": "kzTq84yQvIhjEgNlg7Jb3tOrbuydA0mpO4fSuDA-
xMzzhPOf-a4x9iF8wQF2PzSxCgIBkGVW",
"refresh_token": "a0fbHhsmcrV1VBNeimrJdNk8"
9. Extract the access token and request a resource with the access token.
10. The resource server requests information about the access token.
11. The authorization server returns information about the access token.
12. The resource server verifies the access token and, if valid, returns the requested resource.
1. The application requests the user to access protected resources, and the user approves.
2. Create a code verifier and challenge; then build the authorization URL:
Code verifier: JrlOqZuBNMuA9vpwg49DgwcQaP4tMBKVgZUlOE__kbGAlAm3
https://authorization-server.example.com/authorize?
response_type=code
&client_id=-kUr9VhvB_tWvv2orfPsKHGz
&redirect_uri=https://www.example.com/authorization-code-with-pkce.html
&scope=photo+offline_access
&state=nb3l0xXX4MlHR3k_
&code_challenge=QesOP7mDGTTFDzZgFTZAYMWIXv17SRE9G4i601mwE4M
&code_challenge_method=S256
The client includes the code_challenge parameter in this request, which the authoriza-
tion server stores and compares later during the code exchange step.
3. Verify the client_id and redirect_uri.
4. Display the scope and prompt the user to log in if required.
5. The user logs in and authorizes access.
6. The authorization server stores code_challenge and code_challenge_method and then
redirects the user and issues an authorization code:
■ Verify the state parameter.
The user is redirected back to the client with a few additional query parameters in the
URL:
?state=nb3l0xXX4MlHR3k_&code=70diP-rT46VdBFb2SSf_hrFxczgyO1QbHpQmQQTdqoLPMY7_
The state value isn’t strictly necessary here because the PKCE parameters provide
CSRF protection themselves.
7. The client application exchanges the authorization code for an access token at the token endpoint:
POST https://authorization-server.example.com/token
grant_type=authorization_code
&client_id=-kUr9VhvB_tWvv2orfPsKHGz
&client_secret=LSidh2foCzJngCJqLSElXIl5TchjvL9_2l7OzbRpEFW6RlNf
&redirect_uri=https://www.example.com/authorization-code-with-pkce.html
&code=70diP-rT46VdBFb2SSf_hrFxczgyO1QbHpQmQQTdqoLPMY7_
&code_verifier=JrlOqZuBNMuA9vpwg49DgwcQaP4tMBKVgZUlOE__kbGAlAm3
The code_verifier is sent along with the token request. The authorization server checks
whether the verifier matches the challenge that was used in the authorization request.
This ensures that a malicious party that intercepted the authorization code is not able
to use it.
"token_type": "Bearer",
"expires_in": 86400,
"access_token": "8K63E0hHj_NdQokyP-1_awC8_pSwnJxnPKbmJEw98DHssRg
W7NU1XEAC2M2ZGF1pQJD4Ak2P",
9. Extract the access token and request a resource with the access token.
10. The resource server requests information about the access token.
11. The authorization server returns information about the access token.
12. The resource server verifies the access token and, if valid, returns the requested resource.
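For reference, the code_challenge is derived from the code_verifier by hashing it with SHA-256 and base64url-encoding the digest (the S256 method). A minimal Python sketch of that derivation, using only the standard library, follows:

import base64
import hashlib
import secrets

# Generate a high-entropy, URL-safe code verifier (RFC 7636 allows 43-128 characters)
code_verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b'=').decode('ascii')

# Derive the S256 code challenge: BASE64URL(SHA256(code_verifier)) without padding
digest = hashlib.sha256(code_verifier.encode('ascii')).digest()
code_challenge = base64.urlsafe_b64encode(digest).rstrip(b'=').decode('ascii')

print('code_verifier: ', code_verifier)
print('code_challenge:', code_challenge)

The client sends code_challenge (with code_challenge_method=S256) in the authorization request and presents code_verifier later in the token request.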
Refresh Token Flow
Figure 8-23 illustrates the refresh token flow. Here, assume that the application has a refresh
token that was issued previously with an access token during an authorization request in the
past.
1. The client application sends a request to the authorization server token endpoint to
reissue the access token. The following request is made to the token endpoint:
POST {Token Endpoint} HTTP/1.1
Content-Type: application/x-www-form-urlencoded
grant_type=refresh_token
&refresh_token={Refresh Token}
&scope={Scopes}
2. The authorization server provides the access token to the client application. The fol-
lowing is an example of the response from the authorization server token endpoint:
HTTP/1.1 200 OK
Content-Type: application/json;charset=UTF-8
Cache-Control: no-store
Pragma: no-cache

{
"scope": "{Scopes}"
}
3. The client application requests the resource from the resource server using the access
token.
4. The resource server requests information about the access token from the
authorization server, for verification purposes.
5. The authorization server returns the requested information to the resource server.
6. The resource server verifies the access token, and if valid, it returns the resource to the
client application.
1. The client application requests a device code from the authorization server, identifying itself with its client_id:
client_id=https://www.oauth.com/example/
2. The authorization server returns a response that includes the device code, a code to
display to the user, and the URL the user should visit to enter the code.
{
"device_code": "NGU5OWFiNjQ5YmQwNGY3YTdmZTEyNzQ3YzQ1YSA",
"user_code": "BDWD-HQPK",
"verification_uri": "https://example.com/device",
"interval": 5,
"expires_in": 1800
3. Present the verification_uri and user_code to the user and instruct the user to enter
the code at the URL.
4. Poll the token endpoint.
While you wait for the user to visit the URL, sign in to their account, and approve
the request, you need to poll the token endpoint with the device code until an access
token or error is returned:
POST https://example.com/token
grant_type=urn:ietf:params:oauth:grant-type:device_code
&client_id=https://www.oauth.com/example/
&device_code=NGU5OWFiNjQ5YmQwNGY3YTdmZTEyNzQ3YzQ1YSA
5. Poll: Before the user has finished signing in and approving the request, the authoriza-
tion server returns a status indicating the authorization is still pending.
HTTP/1.1 400 Bad Request
{
"error": "authorization_pending"
}
6. The user authenticates and confirms user_code.
7. The user approves client access.
8. Poll the authorization server periodically until the code has been successfully entered.
When the user approves the request, the token endpoint responds with the access
token:
HTTP/1.1 200 OK
{
"token_type": "Bearer",
"access_token": "RsT5OjbzRn430zqMLgV3Ia",
"expires_in": 3600,
"refresh_token": "b7a3fac6b10e13bb3a276c2aab35e97298a060e0ede5b43ed1f720a8"
}
Now the device can use this access token to make API requests on behalf of the user.
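A minimal Python sketch of this polling loop, assuming the third-party requests library and the example endpoint, client_id, device code, and interval shown above, might look like this:

import time
import requests

TOKEN_URL = 'https://example.com/token'
CLIENT_ID = 'https://www.oauth.com/example/'
DEVICE_CODE = 'NGU5OWFiNjQ5YmQwNGY3YTdmZTEyNzQ3YzQ1YSA'
INTERVAL = 5  # polling interval, in seconds, from the device authorization response

while True:
    response = requests.post(TOKEN_URL, data={
        'grant_type': 'urn:ietf:params:oauth:grant-type:device_code',
        'client_id': CLIENT_ID,
        'device_code': DEVICE_CODE,
    })
    body = response.json()
    if response.status_code == 200:
        print('Access token:', body['access_token'])
        break
    if body.get('error') == 'authorization_pending':
        time.sleep(INTERVAL)   # the user has not approved yet; keep polling
        continue
    raise SystemExit(f"Device flow failed: {body.get('error')}")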
Exam Preparation Tasks
As mentioned in the section “How to Use This Book” in the Introduction, you have a couple
of choices for exam preparation: the exercises here, Chapter 17, “Final Preparation,” and the
exam simulation questions on the companion website.
Define Key Terms
Define the following key terms from this chapter and check your answers in the glossary:
certificate authority (CA), data at rest, data in motion, digital certificate, General Data
Protection Regulation (GDPR), injection attack, Open Authentication (OAuth), Open Web
Application Security Project (OWASP), personally identifiable information (PII), public
key infrastructure (PKI), Transport Layer Security (TLS), cross-site scripting (XSS)
References
RFC 6749: https://datatracker.ietf.org/doc/html/rfc6749
ITU X.509: https://www.itu.int/rec/T-REC-X.509
GDPR: https://gdpr.eu/
Infrastructure
■ Zero-Touch Provisioning (ZTP): This section covers the concept of zero-touch provisioning, or bootstrapping a device's configuration without physical access and manual effort.
Foundation Topics
Network Management
Ahh, network management. The oft-maligned discipline of network IT. If you ever worked
in an environment where you were given marching orders, you might have cringed at the
boss’s directive: “I need you to work on network management.” If you were in an environ-
ment where you were able to pick your own priorities, you probably did not select network
management as a first choice. However, much has changed; there have been great strides and
improvements in network management and operations that can be attributed to advance-
ments with software-defined networks (SDN), DevOps, continuous integration/continuous
deployment (CI/CD), and site reliability engineering (SRE).
You can’t deal with network infrastructure without using proper planning, design, imple-
mentation, operation, and optimization (PDIOO) methodologies. The PDIOO model has
been used for many years to provide structure in the networking discipline. Implementation,
operations, and optimization have been most impacted by network programmability and
automation over the years. The other planning and design components have also seen ben-
efits from frameworks such as the Information Technology Infrastructure Library (ITIL), The
Open Group Architecture Framework (TOGAF), and Control Objectives for Information and
Related Technologies (COBIT). Whichever IT service management model is used, it is impor-
tant that automation be a prime consideration in order to reduce human effort and errors, to
increase speed of service delivery, and to gain operational efficiencies and repeatability of
services against the network infrastructure.
However, you might wonder why network management and operations were given short
shrift. Many times, it was the last consideration on the proposal/budget and the first to get
cut. It’s true that a well-engineered network could operate for a time without network man-
agement focus and still perform well. Over time that well-engineered network would undergo
changes. A few configuration changes here. A decommissioned syslog event server there. An
additional branch network here or 20 everywhere, and before you know it, serious problems
need to be addressed! How many times have you heard about someone walking through the
data center to see a red or amber light on a module, fan, or power supply, and then look at
the logs and see the alert was first registered months ago (something like Figure 9-1)? Surely
that has never happened to you!
Network management concepts provided strong roots for current alignments to automation,
orchestration, software-defined networking (SDN), DevOps, CI/CD, and site reliability
engineering (SRE). SDN is a mature technology by now, whereas DevOps, CI/CD, and SRE
may be new concepts to traditional network engineers. The collaboration among developers
and network operations personnel to achieve highly automated solutions in a quick fashion is
the foundation of DevOps. Continuous integration/continuous deployment (CI/CD)
refers to a combined practice in software development of continuous integration and either
continuous delivery or continuous deployment. Site reliability engineering is a practice
that brings together facets of software engineering and development, applying them to
infrastructure and operations. SRE seeks to create functional, scalable, and highly reliable
software systems. It is good to have foundational networking skills, but it is even better when
you can complement them with programming skills that can make monitoring and provision-
ing more automated and scalable.
As an engineer, like you, I experience a lot of personal satisfaction in creating practical solu-
tions that impact my company and the world.
So, network management is not the career-limiting role it was once considered. Those in that
discipline benefited by augmenting their skills with network programmability because it
felt like a logical extension of the work already being done. And it served as the foundation
for the skills professionals are aspiring to today. Don’t lose heart if you’re just joining us on
your own network programmability journey; you’re not too late! When you break free from
continuous fire-fighting mode and enjoy the benefits of automation and network program-
mability, there is a lot of satisfaction. Ideally, the network management, operations, or SRE
teams are doing their best work when network IT is transparent and accountable to the user
and customer.
Indeed, the automation of network management, including service provisioning, has been
a key differentiator for several large, successful companies. As an example, consider how
Netflix moved from private data centers to the public cloud between 2008 and 2016. The
company focused on automating its processes and tooling to give it a sense of assurance,
even though its service operated over physical infrastructure it did not control. The company
recognized that while it didn’t own the infrastructure, it still owned the customer experience.
Automation reduces effort for you and your customers. A customer-experience-focused
business is a differentiator. Remember Henry Ford’s 1909 comment about his Model T that
you could “have any color so long as it’s black”? That was not a customer-inclusive concept
then and remains so now. Obviously, the company has since transformed its business to
appeal to broader customer preferences.
Many companies differentiate themselves from competitors through automation. When they
automate more services, they save money and meet customer expectations faster.
CLI/Console
The original serial console connection methods morphed into solutions using terminal serv-
ers that could aggregate 8, 16, 32, or more terminal connections into one device that would
fan out to many other devices in the same or nearby racks. Figure 9-4 shows a logical net-
work representation illustrating where a multiplexing terminal server might reside.
Figure 9-4 A Typical Terminal Server Deployment
This out-of-band (OOB) management model separates a dedicated management network
from production traffic. OOB models persist in highly available or remotely accessible envi-
ronments today. In-band management models intermingle the administrative and management
traffic with the regular user and production traffic.
In-band management became de rigueur with insecure Telnet persisting for many years.
Eventually, slowly, Secure Shell (SSH) took over as the preferred mode to connect to a device
for configuration management or other interaction. The initial methods of configuration
management would entail typing entire configuration syntax. Only slightly more advanced
was using terminal cut-and-paste operations, if you had a well-equipped terminal emulator!
However, this method does not scale, nor is it efficient.
Let’s look at a classic “what not to do” user example from many years ago. In this situation, the
network team used a service request management (SRM) system to document, request, review,
and gain approvals for change requests. However, the SRM was not integrated with their change implementation process, which amounted to PuTTY and HyperTerminal. The change one Friday night
was to implement an almost 10,000-line access control list (ACL). The network engineer dutifully
accepted the approved change, took a copy of the 10K-line ACL from the pages of a Microsoft
Word document detailing the change request, and then started PuTTY. The engineer SSH’d into
the device, entered configuration mode, and pasted the ACL. Because a command-line interface
(and paste) is a flow-control-based method, it took more than 10 minutes to get the ACL into
the system. When the engineer saw the device prompt, they copied the running configuration to
the startup configuration and went home at the end of their shift. Four hours later, the engineer
was called back in because network traffic was acting unexpectedly. Some desirable traffic was
being blocked, and other, undesirable traffic was being passed! Eventually, the root cause was
determined. The engineer had a 10K-line access control list, but the terminal emulator was con-
figured for a maximum of 5K lines. They pasted only half the total ACL!
Despite examples like this, the CLI model of managing a device has persisted for many years.
I often refer to it as finger-defined networking (FDN), a snarky counterpoint to the preferred software-defined networking (SDN) model everyone aspires to, as you can see in
Figure 9-5.
The Cisco acquisition of Duo in 2018 provides MFA that enables varied and flexible alterna-
tive factor authentication methods, such as smartphone app, callback, SMS, and hardware
token generator, as shown in Figure 9-6.
Although some 2FA token generators are physical devices, recognize that alternative solu-
tions are necessary to generate tokens for automated and programmatic purposes.
Some environments implement 2FA for human user interaction and have alternative key-
based authentication for automated service accounts. Thankfully, the staunch dependency to
continue FDN and yet increase the security posture influenced the industry to create pro-
grammatic token generators for Python scripts and other automated and unattended uses.
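As an illustration, a script can generate a time-based one-time password programmatically. The following minimal sketch assumes the third-party pyotp library and a hypothetical base32 shared secret provisioned for the automation service account:

import pyotp

# Hypothetical base32 shared secret provisioned for the automation service account
TOTP_SECRET = 'JBSWY3DPEHPK3PXP'

totp = pyotp.TOTP(TOTP_SECRET)
one_time_code = totp.now()   # current time-based one-time password

# The script can now present one_time_code as the second factor when authenticating
print(one_time_code)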
The most desirable methods for interacting with a physical or virtual device, in order of
security posture, are
The use of the insecure Trivial File Transfer Protocol (TFTP) to transfer a configuration to or from a TFTP server is still popular. However, the more security-minded Secure Copy Protocol (SCP) is where most discerning network operators focus.
SNMP
Simple Network Management Protocol (SNMP) continues to exist, seemingly off to the side.
NOTE SNMP is not part of the DEVCOR exam, so the information provided here is for
historical context.
Many network engineers repeat the joke that SNMP is not simple. Can you configure a device with it? In theory, yes, but in practice many more MIB objects are readable than writable. SNMP has been the standard method for collecting performance, configuration,
inventory, and fault information. It did not take off as well for provisioning. Indeed, the
closest that SNMP got to network provisioning for most situations was using an SNMP Set
operation against the CISCO-CONFIG-MAN-MIB to specify a TFTP server and a file to
transfer representing text of the configuration. A second SNMP Set operation triggered the
merge with a device’s running configuration.
Thankfully, the industry pushes toward streaming telemetry, a more efficient and pro-
grammatic method that we cover later. It’s notable that Google talked of its push
away from SNMP at the North American Network Operators’ Group (NANOG)
73 conference in the summer of 2018. Reference the video at https://youtu.be/McNm_WfQTHw and the presentation at https://pc.nanog.org/static/published/meetings/NANOG73/1677/20180625_Shakir_Snmp_Is_Dead_v1.pdf.
There, Google shared its experiences from a three-year project to remove SNMP usage. Because
SNMP is still widely used in many environments and the core collection mechanism of most
performance management tools, we continue our review of it, limited to the more secure
SNMPv3, which you should be using if doing SNMP at all.
SNMPv3 brought new excitement to the network management space in the early 2000s.
SNMPv3 brought authentication and encryption along with other security features like
anti-replay protections. The criticisms about SNMPv1 and v2c being insecure finally could
be addressed. The use of MIB objects did not change, only the transport and packaging
of the SNMP packets. Figure 9-7 alludes to the promise of securing the sensitive manage-
ment traffic.
The Internet Engineering Task Force (IETF) defined SNMPv3 in RFCs 3410 to 3418, while
RFC 3826 brought advanced encryption. SNMPv3 provides secure access to devices
through a combination of authenticating and encrypting packets. The security features pro-
vided in SNMPv3 are
■ Message integrity: Ensuring that a packet has not been tampered with in transit
The initial IETF specification for SNMPv3 called for 56-bit DES encryption. For some
environments, this is enough. However, there was pressure from the financial industry to
embrace more sophisticated encryption models. The IETF complied through RFC 3826 by
adding AES-128; many devices and management tools likewise added support for the addi-
tional specification. Cisco went a step further with even more enhanced encryption models, including 168-bit Triple DES and 128-, 192-, and even 256-bit AES. It is important to map the
device authentication and encryption capabilities with those of the management tools; oth-
erwise, the security will not be as desired.
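To show what this looks like in practice, the following is a minimal sketch of an SNMPv3 authPriv query (SHA authentication with AES-128 privacy), assuming the third-party pysnmp library and placeholder credentials and device address:

from pysnmp.hlapi import (
    SnmpEngine, UsmUserData, UdpTransportTarget, ContextData,
    ObjectType, ObjectIdentity, getCmd,
    usmHMACSHAAuthProtocol, usmAesCfb128Protocol,
)

# SNMPv3 user configured for authPriv: SHA authentication and AES-128 privacy
iterator = getCmd(
    SnmpEngine(),
    UsmUserData('snmpv3user', authKey='auth-password', privKey='priv-password',
                authProtocol=usmHMACSHAAuthProtocol,
                privProtocol=usmAesCfb128Protocol),
    UdpTransportTarget(('10.0.0.1', 161)),
    ContextData(),
    ObjectType(ObjectIdentity('SNMPv2-MIB', 'sysDescr', 0)),
)

error_indication, error_status, error_index, var_binds = next(iterator)
if error_indication:
    print(error_indication)
else:
    for name, value in var_binds:
        print(f'{name} = {value}')

The authentication and privacy protocols selected in the script must match what is configured on the device, as noted above.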
Those who worked in network operations in the early 2000s might remember hearing about the integration adapters between CiscoWorks 2000 and HP OpenView Network Node Manager. Other adapters became available
with popular EMSs like Concord eHealth, Tivoli OMNIbus Netcool, and Infovista. However,
these were still one-to-one pairings, as depicted in Figure 9-10.
Embedded Management
As centralized EMSs increased in prominence, the notion of a device’s “self-monitoring” was
considered. The benefits of distributing the management work or dealing with a network
isolation issue were found through embedded management techniques. Cisco’s Embedded
Event Manager (EEM) has been a popular and mature feature for many years in distributing
management functions to the devices themselves.
Some use cases with EEM have been
■ Sending an email or syslog event message when a CPU or memory threshold is reached
■ Clearing a terminal line that has been running over a defined time length
Often, having the device self-monitor with EEM could be more frequent than a centralized
monitoring solution that may be managing hundreds, thousands, or more devices. This is
especially true if the condition being monitored is not something that needs to be collected
and graphed for long-term reporting. Consider monitoring an interface or module health
state every 30 seconds. This rate may be unachievable with a centralized system that has tens
of thousands of devices to manage. However, a device may be able to use embedded man-
agement functions to self-monitor at this rate with little impact. It could then send notifica-
tions back to a centralized fault management system as necessary.
In IT service management, performance management and fault management often have similar goals. Performance management usually entails periodic polling or the reception of streaming telemetry. The performance information may be below thresholds of concern and thus not especially useful in the moment. However, retaining the periodically gathered data can be useful for trending and long-term analysis.
Conversely, the fault management function may collect data to compare against a threshold
or status and not retain it if the data doesn’t map to the condition. This function becomes
very ad hoc and may not align to standard polling frequencies. Fault management also ben-
efits from asynchronous alerts and notifications, usually through Syslog event messaging or
SNMP traps.
For CiscoLive events, the network operations center (NOC) has used advanced EEM scripts
to monitor CDP neighbor adjacency events; identify the device type; and configure the con-
nected port for the appropriate VLAN, security, QoS, and interface description settings.
Using Embedded Event Manager is beneficial for device-centric management; however, it
should not be relied on solely for health and availability management. If you didn’t get an
email or syslog event message from the device, does that mean all is healthy? Or does it
mean the device went offline and can’t report? It is wise to supplement embedded manage-
ment solutions with a robust availability monitoring process to ensure device reachability.
■ Obtaining and incorporating “Day-0” base config (core services such as NTP, SSH,
SNMP, Syslog; generating/importing certificates, or using Trusted Platform Module or
International Organization for Standardization and the International Electrotechnical
Commission (ISO/IEC) Standard 11889, and so on)
The ZTP process has most of its notable magic in the Obtaining Base Network Connectiv-
ity task. Consider the standard flow of ZTP for NX-OS, IOS-XR, and IOS-XE devices from
Figure 9-11.
[Figure 9-11 shows the AutoInstall decision flow: the device performs a reverse DNS lookup of its IP address and attempts to load a hostname-specific configuration file, or the default network-config file, from the TFTP server; if these steps fail, AutoInstall terminates, and the administrator connects to the router remotely to finish the configuration and save it to the startup-configuration file.]
NOTE Check your platform support with the Cisco Feature Navigator (https://cfnng.cisco.com/).
Of interest to network programmers is the enhanced AutoInstall Support for TCL Script feature. It provides more flexibility in the installation process.
The administrator can program the device with TCL scripts to get information about
what to download and to choose the type of file server and the required file transfer
protocol.
Of special interest to network programmers is the even more advanced option to ZTP using
Python scripts, as found in the IOS-XE 16.6+ platforms. In this case, DHCP option 67 set-
tings are used to identify a Python script to download. The device executes the Python
script locally using Guest Shell on-box functionality, as depicted in Figure 9-12.
[Figure 9-12 depicts script-based ZTP: the unconfigured router or switch boots and requests an IP address via DHCP; the DHCP server returns the address, default gateway, and DNS settings, along with option 67 pointing to a TFTP server (1.2.3.4) and the file bootstrap.py; the device downloads bootstrap.py from the TFTP server and executes it locally.]
With a script-based deployment, you can program and execute a dynamic installation pro-
cess on the device itself instead of relying on a configuration management function driven
by a centralized server.
As an example, consider the Python ZTP script in Example 9-1.
Example 9-1 Python ZTP Script
NOTE If you are using a platform with legacy Python 2.x support, the print statements
need to be modified as follows:
print “Example”
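A minimal sketch of such a bootstrap script, assuming the IOS-XE Guest Shell on-box Python environment and its cli module (the cli and configure helpers), might look like the following:

from cli import cli, configure

# Echo the device version to the console during bootstrap
print(cli('show version'))

# Apply a small baseline configuration: a Loopback 10 interface (placeholder address)
configure(['interface Loopback10',
           'ip address 10.10.10.10 255.255.255.255'])

# Confirm the result
print(cli('show ip interface brief'))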
This simplistic Python script merely echoes the show version command (to the console,
when booting up), configures a Loopback 10 interface, and then echoes the show ip inter-
face brief command. Obviously, in its current form, the script is not portable, but with some
basic Python skills, you can easily enhance it to have conditional logic or even off-box,
alternate server interactions performing basic GET/POST operations to a REST API. This is
when the magic happens: the device could provide a unique identifier, such as serial number
or first interface MAC address, which could be used for more sophisticated and purposeful
device provisioning.
Management Tools
[Figure: traditional management tools treat devices atomically, with each device carrying its own control plane (CP) and data plane (DP).]
If you find yourself running network discovery or entering device management addresses
and credentials with a network management tool, you are probably dealing with a traditional
EMS system. Systems like Prime Infrastructure, Data Center Network Manager (DCNM),
Prime Collaboration, Statseeker, and others are examples of atomic EMSs.
The advent of software-defined networking catalyzed the centralized controller management
model. In most cases, the managed devices seek out the controller that manages it. Mini-
mal, if any, initial configuration is necessary on the device. The controller identification may
occur by expected IP address or hostname, broadcast/multicast address, a DHCP option,
DNS resource record (SRV), or a Layer-2/3 discovery process.
After the device registers with the controller, it shows up in the inventory. Additional autho-
rization may have been performed with the device and/or controller. In the Meraki or Cisco
SD-WAN models, the devices register with a central, cloud-based controller, and administra-
tors must “claim” the device. Additional examples of SDN and controller-like management
systems are DNA Center and the ACI APIC controller.
You can look to the wireless industry for some examples of both atomic EMS and SDN con-
troller models. Initially, wireless access points (WAPs) operated autonomously and required
individual configuration. A centralized management tool, like Cisco Prime Network Control
System (NCS), which eventually converged with Prime Infrastructure, was used to manage
autonomous WAPs. Later, centralized controller-based WAPs, or lightweight WAPs (LWAPs),
were developed that used the wireless LAN controller (WLC) appliance for central manage-
ment. Yet another iteration of manageability occurred with a distributed model using wiring
closet-based controllers in the Catalyst 3850. The multiple options existed to suit customer
deployment preference and need.
In a more SDN literal model, the control plane function is separated from the device and is
provided by a centralized controller. The data plane function still resides on the device, as
seen in Figure 9-14.
[Figure 9-14: a controller acts as the central point of management and hosts the control plane (CP), while the data plane (DP) remains on each device.]
Figure 9-15 Population and Internet Growth (Source: Cisco IBSG 2011, Cisco VNI
results and forecasts 2010, 2015, 2018, 2020)
When you consider the growth of underpinning network devices necessary to accommodate
the mobile user, servers, and virtual machines, you can quickly understand how atomic EMS-
based management is truly untenable. To address the diversity of endpoints and functions,
the networking environment has become exponentially more complex. IBN transforms a
hardware-centric network of manually provisioned devices into a controller-centric network
that uses business intent translated into policies that are automated and applied consistently
across the network. The network continuously monitors and adjusts network configuration
and performance to achieve the specified business outcomes. Operational efficiencies are
realized when IBN models allow for natural language directions to be translated to native
controller-centric directives as conceptualized in Figure 9-16.
IBN augments the network controller model of software-defined networking. The network
controller is the central, authoritative control point for network provisioning, monitor-
ing, and management. Controllers empower network abstraction by treating the network
as an integrated whole. This is different from legacy element management systems where
devices are managed atomically, without functional or relational consideration. Cross-
domain orchestration is enabled when controllers span multiple domains (access, WAN, data
center, compute, cloud, security, and so on) or when they provide greater efficiencies in
programmability.
A closed-loop function of IBN ensures practical feedback of provisioning intent that reflects
actual implementation and operations. The activation function translates the intent into poli-
cies that are provisioned into the network. An assurance function continuously collects
analytics from the devices and services to assess whether proper intent has been applied and
achieved. Advanced solutions apply machine learning to accurately estimate performance or
capacity constraints before they become service impacting.
The DNA Center management solution and CX Cloud are solutions from Cisco that enable
intent-based networking.
Summary
This chapter focuses on capabilities in network management, infrastructure provisioning,
element management systems, zero-touch provisioning (ZTP), atomic or SDN-like/controller-
based networking, and intent-based networking. It depicts the growth and transformation in
the network industry over the past 30 years. As technological problems were identified, solu-
tions were made and transformation occurred. Each step of the way, refined principles have
improved the state of networking. If you find yourself implementing or sustaining the legacy
methods first described, you are highly encouraged to adopt the newer methods defined
later in the chapter.
References
https://youtu.be/McNm_WfQTHw
https://pc.nanog.org/static/published/meetings/NANOG73/1677/20180625_Shakir_Snmp_Is_Dead_v1.pdf
https://cfnng.cisco.com/
https://www.cisco.com/c/en/us/solutions/collateral/executive-perspectives/annual-internet-report/white-paper-c11-741490.html
Automation
■ REST APIs: This section provides insight into the functionality and benefits of REST APIs and how they are used.
This chapter maps to the second part of the Developing Applications Using Cisco
Core Platforms and APIs v1.0 (350-901) Exam Blueprint Section 5.0, “Infrastructure and
Automation.”
As we’ve learned about the infrastructure involved in network IT and see the continued
expansion, we also recognize that static, manual processes can no longer sustain us. When
we were managing dozens or hundreds of devices using manual methods of logging in to ter-
minal servers, through a device’s console interface, or through inband connectivity via SSH,
it may have been sufficient. However, now we are dealing with thousands, tens of thousands,
and in a few projects I’ve been on, hundreds of thousands of devices. It is simply untenable
to continue manual efforts driven by personal interaction. At some point, these valuable
engineering, operations, and management resources must be refocused on more impactful
activities that differentiate the business. So, automation must be embraced. This chapter
covers some key concepts related to automation: what challenges need to be addressed, how
SDN and APIs enable us, and the impact to IT service management and security.
1. When you are considering differences in device types and function, which technology
provides the most efficiencies?
a. Template-driven management
b. Model-driven management
c. Atomic-driven management
d. Distributed EMSs
2. The SRE discipline combines aspects of _______ engineering with _______ and
_______.
a. Hardware, software, firmware
b. Software, infrastructure, operations
c. Network, software, DevOps
d. Traffic, DevOps, SecOps
3. What do the Agile software development practices focus on?
a. Following defined processes of requirements gathering, development, testing, QA,
and release.
b. Giving development teams free rein to engineer without accountability.
c. Pivoting from development sprint to sprint based on testing results.
d. Requirements gathering, adaptive planning, quick delivery, and continuous
improvement.
4. Of the software development methodologies provided, which uses a more visual
approach to the what-when-how of development?
a. Kanban
b. Agile
c. Waterfall
d. Illustrative
Foundation Topics
Challenges Being Addressed
As described in the chapter introduction, automation is a necessity for growing sophis-
ticated IT environments today. Allow me to share a personal example: if you’ve been to a
CiscoLive conference in the US, it is common to deploy a couple thousand wireless access
points in the large conference venues in Las Vegas, San Diego, and Orlando. I’m talking a
million square feet plus event spaces.
Given that the network operations center (NOC) team is allowed onsite only four to five
days before the event starts, that’s not enough time to manually provision everything with
a couple dozen event staff volunteers. The thousands of wireless APs are just one aspect
of the event infrastructure (see Figure 10-1). There are still the 600+ small form-factor
switches that must be spread across the venue to connect breakout rooms, keynote areas,
World of Solutions, testing facilities and labs, the DevNet pavilion, and other spaces (see
Figure 10-2).
Connectivity technology morphed from more local-based technologies like token ring and
FDDI to faster and faster Ethernet-based solutions, hundred megabit and gigabit local inter-
faces, also influencing the speed of WAN technologies to keep up.
Basic switches gave way to more intelligent routing and forwarding switches. IP-based telephony
was developed. Who remembers that Cisco’s original IP telephony solution, Call Manager,
was originally delivered as a compact disc (CD), as much software was?
Storage was originally directly connected but then became networked, usually with different
standards and protocols. The industry then accepted the efficiencies of a common, IP-based
network. The rise of business computing being interconnected started influencing home
networking. Networks became more interconnected and persistent. Dial-up technologies and
ISDN peaked and started a downward trend in light of always-on cable-based technologies
to the home. Different routing protocols needed to be created. Multiple-link aggregation
requirements needed to be standardized to help with resiliency.
Wireless technologies came on the scene. Servers, which had previously been mere end-
points to the network, now became more integrated. IPv6. Mobile technologies. A lot of
hardware innovations but also a lot of protocols and software developments came in parallel.
So why the history lesson? Take these developments as cases in point of why network IT was slow to embrace automation. The field was changing rapidly and growing in functionality. The scope and pace
of change in network IT were unlike those in any other IT disciplines.
Unfortunately, much of the early development relied on consoles and the expectation of a
human administrator always creating the service initially and doing the sustaining changes.
The Information Technology Infrastructure Library (ITIL) and The Open Group Architecture
Framework (TOGAF) service management frameworks helped the industry define structure
and operational rigor. Some of the concepts seen in Table 10-2 reflect a common vocabulary
being established.
The full lifecycle of a network device or service must be considered. All too often the “spin-
up” of a service is the sole focus. Many IT managers have stories about finding orphaned
recurring charges from decommissioned systems. Migrating and decommissioning a service
are just as important as the initial provisioning. We must follow up on reclaiming precious
consumable resources like disk space, IP addresses, and even power.
In the early days of compute virtualization, Cisco had an environment called CITEIS—Cisco
IT Elastic Infrastructure Services, which were referred to as “cities.” CITEIS was built to pro-
mote learning, speed development, and customer demos, and to prove the impact of automa-
tion. A policy was enacted that any engineer could spin up two virtual machines of any kind
as long as they conformed to predefined sizing guidelines. If you needed something differ-
ent, you could get it, but it would be handled on an exception basis. Now imagine the num-
ber of people excited to learn a new technology all piling on the system. VMs were spun up;
CPU, RAM, disk space, and IP addresses consumed; used once or twice, then never accessed
again. A lot of resources were allocated. In the journey of developing the network program-
mability discipline, network engineers also needed to apply operational best practices. New
functions were added to email (and later send chat messages to) the requester to ensure the
resources were still needed. If a response was not received in a timely fashion, the resources
were archived and decommissioned. If no acknowledgment came after many attempts over
a longer period, the archive may be deleted. These kinds of basic functions formed the basis
of standard IT operations to ensure proper use and lifecycle management of consumable
resources.
With so many different opportunities among routing, switching, storage, compute, collabo-
ration, wireless, and such, it’s also understandable that there was an amount of specialization
in these areas. This focused specialization contributed to a lack of convergence because each
technology was growing in its own right; the consolidation of staff and budgets was not
pressuring IT to solve the issue by building collaborative solutions. But that would change.
As addressed later in the topics covering SDN, the industry was primed for transformation.
In today’s world of modern networks, a difference of equipment and functionality is to be
expected. Certainly, there are benefits recognized with standardizing device models to pro-
vide efficiencies in management and device/module sparing strategies. However, as network
functions are separated, as seen later with SDN, or virtualized, as seen with Network Func-
tion Virtualization (NFV), a greater operational complexity is experienced. To that end, the
industry has responded with model-driven concepts, which we cover in Chapter 11, “NET-
CONF and RESTCONF.” The ability to move from device-by-device, atomic management
considerations to more service and function-oriented models that comprehend the relation-
ships and dependencies among many devices is the basis for model-driven management.
With increased automation and orchestration, there is reduced need for on-shift personnel. Consolidation of
support teams is possible. This pivot to a more on-call or exception-based support model is
desired. The implementation of self-healing networks that require fewer and fewer support
personnel is even more desirable. Google’s concept of site reliability engineering (SRE) is an
example of addressing the industry’s shortcomings with infrastructure and operations sup-
port. The SRE discipline combines aspects of software engineering with infrastructure and
operations. SRE aims to enable highly scalable and reliable systems. Another way of thinking
about SRE is what happens when you tell a software engineer to do an operations role.
Whatever model you choose, take time to understand the pros and cons and evaluate against
your organization’s capabilities, culture, motivations, and business drivers. Ultimately, the
right software development methodology for you is the one that is embraced by the most
people in the organization.
What are the options for extracting information such as the number of multicast packets output on an interface?
The use of Python scripts is in vogue, so let’s consider that with Example 10-2, which
requires a minimum of Python 3.6.
Example 10-2 Python Script to Extract Multicast Packets
import paramiko
import time
import getpass
import re

# Prompt for the device, interface, and credentials referenced below
devip = input('Device IP address: ')
devint = input('Interface to check: ')
username = input('Username: ')
userpassword = getpass.getpass('Password: ')

try:
    devconn = paramiko.SSHClient()
    devconn.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    devconn.connect(devip, username=username, password=userpassword, timeout=60)
    chan = devconn.invoke_shell()
    chan.send("terminal length 0\n")
    time.sleep(1)
    chan.send(f'show interface {devint}\n')
    time.sleep(2)
    cmd_output = chan.recv(9999).decode(encoding='utf-8')
    devconn.close()
    result = re.search(r'(\d+) multicast,', cmd_output)
    if result:
        print(f'Multicast packet count on {devip} interface {devint} is {result.group(1)}')
    else:
        print(f'No match found for {devip} interface {devint} - incorrect interface?')
except paramiko.AuthenticationException:
    print("User or password incorrect - try again")
except Exception as e:
    err = str(e)
    print(f'ERROR: {err}')
There's a common theme in methodologies that automate against CLI output: they require some level of string manipulation. Being able to use regular expressions (commonly called regex) through the re module in Python is a good skill to have for CLI and string manipulation operations. While effective, regex can be a difficult skill to master. Let's call it an acquired taste. The optimal approach is to leverage even higher degrees of abstraction through model-driven and structured interfaces, which relieve you of the string manipulation activities. You can find these in solutions like pyATS (https://developer.cisco.com/pyats/) and other Infrastructure-as-Code (IaC) solutions, such as Ansible and Terraform.
Product engineers intend to maintain consistency across releases, but the rapid rate of
change and the intent to bring new innovation to the industry often result in changes to
the command-line interface, either in provisioning syntax and arguments or in command-
line output. These differences often break scripts and applications that depend on CLI; this
affects accuracy in service provisioning. Fortunately, the industry recognizes the inefficien-
cies and results of varying CLI syntax and output. Apart from SNMP, which generally lacked
a strong provisioning capability, one of the first innovations to enable programmatic interac-
tions with network devices was the IETF’s NETCONF (network configuration) protocol.
We cover NETCONF and the follow-on RESTCONF protocol in more detail later in this
book. However, we can briefly describe NETCONF as an XML representation of a device’s
native configuration parameters. It is much more suited to programmatic use. Consider now
a device configuration shown in an XML format with Figure 10-6.
Although the format may be somewhat unfamiliar, you can see patterns and understand
the basic structure. It is the consistent structure that allows NETCONF/RESTCONF and an
XML-formatted configuration to be addressed more programmatically. By referring to tags
or paths through the data, you can cleanly extract the value of a parameter without depend-
ing on the existence (or lack of existence) of text before and after the specific parameter(s)
you need. This capability sets NETCONF/RESTCONF apart from CLI-based methods that
rely on regex or other string-parsing methods.
A more modern skillset would include understanding XML formatting and schemas, along
with XPath queries, which provide data filtering and extraction functions.
Many APIs output their data as XML- or JSON-formatted results. Having skills with XPath
or JSONPath queries complements NETCONF/RESTCONF. Again, we cover these topics
later in Chapter 11.
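As a brief illustration, the following sketch uses Python's standard xml.etree.ElementTree module to pull a single value out of a small XML snippet by its path rather than by regex; the configuration fragment and element names are made up for the example:

import xml.etree.ElementTree as ET

# A made-up, NETCONF-style XML fragment representing part of a device configuration
config_xml = """
<native>
  <interface>
    <GigabitEthernet>
      <name>0/0</name>
      <description>Uplink to core</description>
    </GigabitEthernet>
  </interface>
</native>
"""

root = ET.fromstring(config_xml)

# Address the value by its path through the data rather than by surrounding text
description = root.findtext('./interface/GigabitEthernet/description')
print(description)   # Uplink to core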
Another way the industry has responded to the shifting sands of CLI is through abstracting
the integration with the device with solutions like Puppet, Chef, Ansible, and Terraform.
Scripts and applications can now refer to the abstract intent or API method rather than a
potentially changing command-line argument or syntax. These also are covered later in this
book.
Scale
Another challenge that needs to be addressed with evolving and growing network is scale.
Although early and even some smaller networks today can get by with manual efforts of a
few staff members, as the network increases in size, user count, and criticality, those models
break. Refer back to Figure 9-19 to see the growth of the Internet over the years.
Scalable deployments are definitely constrained when using CLI-based methodologies, espe-
cially when using paste methodologies because of flow control in terminal emulators and
adapters. Slightly more efficiencies are gained when using CLI to initiate a configuration file
transfer and merge process.
Let me share a personal example from a customer engagement. The customer was dealing
with security access list changes that totaled thousands of lines of configuration text and
was frustrated with the time it took to deploy the change. One easy fix was procedural: cre-
ate a new access list and then flip over to it after it was created. The other advice was show-
ing the customer the inefficiency of CLI flow-control based methods. Because the customer
was copying/pasting the access list, they were restricted by the flow control between the
device CLI and the terminal emulator.
Assess your deployment to gauge the next level of automation for scale. Here are some questions to ask yourself:
■ Do I have dependencies among them that require staggering the change for optimal
availability? Any primary or secondary service relationships?
Increasingly, many environments have no maintenance windows; there is no time that they
are not doing mission-critical work. They implement changes during all hours of the day
or night because their network architectures support high degrees of resiliency and avail-
ability. However, even in these environments, it is important to verify that the changes being
deployed do not negatively affect the resiliency.
One more important question left off the preceding list for special mention is “How much
risk am I willing to take?” I remember working with a customer who asked, “How many
devices can we software upgrade over a weekend? What is that maximum number?”
Together, we created a project and arranged the equipment to mimic their environment as
closely as possible—device types, code versions, link speeds, device counts. The lab was
massive—hundreds of racks of equipment with thousands of devices. In the final analysis,
I reported, “You can effectively upgrade your entire network over a weekend.” In this case,
it was 4000 devices, which at the time was a decent-sized network. I followed by saying,
“However, I wouldn’t do it. Based on what I know of your risk tolerance level, I would sug-
gest staging changes. The network you knew Friday afternoon could be very different from
the one Monday morning if you run into an unexpected issue.” We obviously pressed for
extensive change testing, but even with the leading test methodologies of the time, we had
to concede something unexpected could happen. We saved the truly large-scale changes for
those that were routine and low impact. For changes that were somewhat new, such as new
software releases or new features and protocols, we established a phased approach to gain
confidence and limit negative exposure.
As you contemplate scale, if you’re programming your own solutions using Python scripts or
similar, it is worthwhile to understand multithreading and multiprocessing. A few definitions
of concurrency and parallelism also are in order.
An application completing more than one task at the same time is considered concurrent.
Concurrency is working on multiple tasks at the same time but not necessarily simultane-
ously. Consider a situation with four tasks executing concurrently (see Figure 10-7). If you
had a virtual machine or physical system with a one-core CPU, it would decide the switching
involved to run the tasks. Task 1 might go first, then task 3, then some of task 2, then all of
task 4, and then a return to complete task 2. Tasks can start, execute their work, and com-
plete in overlapping time periods. The process is effectively to start, complete some (or all)
of the work, and then return to incomplete work where necessary—all the while maintain-
ing state and awareness of completion status. One thing to observe is that concurrency works best with tasks that have no dependency among them. In the world of IT, an overall workflow to enable a new web server, where each step depends on the previous one, may not be efficient for concurrency. Independent tasks, such as the following software-update activities, are a better fit:
1. Configure Router-A to download new software update (wait for it to process, flag it to
return to later, move on to next router), then . . .
2. Configure Router-B to download new software update (wait for it to process, flag it to
return to later, move on to next router), then . . .
3. Configure Router-C to download new software update (wait for it to process, flag it to
return to later, move on to next router), then . . .
4. Check Router-A status—still going—move on to next router.
5. Configure Router-D to download new software update (wait for it to process, flag it to
return to later, move on to next router).
6. Check Router-B status—complete—remove flag to check status; move to next router.
7. Configure Router-E to download new software update (wait for it to process, flag it to
return to later, move on to next router).
8. Check Router-A status—complete—remove flag to check status; move to next router.
9. Check Router-C status—complete—remove flag to check status; move to next router.
10. Check Router-D status—complete—remove flag to check status; move to next router.
11. Check Router-E status—complete—remove flag to check status; move to next router.
1. Core-1: Configure Router-A to download new software update (wait for it to process, flag
it to return to later, move on to next router), while at the same time on another CPU . . .
2. Core-2: Configure Router-B to download new software update (wait for it to process,
flag it to return to later, move on to next router).
3. Core-1: Configure Router-C to download new software update (wait for it to process,
flag it to return to later, move on to next router).
4. Core-1: Check Router-A status—still going—move on to next router.
5. Core-2: Configure Router-D to download new software update (wait for it to process,
flag it to return to later, move on to next router).
6. Core-2: Check Router-B status—complete—remove flag to check status; move to next
router.
7. Core-2: Configure Router-E to download new software update (wait for it to process,
flag it to return to later, move on to next router).
8. Core-1: Check Router-A status—complete—remove flag to check status; move to next
router.
9. Core-1: Check Router-C status—complete—remove flag to check status; move to next
router.
10. Core-1: Check Router-D status—complete—remove flag to check status; move to next
router.
11. Core-2: Check Router-E status—complete—remove flag to check status; move to next
router.
Figure 10-9 Parallelism Example
Because two tasks are executed simultaneously, this scenario is identified as parallelism. Par-
allelism requires hardware with multiple processing units, cores, or threads.
To recap, a system is concurrent if it can support two or more tasks in progress at the same
time. A system is parallel if it can support two or more tasks executing simultaneously. Con-
currency focuses on working with lots of tasks at once. Parallelism focuses on doing lots of
tasks at once.
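To make the distinction concrete, the following sketch uses Python's standard concurrent.futures module; swapping ThreadPoolExecutor for ProcessPoolExecutor moves the same workload from concurrent threads in one interpreter to parallel worker processes on multiple cores. The upgrade_router function and device names are placeholders:

import time
from concurrent.futures import ThreadPoolExecutor  # or ProcessPoolExecutor for parallelism

def upgrade_router(name):
    # Placeholder for the "configure download, wait, verify" work on one device
    time.sleep(2)
    return f"{name}: upgrade complete"

routers = ["Router-A", "Router-B", "Router-C", "Router-D", "Router-E"]

# Four workers handle five routers; tasks start, wait, and complete in overlapping time
with ThreadPoolExecutor(max_workers=4) as pool:
    for result in pool.map(upgrade_router, routers):
        print(result)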
So, what is the practical application of these concepts? In this case, I was dealing with the
Meraki Dashboard API; it allows for up to five API calls per second. Some API resources
like Get Organization (GET /organizations/{organizationId}) have few key-values to return,
so they are very fast. Other API resources like Get Device Clients (GET /devices/{serial}/cli-
ents) potentially return many results, so they may take more time. Using a model of parallel-
ism to send multiple requests across multiple cores—allowing for some short-running tasks
to return more quickly than others and allocating other work—provides a quicker experience
over doing the entire process sequentially.
To achieve this outcome, I worked with the Python asyncio library and the semaphores feature
to allocate work. I understood each activity of work had no relationship or dependency on
the running of other activities; no information sharing was needed, and no interference across
threads was in scope, also known as thread safe. The notion of tokens to perform work was
easy to comprehend. The volume of work was created with a loop building a list of tasks; then
the script would allocate as many tokens as were available in the semaphore bucket. When the
script first kicked off, it had immediate access to do parallel processing of the four tokens I
had allocated. As short-running tasks completed, tokens were returned to the bucket and made
available for the next task. Some tasks ran longer than others, and that was fine because the
overall model was not blocking other tasks from running as tokens became available.
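A stripped-down sketch of that token/semaphore pattern, using Python's asyncio with the third-party aiohttp library against a placeholder API endpoint, key, and device serial numbers, might look like this:

import asyncio
import aiohttp

API_BASE = 'https://api.example.com'      # placeholder API base URL
HEADERS = {'X-API-Key': 'example-key'}    # placeholder credentials

async def fetch(session, semaphore, path):
    # Acquire a "token" before calling the API; release it when the call returns
    async with semaphore:
        async with session.get(f'{API_BASE}{path}', headers=HEADERS) as resp:
            return await resp.json()

async def main():
    semaphore = asyncio.Semaphore(4)      # four units of work in flight at a time
    paths = [f'/devices/{serial}/clients' for serial in ('SN1', 'SN2', 'SN3', 'SN4', 'SN5')]
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, semaphore, path) for path in paths]
        results = await asyncio.gather(*tasks)
    for path, result in zip(paths, results):
        print(path, '->', len(result), 'clients')

asyncio.run(main())

Short-running calls return their tokens to the semaphore quickly, allowing longer-running calls to proceed without blocking the rest of the work.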
■ Impacting the networking industry, challenging the way we think about engineering,
implementing, and managing networks
■ Providing new methods to interact with equipment and services via controllers and
APIs
Approach
So, why the asterisk next to an approach to network transformation? Well, it wasn’t the first
attempt at network transformation. If we consider separation of the control plane and data
plane, we can look no further than earlier technologies, such as SS7, ATM LANE, the wire-
less LAN controller, and GMPLS. If we were considering network overlays/underlays and
encapsulation, the earlier examples were MPLS, VPLS, VPN, GRE Tunnels, and LISP. Finally,
if our consideration was management and programmatic interfaces, we had SNMP, NET-
CONF and EEM. Nonetheless, SDN was a transformative pursuit.
Nontraditional Entities
What about those nontraditional entities influencing the network? As new programmatic
interfaces were purposely engineered into the devices and controllers, a new wave of net-
work programmers joined the environment. Although traditional network engineers skilled
up to learn programming (and that may be you, reading this book!), some programmers who
had little prior networking experience decided to try their hand at programming a network.
Or the programmers decided it was in their best interests to configure an underpinning net-
work for their application themselves, rather than parsing the work out to a network provi-
sioning team.
Regardless of the source of interaction with the network, it is imperative that the new inter-
faces, telemetry, and instrumentation be secured with the same, if not more, scrutiny as the
legacy functions. The security policies can serve to protect the network from unintentional
harm by people who don’t have deep experience with the technology and from the inten-
tional harm of bad actors.
Industry Impact
Network industry operations and engineering were greatly influenced by control plane and data plane separation and the development of centralized controllers. Network management teams no longer had to treat each network asset as an atomic unit; they could manage a network en masse through the controller. One touchpoint
for provisioning and monitoring of all these devices! The ACI APIC controller is acknowl-
edged as one of the first examples of an SDN controller, as seen in Figure 10-11. It was able
to automatically detect, register, and configure Cisco Nexus 9000 series switches in a data
center fabric.
Figure 10-11 Cisco ACI Architecture with APIC Controllers
New Methods
With respect to new methods, protocols, and interfaces to managed assets, APIs became
more prolific with the SDN approach. Early supporting devices extended a style of REST-
like interface and then more fully adopted the model. First NETCONF and then RESTCONF
became the desired norm. Centralized controllers, like the wireless LAN controller, ACI's APIC controller, Meraki, and others, proved the operational efficiency of aggregating the monitoring and provisioning of fabrics of devices. This model has prompted the question "What else can we centralize?"
Normalization
SDN’s impact on network normalization is reflected in the increasingly standardized inter-
faces. While SNMP had some utility, SDN provided a fresh opportunity to build and use
newer management technologies that had security at their core, not just a “bolt-on” consid-
eration. Although the first API experiences felt a bit like the Wild Wild West, the Swagger project started to define a common interface description language for REST APIs. Swagger has since morphed into the OpenAPI Initiative, and the OpenAPI Specification greatly simplifies API development and documentation tasks.
Enabling Operations
Network operations, service provisioning, and management were influenced by SDN through the new interfaces, their standardization, and programmatic fundamentals. Instead of relying on manual CLI methods, operators began to rely on a growing knowledge base of REST API methods and sample scripts to build their operational awareness and their ability to respond to and influence network functions.
Besides the REST API, other influences include gRPC Network Management Interface
(gNMI), OpenConfig, NETCONF, RESTCONF, YANG, time-series databases, AMQP pub-
sub architectures, and many others.
adhered to well-defined routing protocol specifications, SDN was to help separate the cur-
rent norms from new, experimental concepts.
The massively scalable data center community appreciated SDN for the ability to separate
the control plane from the data plane and use APIs to provide deep insight into network traf-
fic. Cloud providers drew upon SDN for automated provisioning and programmable network
overlays. Service providers aligned to policy-based control and analytics to optimize and
monetize service delivery. Enterprise networks latched onto SDN’s capability to virtualize
workloads, provide network segmentation, and orchestrate security profiles.
Nearly all segments realized the benefits of automation and programmability with SDN. Several protocols and solutions contributed to the rise of SDN; see Table 10-4 for examples.
■ DNAC software controller API call to Cisco Support API endpoint for opening cases:
software to software
■ Cisco Intersight with UCS IMC for device registration, monitoring, and provisioning
from the cloud: software to hardware
To use an API, you must know the way to make requests, how to authenticate to the service,
how to handle the return results (data encoding), and other conventions it may use, such as
cookies. Public APIs often involve denial-of-service protections beyond authentication, such
as rate limiting the number of requests per time period, the number of requests from an IP
endpoint, and pagination or volume of data returned.
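As a hedged illustration of honoring such protections, the sketch below retries a GET when the server answers with HTTP 429 (Too Many Requests) and waits for the standard Retry-After interval; the endpoint URL and bearer token are placeholders.

import time
import requests

def get_with_rate_limit(url: str, headers: dict, max_retries: int = 5) -> requests.Response:
    """GET a resource, backing off when the API signals rate limiting."""
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers, timeout=30)
        if response.status_code != 429:
            return response
        # The server suggests how long to wait; fall back to a modest default.
        wait = int(response.headers.get("Retry-After", "1"))
        time.sleep(wait)
    return response

# Example usage against a hypothetical endpoint:
# resp = get_with_rate_limit("https://api.example.com/v1/devices",
#                            headers={"Authorization": "Bearer <token>"})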
For the purposes of network IT, we mostly focus on web APIs, as we discuss in the next
section on REST APIs, but other common APIs you may experience are the Java Database
Connectivity (JDBC) and Microsoft Open Database Connectivity (ODBC) APIs. JDBC and
ODBC permit connections to different types of databases, such as Oracle, MySQL, and
Microsoft SQL Server, with standard interfaces that ease application development.
The Simple Object Access Protocol (SOAP) is also a well-known design model for web ser-
vices. It uses XML and schemas with a strongly typed messaging framework. A web service
definition (WSDL) defines the interaction between a service provider and the consumer. In
Cisco Unified Communications, the Administrative XML Web Service (AXL) is a SOAP-
based interface enabling insertion, retrieval, updates, and removal of data from the Unified
Communication configuration database.
Because a SOAP message has the XML element of an “envelope” and further contains
a “body,” many people draw the parallel of SOAP being like a postal envelope, with the
necessary container and the message within, to REST being like a postcard that has none of
the “wrapper” and still contains information. Figure 10-12 illustrates this architecture.
REST APIs
RESTful APIs (or representational state transfer APIs) use Web/HTTP services for read and
modification functions. This stateless protocol has several predefined operations, as seen
in Table 10-5. Because it’s a stateless protocol, the server does not maintain the state of a
request. The client’s request must contain all information needed to complete a request, such
as session state.
API Methods
Another important aspect of a RESTful API is the API method's idempotency, or capability to produce the same result regardless of how many times it is invoked. The same request repeated against an idempotent endpoint should return an identical result whether it is executed twice or hundreds of times.
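The distinction can be shown without a network at all. In this hedged sketch, a dictionary stands in for a server-side resource store: repeating the PUT leaves the store unchanged, while repeating the POST creates a duplicate.

# A dictionary standing in for a server-side resource store.
store = {}

def put_interface(name: str, description: str) -> None:
    # Idempotent: repeating the call yields the same stored state.
    store[name] = {"description": description}

def post_interface(description: str) -> str:
    # Not idempotent: each call creates a new resource with a new identifier.
    new_id = f"if-{len(store) + 1}"
    store[new_id] = {"description": description}
    return new_id

put_interface("GigabitEthernet2", "uplink")
put_interface("GigabitEthernet2", "uplink")   # store is unchanged after the repeat
post_interface("uplink")
post_interface("uplink")                      # a second, duplicate resource appears
print(store)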
API Authentication
Authentication to a RESTful API can take any number of forms: basic authentication, API
key, bearer token, OAuth, or digest authentication, to name a few. Basic authentication is
common, where the username is concatenated with a colon and the user’s password. The
combined string is then Base64-encoded. You can easily generate the authentication string
on macOS or other Linux derivatives using the built-in openssl utility. Windows platforms
can achieve the same result by installing OpenSSL or obtaining a Base64-encoding utility.
Example 10-3 shows an example of generating a basic authentication string with openssl on
macOS.
Example 10-3 Generating Base64 Basic Authentication String on MacOS
This method is not considered secure due to the encoding; at a minimum, the connection
should be TLS-enabled so that the weak security model is at least wrapped in a layer of
transport encryption.
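The openssl output from Example 10-3 is not reproduced here, but the same Base64 string can be produced with Python's standard base64 module; the username and password are placeholders.

import base64

# Equivalent of: echo -n 'username:password' | openssl base64
credentials = "username:password"            # placeholder values
token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
print(token)                                  # dXNlcm5hbWU6cGFzc3dvcmQ=
# The value is then sent as:  Authorization: Basic <token>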
Either an API key or a bearer token is preferable from a security perspective. These models
require you to generate a one-time key, usually from an administrative portal or user profile
page. For example, you can enable the Meraki Dashboard API by first enabling the API for
your organization: Organization > Settings > Dashboard API access. Then the associated
Dashboard Administrator user can access the My Profile page to generate an API key. The
key can also be revoked and a new key generated at any time, if needed.
API Pagination
API pagination serves as a method to protect API servers from overload due to large data
retrieval requests. An API may limit return results, commonly rows or records, to a specific
count. For example, the DNA Center REST API v2.1.2.x limits device results to 500 records
at a time. To poll inventory beyond that, you would use pagination:
GET /dna/intent/api/v1/network-device/{index}/{count}
For example, if you had 1433 devices in inventory, you would use these successive polls:
GET /dna/intent/api/v1/network-device/1/500
GET /dna/intent/api/v1/network-device/501/500
GET /dna/intent/api/v1/network-device/1001/433
Other APIs may provide different cues that pagination is in effect. The API return results
may include the following parameters:
Records: 2034
First: 0
Last: 999
Next: 1000
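Returning to the DNA Center index/count style shown earlier, a paginated poll is easy to drive from Python. This is a hedged sketch: the controller address, token header, and the assumption that results arrive under a "response" key are illustrative, not an exact client.

import requests

BASE = "https://dnac.example.com"              # placeholder controller address
HEADERS = {"X-Auth-Token": "<token>"}          # placeholder auth token
PAGE_SIZE = 500

def get_all_devices():
    devices = []
    index = 1
    while True:
        url = f"{BASE}/dna/intent/api/v1/network-device/{index}/{PAGE_SIZE}"
        page = requests.get(url, headers=HEADERS, timeout=30).json().get("response", [])
        devices.extend(page)
        if len(page) < PAGE_SIZE:              # a short page means we reached the end
            break
        index += PAGE_SIZE
    return devices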
XML
The Extensible Markup Language (XML) is a markup language and data encoding model
that has similarities to HTML. It is used to describe and share information in a programmatic but still human-readable way.
XML documents have structure and can represent records and lists. Many people look at
XML as information and data wrapped in tags. See Example 10-4 for context.
Example 10-4 XML Document
<Document>
<Nodes>
<Node>
<Name>Router-A</Name>
<Location>San Jose, CA</Location>
<Interfaces>
<Interface>
<Name>GigabitEthernet0/0/0</Name>
<IPv4Address>10.1.2.3</IPv4Address>
<IPv4NetMask>255.255.255.0</IPv4NetMask>
<Description>Uplink to Switch-BB</Description>
</Interface>
<Interface>
<Name>GigabitEthernet0/0/1</Name>
<IPv4Address>10.2.2.1</IPv4Address>
<IPv4NetMask>255.255.255.128</IPv4NetMask>
<Description />
</Interface>
</Interfaces>
</Node>
</Nodes>
</Document>
In this example, the structure of this XML document represents a router record. <Docu-
ment>, <Nodes>, <Node>, <Name>, and <Location> are some of the tags created by the
document author. They also define the structure. Router-A, San Jose, CA, and GigabitEther-
net0/0/0 are values associated with the tags. Generally, when an XML document or schema
is written, the XML tags should provide context for the value(s) supplied. The values associ-
ated with the tags are plaintext and do not convey data type. As a plaintext document, XML lends itself well to data exchange and, where needed, compression.
XML has a history associated with document publishing. Its functional similarity with
HTML provides value: XML defines and stores data, focusing on content; HTML defines
format, focusing on how the content looks. The Extensible Stylesheet Language (XSL) pro-
vides a data transformation function, XSL Transformations (XSLT), for converting XML
documents from one format into another, such as XML into HTML. When you consider
that many APIs output results in XML, using XSLTs to convert that output into HTML is an
enabling feature. This is the basis for simple “API to Dashboard” automation.
Referring to Example 10-4, you can see that XML documents contain starting tags, such as
<Node>, and ending (or closing) tags, such as </Node>. There is also the convention of an
empty element tag; note the <Description /> example. All elements must have an end tag or
be described with the empty element tag for well-formed XML documents. Tags are case
sensitive, and the start and end tags must match case. If you’re a document author, you are
able to use any naming style you wish: lowercase, uppercase, underscore, Pascal case, Camel
case, and so on. It is suggested that you do not use dashes (-) or periods (.) in tags to prevent
misinterpretation by some processors.
All elements must be balanced in nesting, but the spacing is not prescribed. A convention of three spaces aids the reader. No spacing at all is acceptable in a highly compressed document, but the elements must still be nested within start and end tags.
XML can have attributes, similar to HTML. In the HTML example <img src="devnet_
logo.png" alt="DevNet Logo" />, you can recognize attributes of src and alt with values of
"devnet_logo.png" and "DevNet Logo". Similarly, in XML, data can have attributes—for
example, <interface type="GigabitEthernet">0/0/0</interface>.
Attribute values, such as “GigabitEthernet”, must be surrounded by double quotes. Values
between tags, such as 0/0/0, do not require quotes.
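Because tags and attributes map naturally onto a tree, Python's standard xml.etree.ElementTree module can walk documents like Example 10-4. The sketch below is illustrative; it embeds a trimmed copy of that document with the type attribute from the earlier snippet added.

import xml.etree.ElementTree as ET

xml_doc = """
<Document>
  <Nodes>
    <Node>
      <Name>Router-A</Name>
      <Interfaces>
        <Interface type="GigabitEthernet">
          <Name>GigabitEthernet0/0/0</Name>
          <IPv4Address>10.1.2.3</IPv4Address>
        </Interface>
      </Interfaces>
    </Node>
  </Nodes>
</Document>
"""

root = ET.fromstring(xml_doc)
for interface in root.iter("Interface"):
    # Element text holds the value between tags; .get() reads an attribute.
    print(interface.findtext("Name"), interface.findtext("IPv4Address"),
          interface.get("type"))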
XML documents usually start with an optional XML declaration, or prolog, that describes the version and encoding being used, such as:
<?xml version="1.0" encoding="UTF-8"?>
XML documents can be viewed in browsers, typically through an Open File function. The
browser may render the XML with easy-to-understand hierarchy and expand or collapse
functions using + and - or ^ and > gadgets. See Figure 10-13 for another example of ACI
XML data, rendered in a browser.
JSON
JavaScript Object Notation (JSON) is a newer data encoding model than XML and is
growing in popularity and use with its more compact notation, ease of understanding, and
closer integration with Python programming. It is lightweight, self-describing, and program-
ming language independent. If your development includes JavaScript, then JSON is an easy
choice for data encoding with its natural alignment to JavaScript syntax.
The JSON syntax represents data as name-value pairs. Data items are separated by commas. Records or objects are defined by curly braces { }. Arrays and lists are contained within
square brackets [ ].
The name of a name-value pair should be surrounded by double quotes. The value should
have double quotes if representing a string. It should not have quotes if representing a
numeric, Boolean (true/false), or null value. See Example 10-5 for a sample JSON record.
{
"Document": {
"Nodes": {
"Node": {
"Name": "Router-A",
"Location": "San Jose, CA",
"InterfaceCount": 2,
"Interfaces": {
"Interface": [
{
"Name": "GigabitEthernet0/0/0",
"IPv4Address": "10.1.2.3",
"IPv4NetMask": "255.255.255.0",
"Description": "Uplink to Switch-BB",
"isConnected": true
},
{
"Name": "GigabitEthernet0/0/1",
"IPv4Address": "10.2.2.1",
"IPv4NetMask": "255.255.255.128",
"Description": null,
"isConnected": false
}
]
}
}
}
}
}
Using this example, you can compare the structure with the previous XML representation.
There is a list of interfaces; each interface is a record or object.
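Loading the JSON record above (Example 10-5) in Python shows how directly JSON maps onto dictionaries and lists. This is a minimal sketch that embeds a trimmed copy of the record.

import json

json_doc = """
{
  "Document": {
    "Nodes": {
      "Node": {
        "Name": "Router-A",
        "Interfaces": {
          "Interface": [
            {"Name": "GigabitEthernet0/0/0", "isConnected": true},
            {"Name": "GigabitEthernet0/0/1", "isConnected": false}
          ]
        }
      }
    }
  }
}
"""

data = json.loads(json_doc)
node = data["Document"]["Nodes"]["Node"]
for interface in node["Interfaces"]["Interface"]:
    # JSON booleans arrive as native Python True/False, not strings.
    print(interface["Name"], "connected" if interface["isConnected"] else "down")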
With APIs, the system may not give you a choice of data formatting; either XML or JSON
may be the default. However, content negotiation is supported by many APIs. If the server
drives the output representation, a Content-Type header shows “application/xml” or “appli-
cation/json” as the response body payload type.
If the requesting client can request what’s desired, then an Accept header specifies similar
values for preference. With some APIs, appending .xml or .json to the request URI returns
the data with the preferred format.
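From the client side, content negotiation usually amounts to setting the Accept header and then checking what the server actually returned. The following hedged sketch uses a placeholder endpoint.

import requests

url = "https://api.example.com/v1/devices"   # placeholder endpoint
response = requests.get(url,
                        headers={"Accept": "application/json"},
                        timeout=30)

# The server reports the representation it chose in Content-Type.
if response.headers.get("Content-Type", "").startswith("application/json"):
    data = response.json()
else:
    data = response.text                      # fall back to raw XML or other text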
10-14. This example represents, essentially, a mashup of key health metrics from several tools:
Prime Infrastructure, DNA Center, vCenter, Prime Network Registrar, Hyperflex, and so on.
■ Financial: Pull in ATM (cash, not legacy networking!), vault/deposit box status.
■ Retail: Forklifts, credit card and point-of-sale terminals.
If you add “business care-abouts” to the network IT perspectives, does that allow you to see
contribution and impact of the supporting infrastructure to the broader company? Sure, it
does!
were configured; logging and accounting were enabled; possibly two-factor or multifactor
authentication was provisioned. In any case, security was given a strong consideration.
So now that network devices, management applications, and controllers have program-
matic interfaces to extract and change functions of networks, are you continuing the same
scrutiny? Are you the main source of API integration, or were other people with strong
programming experience brought in to beef up automation? Do they have strong network-
ing experience in concert with their programming skills? Are they keeping in touch with you
about changes? Oh no! Did that network segment go down?
Okay, enough of the histrionic “what if” scenario. You just need to make sure the same rigor
applied to traditional network engineering and operations is also being applied to newer,
SDN, and programmatic environments.
What are the leading practices related to programmable networks? First, consider your risk.
What devices and services are managed through controllers? They should be secured first
because they have the broadest scope of impact with multiple devices in a fabric. Enable all
the security features the controller provides with the least amount of privileges necessary to
the fewest number of individuals (and other automated systems). If the controller has limited
security options, consider front-ending it with access lists or firewall services to limit access
and content. Remember to implement logging and accounting; then review it periodically.
The next order of business should be high-priority equipment where the loss of availability
has direct service, revenue, or brand recognition impact. It’s the same activity: tighten up
access controls to the newer programmatic interfaces and telemetry.
Finally, go after the regular and low-priority equipment to shore up their direct device man-
agement interfaces in a similar fashion.
References
URL QR Code
https://developer.cisco.com/pyats/
This chapter maps to the Developing Applications Using Cisco Core Platforms and APIs
v1.0 (350-901) Exam Blueprint Section 5.2, “Utilize RESTCONF to configure a network
device including interfaces, static routes, and VLANs (IOS XE only).”
NETCONF and RESTCONF are relatively new but mature management protocols that are
well suited for network programmability. This chapter covers the basic theory behind NET-
CONF and RESTCONF. It shows examples of how to use them so that you can apply them
in your Cisco network environment.
Foundation Topics
Catalyst for NETCONF
As the number of network equipment manufacturers increased, so did the variety of con-
figuration syntaxes and parameters to provision and monitor those systems. Sometimes, as
it was for Cisco, network equipment suppliers had multiple product lines that used different
software—essentially network operating systems. These different operating systems often
meant different command syntax and provisioning arguments. Even considering a funda-
mental feature like Border Gateway Protocol (BGP) meant different “golden configs” or tem-
plates across operating systems for Cisco IOS, Cisco IOS XE, Cisco NX-OS, Cisco IOS-XR,
Juniper Junos OS, HPE Comware, Arista EOS, and so on. This difference in configuring the
same function across different device vendors, models, and operating systems led to many
network engineers maintaining multiple configuration standards. Having multiple configura-
tion standards meant a higher probability of variance, especially when periodic review and
synchronization were not part of a standard process. The industry, most notably the service
provider segment, sought a way to have “one configuration to rule them all.” We’re sure this
would make Frodo the network engineer happy!
In May 2003, the Internet Engineering Task Force (IETF) formed the NETCONF (network
configuration) working group. Their work was first published as RFC 4741 in December
2006. Cisco provided initial support all the way back in IOS 12.4(9)T and presented the
NETCONF topic at the CiscoLive 2009 event in San Diego. Several updates and enhance-
ments have been developed over the years; the foundational protocol was updated as RFC
6241 in June 2011.
Figure 11-1 shows the layered model for NETCONF, which includes a bottom-up transport
layer, messaging container, operational directives, and content payload.
(Figure 11-1, not reproduced here, shows four layers: Content, carrying configuration and notification data; Operations, such as <get> and <edit-config>; Messages, such as <rpc>, <rpc-reply>, and <notification>; and Secure Transport over SSH, TLS, BEEP/TLS, SOAP/HTTP/TLS, and so on.)
Content
Originally, the content of NETCONF operations was founded on the Extensible Markup
Language (XML). XML provided structure and syntax for encoding data in a format that is
both human-readable and machine-readable. This provision served the needs of both tradi-
tional network engineers who wanted more control over the provisioning and monitoring of
their networks and devices and those who wanted more programmatic options. The NET-
CONF protocol has also been enhanced to support JavaScript Object Notation (JSON)
(note RFC 7951), but it remains common to see XML in implementations.
Later in this chapter, you learn how the NETMOD (network modeling) working group aimed
to define a modeling language that was also easy to use for humans. It would define seman-
tics of operational data, configuration data, notifications, and operations in a standard called
YANG.
XML has its origins in Standard Generalized Markup Language (SGML) and has similarities
to Hypertext Markup Language (HTML). Markup is a type of artificial language used to
annotate a document’s content to give instructions regarding structure or how it is to be dis-
played (rendered). XML is used to describe data, whereas HTML is used to display or render
the data.
XML provides many benefits: it enables data or text from one original input source to be used in many ways. This capability promotes data exchange and enables cross-platform data sharing. XML provides a software- and hardware-independent markup language for structuring and transmitting data. The early days of NETCONF drove the market to lean on it for configuration management and provisioning, verification, fault, and operational state monitoring. More recently, JSON is being used more prominently for data encoding in the network IT discipline.
Operations
To effectively manage device configurations, different tasks and operation types have to be
accommodated. The standard editing and deleting of a configuration are nearly universal
across vendors and products. The support of more sophisticated methods, such as locking a
configuration, varies in device implementation.
The base NETCONF protocol defines several operation types, as seen in Table 11-2.
Messages
The NETCONF messages layer is a simple, transport-independent framing mechanism for
encoding directives. Remote-procedure calls (RPCs) are declared with <rpc> messages. The
RPC results are handled with <rpc-reply> messages, and asynchronous notifications or alerts
are defined by <notification> message types.
Every NETCONF message is a well-formed XML document. An original <rpc> message
provides an association to <rpc-reply> with a message-id attribute. Multiple NETCONF
messages can be sent without waiting for sequential RPC result messages. Figure 11-2 shows
a simple XML-encoded NETCONF message. Note the structure and message-id linkage. We
cover more specifics in the “How to Implement NETCONF” section.
Transport
When you are dealing with the transfer of critical infrastructure configurations, it is manda-
tory to keep the information private and secure. This was a shortcoming in version 1 and
2c of the Simple Network Management Protocol (SNMP). SNMP did not have inherent
security for the authentication of management systems or the encryption of the data being
transferred. SNMPv3 transformed the protocol by bringing both authentication and encryp-
tion. The IETF NETCONF working group planned security as a foundational requirement
for NETCONF. The initial RFC 4741 stipulated that NETCONF connections must provide
authentication, data integrity, and confidentiality. It further provided that connections may
be encrypted with Transport Layer Security (TLS) or Secure Shell (SSH). The most widely used implementation is NETCONF over an SSHv2 transport. An additional option was the Blocks Extensible Exchange Protocol (BEEP); however, it was not widely supported and is not suggested for future implementations.
RFC 7589 has brought additional functionality in TLS with Mutual X.509 Authentication.
interface GigabitEthernet##INT_NUM##
description ##DESCRIPTION##
switchport access vlan ##VLAN_NUM##
switchport mode access
switchport port-security maximum 10
switchport port-security
switchport port-security aging time 10
switchport port-security aging type inactivity
no logging event link-status
load-interval 30
storm-control broadcast level pps 100
storm-control multicast level pps 2k
storm-control action trap
spanning-tree portfast
Figure 11-3 Sample Configuration Template
Or maybe you’ve deployed templatized configurations using the configuration management functions of commercial tools, like Prime Infrastructure, as in Figure 11-4.
slightly different configuration template for the standby router(s). An atomic configura-
tion management solution would not recognize the interdependencies and would allow
you to push the changes in a way that may not function or could cause a larger service
outage.
Conversely, model-driven configuration management strives to abstract the individual
device configurations in favor of the deployment of a service or feature. The result-
ing configuration(s) generated may involve multiple devices as they form the basis for
deploying the service/feature. The model is either fully deployed successfully, or it is
rolled back or removed entirely so that no partial configuration fragments are left on
any devices.
Another well-known but more advanced example would be deploying Multiprotocol Label
Switching (MPLS). Figure 11-5 depicts a standard architecture for MPLS.
Figure 11-5 Standard MPLS Architecture
Consider how a virtual routing and forwarding (VRF) feature is deployed. It involves several
device types and functions. In a typical MPLS deployment, a customer edge (CE) router per-
forms local routing and shares routing information with the provider edge (PE) where routing
tables are virtualized. PE routers encapsulate traffic, marking it to identify the VRF instance,
and transmit it across the service provider backbone network over provider devices to the
destination PE router. The destination PE router then decapsulates the traffic and forwards it
to the CE router at the destination. The backbone network is completely transparent to the
customer equipment, allowing multiple customers or user communities to use the common
backbone network while maintaining end-to-end traffic separation.
Now consider the previous architecture with a bit more configuration definition.
In Figure 11-6 note the commonalities across the highlighted devices.
(Figure 11-6, not reproduced here, overlays the CE and PE CLI on the topology: VRFs are defined with ip vrf, rd, and matching route-target import/export values; ip vrf forwarding and IP addressing are applied to the connecting interfaces and subinterfaces; OSPF runs within the VRF on the customer side; and BGP address-family ipv4 vrf peering with mpls bgp forwarding ties the CE and PE together.)
TIP If you want to get some hands-on experience with the feature but would rather not use
your own equipment (or don’t have dev-test equipment), then check out the DevNet Always
On Labs and Sandbox environments at https://developer.cisco.com/site/sandbox/. There are
shared Always On environments for IOS XE, IOS XR, and NX-OS to use.
Let’s get started by enabling the feature and work from there.
You can obtain detailed NETCONF session data with the command shown in Example 11-2.
Example 11-2 Using show netconf-yang to Get Detailed Session Data
Number of sessions : 1
session-id : 10
transport : netconf-ssh
username : netconf-admin
source-host : 2001:db8::110
login-time : 2021-04-15T10:22:13+00:00
in-rpcs : 0
in-bad-rpcs : 0
out-rpc-errors : 0
out-notifications : 0
global-lock : None
Finally, if you’re interested in the amount of NETCONF feature use, you can use the com-
mand shown in Example 11-3.
Example 11-3 Using show netconf-yang to Get Feature Statistics
in-rpcs : 0
in-bad-rpcs : 0
out-rpc-errors : 0
out-notifications : 15
in-sessions : 124
dropped-sessions : 0
in-bad-hellos : 0
The VRF and access-list definition should be defined to suit your environment. The last com-
mand, ssh server netconf port, can be omitted if you use the default port 830; otherwise,
include the custom port desired.
To view the NETCONF (and YANG) statistics, you can use the command shown in Example 11-4.
Example 11-4 Using show netconf-yang to Extract Statistics
Additionally, you can obtain detailed session and client information with the following
command:
<capability>http://cisco.com/ns/cisco-xe-ietf-ip-
deviation?module=cisco-xe-ietf-ip-deviation&revision=2016-08-10</
capability>
<capability>http://cisco.com/ns/cisco-xe-ietf-ipv4-unicast-routing-
deviation?module=cisco-xe-ietf-ipv4-unicast-routing-deviation&revis
ion=2015-09-11</capability>
<capability>http://cisco.com/ns/cisco-xe-ietf-ipv6-unicast-routing-
deviation?module=cisco-xe-ietf-ipv6-unicast-routing-deviation&revis
ion=2015-09-11</capability>
. . . (MANY lines omitted) . . .
<capability>urn:ietf:params:xml:ns:yang:ietf-netconf-with-
defaults?module=ietf-netconf-with-defaults&revision=2011-06-01</
capability>
<capability>
urn:ietf:params:netconf:capability:notification:1.1
</capability>
</capabilities>
<session-id>21</session-id></hello>]]>]]>
Some items of note: the usual XML declaration line is followed by a <hello> element, fol-
lowed by a <capabilities> list element of many <capability> child elements. Finally, there’s a
session identifier (21 in this case), some closing tags, and a unique session ending sequence:
]]>]]>
This sequence is important for both manager and managed device to recognize the end of an
exchange.
To (manually) interact with the NETCONF agent, you must send your capabilities also, or
the agent will not respond to any other input. For example, you can send into the SSH termi-
nal the text (pasted) shown in Example 11-6.
Example 11-6 NETCONF Hello from Management Station
You do not see any results from the device, but it registers your capabilities (NETCONF
base 1.0) and awaits your next directives.
NETCONF uses RPCs to wrap the directives. Let’s assume the next thing you want to do is
obtain the running configuration from the device. Using this method is a good way to under-
stand the structure and syntax of an XML-encoded configuration. You would send the code
shown in Example 11-7 in your SSH session.
NOTE The device may time out after a few minutes of nonactivity. If so, reconnect and
resend the capabilities exchange.
In this case, the device immediately responds with the running configuration in a compact
XML form, as shown in Example 11-8.
Example 11-8 NETCONF Response from Device
Running this response through an XML formatting utility may be useful to discern the
structure.
NOTE Don’t copy and paste the XML-formatted configuration of one of your devices into
an online XML utility. It is hard to say whether those free online services are retaining your
data input. One of the easiest ways to view this without an external or a third-party resource
is to save the output, minus the ]]>]]> closing sequence, to a text file with an .xml file exten-
sion and then load that text file into your browser. The browser renders it formatted and
enables you to expand or close branches of configurations. Figure 11-7 shows the results of
a browser processing the compact XML output from Example 11-8.
<native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">
<interface>
<GigabitEthernet>
<name>2</name>
<description>
The intermediary parts are nonessential and can be omitted. You only need to send in the
hierarchy and the element being changed. Example 11-10 shows how to change Gigabit-
Ethernet2’s port description.
When you push that into an active NETCONF session (with the capabilities exchange
already being complete), you get the following result from the device:
Or, more appropriately to your network programmability interest, you could execute the
same <get-config> operation as before and get the configuration back in XML form, as seen
in Figure 11-9, rendered in an XML editor.
YANG Models
NETCONF provides the protocol layer for managing device configurations in a program-
matic and consistent way, but to be more effective, a data modeling language needed to
be paired with NETCONF. The Yet Another Next Generation, or YANG, data modeling
language provides a standardized way to represent operational and configuration data of a
network device. YANG is protocol independent and can be converted into any encoding for-
mat, such as XML or JSON. You may also hear YANG referred to as a data model or even
device data.
Similar to NETCONF, YANG was born from an IETF working group. The NETMOD work-
ing group was charged with creating a “human-friendly” modeling language to define
semantics of operational data, configuration data, notifications, and operations. The original
version of YANG was defined in RFC 6020 with an update (version 1.1) in RFC 7950. RFC
6991 defined “Common YANG Data Types,” such as seen in Table 11-3. Although SNMP is
not a foundational component of the DEVCOR exam, the SMIv2 types are shown for com-
parison for those with SNMP background.
Separate Internet types, ietf-inet-types, were also defined and can be seen in Table 11-4.
NETMOD also worked on foundational configuration models such as system, interface, and
routing. These models are part of a class of open models that are meant to be independent
from the platform. The intent is also to normalize configuration syntax across vendors. Open
YANG models are developed by the IETF, the ITU agency, the OpenConfig consortium, and
other standards bodies. Conversely, native models are developed by equipment manufactur-
ers. The models may be specialized to features or syntax specific to that device type, model,
or platform.
Besides being the data modeling language for NETCONF, YANG also became associated
with RESTCONF and gRPC, which is covered in Chapter 12, “Model-Driven Telemetry,” for
streaming telemetry.
Let’s look at some of the structure of YANG modules by using a familiar construct
such as device interfaces. Review the IETF’s implementation of the interface’s YANG
model at https://github.com/YangModels/yang/blob/master/standard/ietf/RFC/
ietf-interfaces.yang.
As we review Figure 11-10, you’ll see some easy-to-understand constructs.
In this figure, you can see a module name, ietf-interfaces, with version, namespace, and pre-
fix. The standard ietf-yang-types, mentioned earlier, are imported for use in this model, so
data types are recognized. If you navigate down to the interfaces section as in Figure 11-11,
you see more familiar structure.
After installing the pyang utility, you can use it to review the ietf-interfaces.yang module.
First, you need to download the YANG module; then you use the utility to read it in a tree
format. Because you’re looking at the GitHub ietf-interfaces YANG model, note that the
official reference of https://github.com/YangModels/yang/blob/master/standard/ietf/RFC/
ietf-interfaces.yang is actually a symbolic link to a date-referenced version that aligns to RFC
updates. As of this writing, the latest correct reference is https://raw.githubusercontent.com/
YangModels/yang/master/standard/ietf/RFC/ietf-interfaces%402018-02-20.yang; this is what
you should reference to download with wget, as shown in Example 11-13.
Example 11-13 Using the wget Utility to Download the YANG Model
ietf-interfaces@2018-02-20.yang        100%[==========================================>]  38.44K  --.-KB/s    in 0.008s
Looking at Figure 11-12 with annotations, you can more easily understand the structure,
syntax, and data-type representations.
(The annotations in Figure 11-12 mark the interfaces container as read-writable: a list of "interface" objects keyed by "name." A related layered diagram shows RESTCONF Content as configuration or operational data encoded in XML or JSON, with Operations/Actions of GET, POST, PUT, PATCH, and DELETE.)
RESTCONF Operations
Similar to RESTful APIs, RESTCONF uses a familiar request/response model where the
management tool or application makes a request to the network device that provides the
response. Ideally, you get a familiar HTTP 200 OK response!
The use of REST APIs also parallels create, retrieve, update, delete (CRUD)–style operations.
Table 11-5 shows a mapping of NETCONF operations to HTTP operation and RESTCONF
equivalents.
Content-Type: application/yang-data+json
Content-Type: application/yang-data+xml
And here are the options to define the desired data type to be returned:
Accept: application/yang-data+json
Accept: application/yang-data+xml
Example 11-14 shows an example of generating a basic authentication string with openssl on
macOS. Try it on your system.
Example 11-14 Generating Base64 Basic Authentication String on a Mac
The basic authentication mechanism does not provide confidentiality of the transmitted
credentials. Base64 is only encoding data; it is not equivalent to encryption or hashing. So,
basic authentication must be used in conjunction with HTTPS to provide confidentiality
through an encrypted transport session, as is required in RESTCONF.
RESTCONF URIs
All RESTCONF URIs follow this format:
https://<DeviceNameOrIP>/<ROOT>/data/<[YANGMODULE:]CONTAINER>/
<LEAF>[?<OPTIONS>]
The components are explained in Table 11-6.
Component                Explanation
[YANGMODULE:]CONTAINER   The base model container being used; inclusion of the module name is optional.
LEAF                     An individual element from within the container.
[?<OPTIONS>]             Query parameters that modify or filter returned results; see Table 11-7.
Optional query parameters or options are explained in Table 11-7.
In your situation, use the --insecure option if using a self-signed certificate on the managed
device, or remove it if using registered certificates. For the DevNet Sandbox environment,
the NETCONF agent is purposely registered on port 9443, which is different than yours if
using defaults. Also, your specific username-password combination for basic authentication
will be different.
Note that the device returns XML-formatted data with an XRD element. There also may
be multiple Link child elements; if so, the one of note is the entry with the rel attribute of
restconf. In this case, the href attribute value of /restconf/ is the information you need,
matching the normal convention.
<interface>
<name>GigabitEthernet3</name>
<description>Configured by RSTconf</description>
<type xmlns:ianaift="urn:ietf:params:xml:ns:yang:i
ana-if-type">ianaift:ethernetCsmacd</type>
<enabled>true</enabled>
<ipv4 xmlns="urn:ietf:params:xml:ns:yang:ietf-ip">
<address>
<ip>10.255.244.14</ip>
<netmask>255.255.255.0</netmask>
</address>
</ipv4>
<ipv6 xmlns="urn:ietf:params:xml:ns:yang:ietf-ip">
</ipv6>
</interface>
<interface>
<name>Loopback1000</name>
<description>Added with Restconf</description>
<type xmlns:ianaift="urn:ietf:params:xml:ns:yang:i
ana-if-type">ianaift:softwareLoopback</type>
<enabled>true</enabled>
<ipv4 xmlns="urn:ietf:params:xml:ns:yang:ietf-ip">
<address>
<ip>1.1.5.5</ip>
<netmask>255.255.255.255</netmask>
</address>
</ipv4>
<ipv6 xmlns="urn:ietf:params:xml:ns:yang:ietf-ip">
</ipv6>
</interface>
<interface>
<name>Loopback1313</name>
<description>'test interface'</description>
<type xmlns:ianaift="urn:ietf:params:xml:ns:yang:i
ana-if-type">ianaift:softwareLoopback</type>
<enabled>false</enabled>
<ipv4 xmlns="urn:ietf:params:xml:ns:yang:ietf-ip">
<address>
<ip>13.13.13.13</ip>
<netmask>255.255.255.255</netmask>
</address>
</ipv4>
<ipv6 xmlns="urn:ietf:params:xml:ns:yang:ietf-ip">
</ipv6>
</interface>
</interfaces>
How many interfaces do you see in the DevNet Sandbox IOS XE on CSR Recommended
Code Always On lab device? How many interfaces are enabled?
If you had another device with many interfaces, this could be a much larger result! You
can get a subset of this information by specifying a desired leaf, as with Example 11-17,
using the following URI: https://sandbox-iosxe-latest-1.cisco.com:443/restconf/data/
ietf-interfaces:interfaces/interface=GigabitEthernet2.
Example 11-17 Querying for Interface-Specific Results with RESTCONF and cURL
Excellent! Now you can try your hand at modifying a configuration using RESTCONF.
On the Authorization tab, you should see Inherit Auth from Parent. The operation should be
GET, and the request URL should be https://sandbox-iosxe-latest-1.cisco.com:443/restconf/
data/ietf-interfaces:interfaces/interface=GigabitEthernet2.
Now, if you execute the request by clicking the Send button, you get results similar to
Figure 11-16.
import requests
url = "https://sandbox-iosxe-latest-1.cisco.com:443/restconf/data/
ietf-interfaces:interfaces/interface=GigabitEthernet2"
payload={}
headers = {
'Authorization': 'Basic ZGV2ZWxvcGVyOkMxc2NvMTIzNDU='
}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)
How cool is that!? This code can serve to seed a more sophisticated Python script. Of
course, you should do some appropriate security cleanup, such as removing the credentials
from being fixed in your script, but it’s a start!
Let’s do one more GET operation to exercise your understanding of using query options
(parameters). Now create a copy of the earlier Postman request, but this time onto the end of
the URI, add ?fields=name;description;enabled. Your entry (and execution) should be simi-
lar to Figure 11-17.
Figure 11-17 Executing a Postman Request to GET Interface with Query Option
Now that you’ve rebuilt the GET method in Postman, you can pivot to doing a PUT opera-
tion and pushing an updated interface description into the device, as you did earlier with
NETCONF means.
Create a copy of the GET Interface request, but save it as PUT Interface Description. In the
new request, change the request type from GET to PUT. Select the Headers tab and create
or modify the Content-Type header to a value of application/yang-data+xml. On the Body
tab, select the Raw option and paste the following:
<interface xmlns="urn:ietf:params:xml:ns:yang:ietf-interfaces" xml
ns:if="urn:ietf:params:xml:ns:yang:ietf-interfaces">
<name>GigabitEthernet2</name>
<description>Updated via RESTCONF</description>
<type xmlns:ianaift="urn:ietf:params:xml:ns:yang:iana-if-
type">ianaift:ethernetCsmacd</type>
</interface>
You can make the description anything you like. After you click Send, the results should
appear similar to Figure 11-18.
Figure 11-18 Creating and Executing a Postman Request to PUT Interface Description
Again, you can use the Postman Code feature to generate a Python-Requests code snippet,
allowing you to create more robust scripts that would programmatically change the device
configuration with RESTCONF via Python.
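For reference, a hedged Python equivalent of that PUT is sketched below. It mirrors the Postman request; the sandbox URL, credentials, and description text are placeholders to substitute for your environment.

import requests

url = ("https://sandbox-iosxe-latest-1.cisco.com:443/restconf/data/"
       "ietf-interfaces:interfaces/interface=GigabitEthernet2")

payload = """
<interface xmlns="urn:ietf:params:xml:ns:yang:ietf-interfaces">
  <name>GigabitEthernet2</name>
  <description>Updated via Python and RESTCONF</description>
  <type xmlns:ianaift="urn:ietf:params:xml:ns:yang:iana-if-type">ianaift:ethernetCsmacd</type>
</interface>
"""

headers = {
    "Content-Type": "application/yang-data+xml",
    "Accept": "application/yang-data+xml",
}

# Placeholder credentials; use your own sandbox or device login.
response = requests.put(url, data=payload, headers=headers,
                        auth=("developer", "C1sco12345"), timeout=30)
print(response.status_code)        # 204 No Content indicates a successful update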
Try your hand at getting information in JSON format. Remember that you’ll need to change
the Accept header for a GET operation to application/yang-data+json. See Figure 11-19 as
the JSON equivalent to the operation performed in Figure 11-16.
Hopefully, you’re seeing the prospects of using RESTCONF to programmatically obtain and
change device settings.
■ community.yang.fetch
■ community.yang.get
■ community.yang.configure
■ community.yang.generate_spec
In the summer of 2020, work was started on the ansible.netcommon.restconf HttpApi plug-
in for devices supporting RESTCONF API.
If you are a Puppet user, check out this DevNet Code Exchange entry that uses Puppet and
NETCONF to manage IOS XR-based devices: https://developer.cisco.com/codeexchange/
github/repo/cisco/cisco-yang-puppet-module/.
The previous examples of using Postman and the Code generator, especially for Python-
Requests code snippets, should be helpful to get you started with RESTCONF options. Con-
sider the next time you have a need to change all Syslog and SNMP trap receivers in your
environment. In the past, have you simply jammed the new ones in the configurations and
held off on removing the old ones? Were you sure you got all decommissioned options out
of the configuration? By using the newer manageability protocols and some simple Python
scripting, you will be able to execute mass configuration updates and do it quickly and
effectively!
References
URL QR Code
https://developer.cisco.com/site/sandbox/
https://github.com/YangModels/yang/blob/master/standard/ietf/
RFC/ietf-interfaces.yang
https://github.com/mbj4668/pyang
https://devnetsandbox.cisco.com
https://www.postman.com
https://www.postman.com/ciscodevnet
https://www.cisco.com/c/en/us/products/cloud-systems-
management/network-services-orchestrator/index.html
https://developer.cisco.com/site/nso/
https://developer.cisco.com/yangsuite/
https://developer.cisco.com/codeexchange/github/repo/cisco/
cisco-yang-puppet-module/
Model-Driven Telemetry
■ Scaling with the Push Model: This section covers the intentions of model-driven
telemetry (MDT).
■ How to Implement Model-Driven Telemetry: This section covers how MDT is config-
ured and used on various Cisco platforms.
■ Picking Sensor Paths and Metrics: This section covers the methodology in determin-
ing sensor paths and metrics used to monitor an MDT-enabled environment.
This chapter maps to the Developing Applications Using Cisco Core Platforms and APIs
v1.0 (350-901) Exam Blueprint Section 5.1, “Explain considerations of model-driven telem-
etry (including data consumption and data storage).”
Network engineering and operations have seen many changes over the years. Despite the
standardization and adoption of the Simple Network Management Protocol (SNMP), which
was often mocked as being anything but simple, change needed to happen. Even as early
as mid-2002, the IETF recognized that the current protocols were insufficient for future
growth and innovation. The concerns were documented in Informational RFC 3535 (https://
tools.ietf.org/html/rfc3535). To that end, innovation has happened in this space, which can
be seen in the development and adoption of model-driven telemetry (MDT). MDT pro-
vides a fresh perspective at obtaining network inventory, performance, and fault informa-
tion using an efficient push model over the traditional request-response methods seen in
SNMP. Indeed, some innovators in the cloud space largely bypassed SNMP in lieu of MDT
approaches. At NANOG 73 (June 2018), Google shared its experience moving away from
SNMP to MDT, as seen in its presentation link (https://pc.nanog.org/static/published/
meetings//NANOG73/daily/day_2.html#talk_1677).
In this chapter, first we review the benefits of adopting model-driven telemetry and then
weigh options for our purposes.
1. When you are comparing the different capabilities of SNMP with model-driven telem-
etry, which is true?
a. SNMPv1 data security is similar to MDT data security.
b. SNMPv2c data security is similar to MDT data security.
c. SNMPv1 and 2c data security is similar to MDT data security.
d. SNMPv3 data security is similar to MDT data security.
2. MDT gRPC uses which application protocol?
a. HTTP
b. HTTP/2
c. SPDY
d. SCP
e. sFTP
3. Considering MDT dial-out mode, which statement is true?
a. The device is the client and source pushing data; the telemetry receiver is the
server.
b. The device is passive; the telemetry receiver initiates the connection, requesting
the data.
c. The device is active; the telemetry receiver initiates the connection, requesting the
data.
Foundation Topics
Transformation of Inventory, Status, Performance, and
Fault Monitoring
As discussed earlier, the stone age of network IT involved manual effort by logging in to a
device through a serial console cable and using Telnet or (eventually) SSH. The process of
executing show commands worked fine for checking device inventory, status, and perfor-
mance and for fault monitoring in smaller environments. However, as network device counts
increased, the need for automation and network programmability became clear. Would you
want to manage a network of 230,000 devices using command-line methods of SSH and
show commands? Of course not!
Some improvements over the CLI were gained with management tools; a progression was to use SNMP to access similar instrumentation. For example, the ENTITY-MIB could
identify much of a device’s inventory. The IF-MIB could show interface statistics. But a more
effective approach for fault monitoring was needed—ideally, in a “push me a notification”
model.
SNMP notifications (traps) and syslog event messages were useful for fault monitoring.
Unfortunately, their scope and information depth were different. Let’s consider a popular
legacy platform: the Catalyst 6500. It supported about 100 SNMP notifications, but over
8,000 syslog event messages. Environments that were focused on just SNMP notifications
missed a lot of event visibility. Additionally, SNMP notifications did not have severity identi-
fiers unless the Management Information Base (MIB) developer intentionally put a varbind
(variable binding) into a notification type that included a severity declaration.
Syslog event messages were great; they had severity identifiers and were written with gener-
ally readable facility, severity, mnemonic, and message text, as in this example:
All of these issues were unacceptable to growing networks that needed automation and
network programmability. Addressing the need for network monitoring, streaming telem-
etry provided an alternative to SNMP. The generic term streaming telemetry is used when
referring to model-driven telemetry (MDT) or event-driven telemetry (EDT). With
streaming telemetry, the network element pushes, or “streams,” the instrumentation or data
measurement to a streaming telemetry collector (or controller). Model-driven telemetry
uses YANG models that define the structure and instrumentation of a device or service
model. MDT is commonly sample-interval-based, also known as cadence-based. EDT also
depends on YANG models, but rather than sample-interval-based, its data is pushed upon
change of an object or data measurement. Both employ the concept of sensor path,
which defines the object, instrumentation, or data measurement of concern. This is an ana-
log to SNMP’s MIB object.
Telemetry applications can subscribe to specific data items they need, by using standards-
based YANG data models. The transport can be based on NETCONF, RESTCONF, gRPC,
or gRPC Network Management Interface protocols. Subscriptions can also be created by
using CLIs if they are configured subscriptions.
Google Remote Procedure Call (gRPC) is a remote-procedure call (RPC) system devel-
oped by Google in 2015. gRPC uses HTTP/2 for transport with TLS and token-based authentica-
tion. The HTTP/2 support provides low latency and scalability beyond SNMP implementations.
With support of the OpenConfig community in 2017, Google also introduced Google Remote
Procedure Call (gRPC) Network Management Interface (gNMI). gNMI focuses on config-
uration management and state retrieval. It complements telemetry in that regard. The expectation
is that gNMI will take more prominence as more platforms adopt support.
Within telemetry, structured data is published at a defined cadence, or on-change, based on
the subscription criteria and data type. Streaming telemetry has several benefits, as seen in
Table 12-2.
When you have a high-level understanding of streaming telemetry being pushed from a net-
work element, the next question is “What is the telemetry receiver or controller it is pushing
to?” Several options are available. On the commercial software side, the Cisco Crosswork
solution provides a telemetry receiver (https://www.cisco.com/c/en/us/products/collateral/
cloud-systems-management/crosswork-network-automation/datasheet-c78-743287.html ).
A fine open-source solution for consuming streaming telemetry is the TIG stack—Telegraf,
InfluxDB, and Grafana. We cover the TIG stack in a later section with practical examples.
Other options are Apache Kafka and the Cloud Native Computing Foundation’s hosted
project named Prometheus.
operational configuration and state can be shown, if needed. The telemetry subscriptions
are created and managed in a central location, characteristic of one of the foundational prin-
ciples of software-defined networking—centralized controllers. This centralization of the
configuration management function serves to benefit scale and reduce operational burden.
(A figure here, not reproduced, depicts network elements acting as providers streaming telemetry to one or more telemetry receivers/controllers acting as subscribers.)
■ Session: One or more telemetry receivers collect the streamed data using a dial-in or
dial-out mode.
■ Dial-in mode: The receiver “dials in” to the network element and subscribes dynami-
cally to one or more sensor paths. This process effectively creates a configuration
change on the network element (although it may not be rendered in the device’s run-
ning configuration). The network element acts as the provider, and the receiver is the
subscriber (client). The network element streams telemetry data through the same
session. The dial-in mode of subscriptions is dynamic. This dynamic subscription
terminates when the receiver-subscriber cancels the subscription or when the session
terminates. Figure 12-4 depicts a dial-in mode, compared with a dial-out mode.
(Figure 12-4, not reproduced here, contrasts the two modes: both use the TCP three-way handshake of SYN, SYN-ACK, and ACK, and they differ in which side initiates the connection and subscription.)
■ Sensor path: The sensor path describes a YANG path or a subset of data definitions in
a YANG model with a container. In a YANG model, the sensor path can be specified
to end at any level in the container hierarchy (for example, Cisco-IOS-XR-infra-statsd-
oper:infra-statistics/interfaces/interface/latest/generic-counters).
■ Subscription: A configuration session that binds one or more sensor paths to destina-
tions and specifies the criteria to stream data. In time- or cadence-based telemetry,
data is streamed continuously at a configured frequency.
■ Transport and encoding: The network element streams telemetry data using a trans-
port mechanism. The generated data is encapsulated into the desired format using
encoders.
Encoding (Serialization)
Several encoding, or data serialization, mechanisms are used in telemetry. XML is a legacy format with inefficient encoding due to the verbosity of its open and close tags. It has a strong
history in dynamic content management, especially with web publishing because of its align-
ments with HTML and CSS.
JSON is widely adopted in software development projects because of its ease of human
readability, coupled with excellent programmatic use. It is an efficient encoding method that
is seen across many aspects of IT for configuration, inventory, and other basic data encoding.
Google Protocol Buffer (GPB), sometimes colloquially called protobufs, is another data
encoding method to serialize structured data. Google created it with a design intended for
simplicity and performance. Its encoding and efficiency beat XML in all regards and do
well against JSON. However, the efficiency brings operational complexity because it is not
very human-readable. Interpreting GPB-encoded structured data requires a type of “secret
decoder ring” in the form of a .proto file that defines the mappings and data types. Com-
pact GPB was the original format seen with Cisco IOS XR 6.0.0. kvGPB is a derivative—a
self-describing key-value pair Google Protocol Buffer. kvGPB is trending to be the de facto
encoding of choice. It is easier to interpret because the key-value pairings are typically
human-readable.
Figure 12-5 shows these data encodings side by side for comparison.
Protocols
The next topic of consideration is the type of protocol used. The protocol that is used for
the connection between a telemetry publishing device and the telemetry receiver decides
how the data is sent. This protocol is referred to as the transport protocol and is indepen-
dent of the management protocol for configured subscriptions. The transport protocol also
impacts the encoding used. Telemetry implementations can span several options, depend-
ing on platform support—NETCONF, RESTCONF, HTTP, TCP, UDP, secure UDP (DTLS),
gRPC, and gNMI.
NETCONF and RESTCONF are typically associated with configuration management func-
tions, but they can also be used to extract operational data. The NETCONF protocol is
available on IOS XE platforms for the transport of dynamic subscriptions and can be used
with yang-push and yang-notif-native streams using XML encoding.
HTTP is observed as an option since NX-OS 7.0 with JSON encoding. HTTPS is also sup-
ported if a certificate is configured.
TCP is supported for dial-out only on Cisco IOS-XR platforms since Release 6.1.1 on the
64-bit Linux-based platforms, such as NCS5500, NCS5000, and ASR9000, and the 32-bit
IOS XR platforms, such as CRS and legacy ASR9000.
UDP is supported also for dial-out only on IOS-XR platforms since Release 6.1.1 on the
64-bit Linux-based platforms, such as NCS5500, NCS5000, and ASR9000. Starting with
Cisco NX-OS Release 7.0(3)I7(1), UDP and secure UDP (DTLS) are supported as telemetry
transport protocols. You can add destinations that receive UDP. The encoding for UDP and
secure UDP can be GPB or JSON.
gRPC is a preferred transport protocol and is supported on IOS-XR platforms since Release
6.1.1 on the 64-bit Linux-based platforms, such as NCS5500, NCS5000, and ASR9000. It is
also broadly supported on the common IOS XE-based platforms, such as the Catalyst 9000
Series.
Because of the variety of support and dependencies, Table 12-4 represents a matrix of
mode, encoding, and telemetry protocols.
NOTE The yang streaming table row references are specific to IOS XE platforms.
As previously mentioned, the sensor path, transport, and encoding are key considerations in
enabling MDT. The mode, whether dial-out or dial-in, must also be determined.
NOTE From IOS-XR Release 6.1.1, Cisco introduced support for the 64-bit Linux-based
IOS XR operating system. The 64-bit platforms, such as NCS5500, NCS5000, and ASR9000,
support gRPC, UDP, and TCP protocols. All 32-bit IOS XR platforms, such as CRS and
legacy ASR9000, support only the TCP protocol. TCP, in this sense, uses the familiar three-
way handshaking without the benefits of gRPC, which also uses TCP, but with HTTP/2 for
advanced traffic forwarding.
■ Create a destination group.
telemetry model-driven
destination-group <group-name>
vrf <vrf-name>
address family ipv4 <IP-address> port <port-number>
encoding <encoding-format>
protocol <transport>
commit
NOTE Your implementation may not require virtual routing and forwarding (VRF), so the
next command may not apply in your implementation.
Example 12-2 shows the destination group DestGroup1 created for TCP dial-out configura-
tion with encoding of self-describing-gpb (reflecting key-value Google Protocol Buffers).
Example 12-2 Destination Group for TCP Dial-Out
RP/0/RP0/CPU0:ios# run
Mon Feb 24 13:10:13.713 UTC
[xr-vm_node0_RP0_CPU0:~]$ls -l /misc/config/grpc/dialout/
total 4
-rw-r--r-- 1 root root 4017 Feb 19 14:22 dialout.pem
[xr-vm_node0_RP0_CPU0:~]$
The CommonName (CN) used in the certificate must be configured similarly to the previ-
ous example with an additional protocol parameter defined as protocol grpc tls-hostname
<CommonName>.
For dev-test environments or where you want to bypass the TLS option, modify the protocol
parameter to use protocol grpc no-tls.
telemetry model-driven
sensor-group <group-name>
sensor-path <XR YANG model>
commit
NOTE See “Picking Sensor Paths and Metrics” later in this chapter for additional guidance
on identifying YANG models and sensor paths.
telemetry model-driven
subscription <subscription-name>
sensor-group-id <sensor-group> sample-interval <interval>
destination-id <destination-group>
source-interface <source-interface>
commit
Example 12-7 shows the subscription Subscription1 that is created to associate the sensor
group and destination group and to configure an interval of 20 seconds to stream data.
Time-0 Interface: Up
Time+1 Interface: Up
Time+2 Interface: Up
Time+3 Interface: Up
Consider Example 12-8, where event-based dial-out configuration is used for more efficiency
in monitoring the interface state.
Example 12-8 Configuration Template for Event-Based Dial-Out
telemetry model-driven
destination-group DestGroup1
address family ipv4 192.168.1.10 port 5432
encoding self-describing-gpb
protocol grpc tls-hostname telem-receiver.example.com
!
!
sensor-group SensorGroup1
sensor-path Cisco-IOS-XR-pfi-im-cmd-oper:interfaces/interface-xr/interface/state
!
subscription Subscription2
sensor-group-id SensorGroup1 sample-interval 0
destination-id DestGroup1
!
If you use gRPC dial-out, the results would be slightly different, as seen in Example 12-10.
Example 12-10 Validation for gRPC Dial-Out
■ Enable gRPC.
■ Create a subscription.
Router# configure
Router(config-grpc)# commit
Example 12-11 shows the output of the show grpc command. The sample output displays
the gRPC configuration when TLS is enabled on the network element.
telemetry model-driven
sensor-group <group-name>
sensor-path <XR YANG model>
commit
Example 12-13 shows the sensor group SensorGroup3 created for gRPC dial-in configura-
tion with the OpenConfig YANG model for interfaces.
Example 12-13 Sample Configuration of a Sensor Group for gRPC Dial-In
telemetry model-driven
subscription <subscription-name>
sensor-group-id <sensor-group> sample-interval <interval>
destination-id <destination-group>
commit
Example 12-15 shows the subscription Subscription3 that is created to associate the sensor
group with an interval of 20 seconds to stream data.
Example 12-15 Subscription for gRPC Dial-In
Destination Groups:
Group Id: DialIn_1005
Destination IP: 192.168.1.10
Destination Port: 44841
Encoding: self-describing-gpb
Transport: dialin
State: Active
Total bytes sent: 13909
Total packets sent: 14
Last Sent time: 2020-02-26 14:29:25.131464901 +0000
Collection Groups:
------------------
Id: 2
Sample Interval: 30000 ms
Encoding: self-describing-gpb
Num of collection: 5
Collection time: Min: 24 ms Max: 41 ms
Total time: Min: 24 ms Avg: 41 ms Max: 54 ms
Total Deferred: 0
Total Send Errors: 0
Total Send Drops: 0
Total Other Errors: 0
Last Collection Start: 2020-02-26 14:20:25.134464823 +0000
Last Collection End: 2020-02-26 14:29:24.032887433 +0000
Sensor Path: openconfig-interfaces:interfaces/interface
openconfig-interfaces:interfaces/interface
Cisco-IOS-XR-pfi-im-cmd-oper:interfaces/interface-xr/interface/state
https://github.com/YangModels/yang/tree/master/vendor/cisco
https://yangcatalog.org
<capability>http://cisco.com/ns/yang/Cisco-IOS-XE-bba-group?module=Cisco-IOS-XE-
bba-group&revision=2019-07-01</capability>
<capability>http://cisco.com/ns/yang/Cisco-IOS-XE-bfd?module=Cisco-IOS-XE-bfd&
revision=2020-07-01</capability>
<capability>http://cisco.com/ns/yang/Cisco-IOS-XE-bfd-oper?module=Cisco-IOS-XE-
bfd-oper&revision=2019-05-01</capability>
<capability>http://cisco.com/ns/yang/Cisco-IOS-XE-bgp?module=Cisco-IOS-XE-bgp&
revision=2020-07-01</capability>
<capability>http://cisco.com/ns/yang/Cisco-IOS-XE-bgp-common-oper?module=Cisco-
IOS-XE-bgp-common-oper&revision=2019-05-01</capability>
<capability>http://cisco.com/ns/yang/Cisco-IOS-XE-bgp-oper?module=Cisco-IOS-XE-bgp-
oper&revision=2019-11-01</capability>
<capability>http://cisco.com/ns/yang/Cisco-IOS-XE-bgp-route-oper?module=Cisco-IOS-
XE-bgp-route-oper&revision=2019-05-01</capability>
<capability>http://cisco.com/ns/yang/Cisco-IOS-XE-bridge-domain?module=Cisco-IOS-XE-
bridge-domain&revision=2020-03-01</capability>
[ . . . TRIMMED . . .]
<capability>http://openconfig.net/yang/bgp-policy?module=openconfig-bgp-policy
&revision=2016-06-21&deviations=cisco-xe-openconfig-bgp-policy-deviation</
capability>
<capability>http://openconfig.net/yang/bgp-types?module=openconfig-bgp-types&
revision=2016-06-21</capability>
<capability>http://openconfig.net/yang/cisco-xe-openconfig-if-ip-deviation?module=
cisco-xe-openconfig-if-ip-deviation&revision=2017-03-04</capability>
<capability>http://openconfig.net/yang/cisco-xe-openconfig-interfaces-deviation?
module=cisco-xe-openconfig-interfaces-deviation&revision=2018-08-21</capability>
<capability>http://openconfig.net/yang/cisco-xe-routing-csr-openconfig-platform-
deviation?module=cisco-xe-routing-csr-openconfig-platform-deviation&revis
ion=2010-10-09</capability>
<capability>http://openconfig.net/yang/cisco-xe-routing-openconfig-system-deviation?
module=cisco-xe-routing-openconfig-system-deviation&revision=2017-11-27</
capability>
<capability>http://openconfig.net/yang/fib-types?module=openconfig-aft-types&
revision=2017-01-13</capability>
[ . . . TRIMMED . . . ]
<capability>
urn:ietf:params:netconf:capability:notification:1.1
</capability>
</capabilities>
<session-id>201</session-id></hello>]]>]]>
In the highlights in Example 12-17, you can see that many Cisco vendor-specific YANG mod-
els are supported—more than 100. The highlighted samples should be familiar networking
functions. The search criteria to filter more could be as simple as
<capability>http://cisco.com/ns/yang/Cisco-*
Many OpenConfig YANG models also are supported—almost 60. You can find them with
the following search criteria:
<capability>http://openconfig.net/yang/*
Next, the Python script in Example 12-19 helps extract the YANG module capabilities from
a device. Here, you can use the DevNet IOS XE on CSR Latest Code Always On sandbox
lab environment again.
Example 12-19 Simple Python Script to Extract NETCONF Capabilities
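A minimal sketch of such a script, built on the ncclient library, follows. The hostname and credentials match the DevNet IOS XE Always On sandbox profile (also listed later in Table 12-6), and the standard NETCONF port 830 is assumed.
#!/usr/bin/env python3
from ncclient import manager

# Connect to the DevNet IOS XE Always On sandbox over NETCONF (port 830 assumed)
with manager.connect(
    host="sandbox-iosxe-latest-1.cisco.com",
    port=830,
    username="developer",
    password="C1sco12345",
    hostkey_verify=False,
) as m:
    # Each advertised capability maps to a supported YANG module or NETCONF feature
    for capability in m.server_capabilities:
        print(capability)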
Now, execute this Python script, using the process shown in Example 12-20.
Example 12-20 Executing pynetconf.py Script
http://openconfig.net/yang/bgp?module=openconfig-bgp&revision=2016-06-21
http://openconfig.net/yang/bgp-policy?module=openconfig-bgp-policy&revision=2016-
06-21&deviations=cisco-xe-openconfig-bgp-policy-deviation
http://openconfig.net/yang/bgp-types?module=openconfig-bgp-types&revision=2016-06-21
[ . . . TRIMMED . . . ]
http://openconfig.net/yang/interfaces?module=openconfig-interfaces&revision=2018-
01-05&deviations=cisco-xe-openconfig-if-ip-deviation,cisco-xe-openconfig-interfaces-
deviation,cisco-xe-routing-openconfig-vlan-deviation
http://openconfig.net/yang/interfaces/aggregate?module=openconfig-if-
aggregate&revision=2018-01-05
http://openconfig.net/yang/interfaces/ethernet?module=openconfig-if-
ethernet&revision=2018-01-05
http://openconfig.net/yang/interfaces/ip?module=openconfig-if-ip&revision=2018-
01-05&deviations=cisco-xe-openconfig-if-ip-deviation,cisco-xe-openconfig-interfaces-
deviation
http://openconfig.net/yang/interfaces/ip-ext?module=openconfig-if-ip-
ext&revision=2018-01-05
[ . . . TRIMMED . . . ]
urn:ietf:params:netconf:capability:notification:1.1
In this example, you can again see the Cisco vendor-specific and OpenConfig models. To
obtain telemetry, you map the function—BGP, interfaces, AAA, ARP, and so on—to the
YANG model name as a reference. The *-oper YANG models are focused on operational sta-
tistics, so you can start there when looking for metrics you might associate with dashboards
when showing volume, consumption, performance, and so on.
The following list provides some popular YANG models you might be interested in trying
out when getting started. A mixture of IOS-XR and IOS-XE models is listed. First, note the
model name before the colon (:) and the subtree path afterward. This combination is supplied
in the sensor path (or filter path) definition when creating a telemetry configuration. The pre-
ceding section includes references on configuration templates.
■ Health
■ Cisco-IOS-XR-wdsysmon-fd-oper:system-monitoring/cpu-utilization
■ Cisco-IOS-XR-nto-misc-oper:memory-summary/nodes/node/summary
■ Cisco-IOS-XR-shellutil-oper:system-time/uptime
■ Cisco-IOS-XR-telemetry-model-driven-oper:telemetry-model-driven
■ Cisco-IOS-XE-process-cpu-oper:cpu-usage/cpu-utilization/five-seconds
■ Cisco-IOS-XE-environment-oper
■ Cisco-IOS-XE-memory-oper:memory-statistics
■ Cisco-IOS-XE-platform-software-oper
■ Cisco-IOS-XE-process-memory-oper
■ Interfaces
■ Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/generic-counters
■ Cisco-IOS-XR-ipv6-ma-oper:ipv6-network/nodes/node/interface-data/vrfs/vrf/global-briefs/global-brief
■ Cisco-IOS-XR-pfi-im-cmd-oper:interfaces/interface-summary
■ Cisco-IOS-XR-pfi-im-cmd-oper:interfaces/interface-xr/interface
■ Cisco-IOS-XR-ethernet-lldp-oper:lldp/global-lldp/lldp-info
■ Cisco-IOS-XR-ethernet-lldp-oper:lldp/nodes/node/interfaces/interface
■ Cisco-IOS-XR-ethernet-lldp-oper:lldp/nodes/node/neighbors/details/detail
■ ietf-interfaces
■ Openconfig-interfaces
■ Openconfig-network-instance
■ Inventory
■ Cisco-IOS-XR-plat-chas-invmgr-oper:platform-inventory/racks/rack
■ Optics
■ Cisco-IOS-XR-controller-optics-oper:optics-oper/optics-ports/optics-port/optics-info
■ Routing
■ Cisco-IOS-XR-clns-isis-oper:isis/instances/instance/levels/level/adjacencies/adjacency
■ Cisco-IOS-XR-clns-isis-oper:isis/instances/instance/statistics-global
■ Cisco-IOS-XR-ip-rib-ipv4-oper:rib/vrfs/vrf/afs/af/safs/saf/ip-rib-route-table-names/ip-rib-route-table-name/protocol/isis/as/information
■ Cisco-IOS-XR-ipv4-bgp-oper:bgp/instances/instance/instance-active/default-vrf/process-info
■ ietf-ospf
■ ietf-routing
■ Openconfig-routing-policy
■ Cisco-IOS-XR-mpls-te-oper:mpls-te/tunnels/summary
■ Cisco-IOS-XR-ip-rsvp-oper:rsvp/interface-briefs/interface-brief
■ Cisco-IOS-XR-ip-rsvp-oper:rsvp/counters/interface-messages/interface-message
Verify Docker is working properly by pulling down and running a test container, as shown in
Example 12-21.
Example 12-21 Verifying Docker Operation with Hello World
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
Next, you generate self-signed certificates by running a supplied shell script, as described in
Example 12-23.
Example 12-23 Generating Self-Signed Certificates in YANG Suite
mylinuxvm$ cd yangsuite/docker/
mylinuxvm$ ./gen_test_certs.sh
################################################################
## Generating self-signed certificates... ##
## ##
## WARNING: Obtain certificates from a trusted authority! ##
## ##
## NOTE: Some browsers may still reject these certificates!! ##
################################################################
Finally, you bring up the docker container by using docker-compose, with Example 12-24
providing the guidance.
Example 12-24 Starting YANG Suite from the Docker Container
mylinuxvm$ docker-compose up
Creating network "docker_default" with the default driver
Creating volume "docker_static-content" with default driver
Creating volume "docker_uwsgi" with default driver
Building yangsuite
Sending build context to Docker daemon 12.29kB
Step 1/19 : FROM ubuntu:18.04
18.04: Pulling from library/ubuntu
4bbfd2c87b75: Pull complete
d2e110be24e1: Pull complete
889a7173dcfe: Pull complete
Digest: sha256:67b730ece0d34429b455c08124ffd444f021b81e06fa2d9cd0adaf0d0b875182
Status: Downloaded newer image for ubuntu:18.04
[ . . . TRUNCATED . . .]
Downloading the container’s operating system (Ubuntu 18.04) and other dependencies, such
as Python3.6 and nginx, may take several minutes.
After all the downloads are finished and activated, leave the terminal session running and
access the console/desktop of the VM to run a browser. Access YANG Suite via the locally
running nginx web server at http://127.0.0.1.
You are prompted to accept the End User License Agreement and Privacy Policy, as seen in
Figure 12-7.
Figure 12-7 Cisco YANG Suite Web Portal and User Agreement
After you accept the agreement and policy, you are presented with a login screen, as shown
in Figure 12-8.
Upon successful login, you see the YANG Suite portal, as shown in Figure 12-10.
You should fill in the New Device Profile fields as shown in Table 12-6.
Table 12-6 Device Profile Field Settings
Field Value Notes
Profile Name DevNet IOS XE
Description Always On Sandbox Device
Address sandbox-iosxe-latest-1.cisco.com
Username developer
Password C1sco12345
Device supports NETCONF (selected)
Skip SSH key validation for this device (selected)
Figure 12-12 Cisco YANG Suite Device Profile Entry for DevNet Always On Sandbox
Device
Additionally, you can select the Check Connectivity button and get a result, as shown in
Figure 12-13.
The system then retrieves the YANG model information. This process may take several
minutes and show a progress bar along the top, as seen in Figure 12-15.
Figure 12-15 Cisco YANG Suite Progressing Through Device Information Capture
Eventually, when the process is complete, you see a screen with a success pop-up similar to
that shown in Figure 12-16.
set you just created, probably named devnet-ios-xe-default-yangset. In the Select a YANG
Module(s) menu, pick something like the Cisco-IOS-XE-bgp-oper model. Then click the
Load Module(s) button on the right side, as shown in Figure 12-17.
Note that when individual leaves are selected, the Node Properties are populated on the
right-hand side. The Xpath and Prefix properties are very helpful in forming the streaming
telemetry sensor paths for the device or collector configurations.
You now have a solid tool and process for exploring YANG models and developing your
desired sensor paths.
Figure 12-20 Navigating the Statistics Container in the Interfaces YANG Model
To build the sensor path reference, you take the prefix, interface-ios-xe-oper, and the Xpath,
/interfaces/interface/statistics, to result in the following combined XPath filter:
/interface-ios-xe-oper:interfaces/interface/statistics
This path would provide telemetry of all nodes below the statistics container, which includes
the in-octets and out-octets counters you’re interested in.
To build a cadence-based configuration in an IOS XE-based device, such as a Cisco
CSR1000v, you can enter a configuration similar to that shown in Example 12-25.
Example 12-25 Configuration Template for Interface Statistics Telemetry
Example 12-25 uses Google protobufs (GPBs) as key-value pairs, noted as encoding encode-
kvgpb. It has the filter xpath you extracted earlier and uses a yang-push streaming model.
You define the frequency in centiseconds; for this example, 1000 would be every
10 seconds. Finally, the streaming telemetry receiver IP address and protocol are defined
using TCP-based Google RPCs.
If you don’t like the preceding configuration template approach, you can push the following
XML payload shown in Example 12-26 in a NETCONF session, achieving the same effect.
Example 12-26 XML Payload for NETCONF Session Creating Interface Statistics
Telemetry
<mdt-config-data xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-mdt-cfg">
<mdt-subscription>
<subscription-id>100</subscription-id>
<base>
<stream>yang-push</stream>
<encoding>encode-kvgpb</encoding>
<source-address>10.1.1.20</source-address>
<period>1000</period>
<xpath>/interface-ios-xe-oper:interfaces/interface/statistics</xpath>
</base>
<mdt-receivers>
<address>10.1.1.100</address>
<port>57000</port>
<protocol>grpc-tcp</protocol>
</mdt-receivers>
</mdt-subscription>
</mdt-config-data>
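If you prefer to push the same payload from Python instead of a manually driven NETCONF session, a hedged sketch using the ncclient library follows. The device address and credentials are illustrative placeholders; the payload itself is the one from Example 12-26, wrapped in the standard NETCONF <config> element.
from ncclient import manager

# The Example 12-26 payload wrapped in the standard NETCONF <config> element
MDT_SUBSCRIPTION = """
<config xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <mdt-config-data xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-mdt-cfg">
    <mdt-subscription>
      <subscription-id>100</subscription-id>
      <base>
        <stream>yang-push</stream>
        <encoding>encode-kvgpb</encoding>
        <source-address>10.1.1.20</source-address>
        <period>1000</period>
        <xpath>/interface-ios-xe-oper:interfaces/interface/statistics</xpath>
      </base>
      <mdt-receivers>
        <address>10.1.1.100</address>
        <port>57000</port>
        <protocol>grpc-tcp</protocol>
      </mdt-receivers>
    </mdt-subscription>
  </mdt-config-data>
</config>
"""

# Device address and credentials are placeholders; adjust them for your environment
with manager.connect(
    host="10.1.1.20",
    port=830,
    username="admin",
    password="admin",
    hostkey_verify=False,
) as m:
    reply = m.edit_config(target="running", config=MDT_SUBSCRIPTION)
    print(reply)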
You can check the status of the streaming telemetry with the show telemetry ietf subscrip-
tion all detail command, as seen in Example 12-27.
Example 12-27 Validating Streaming Telemetry Subscriptions
Receivers:
Address Port Protocol Protocol Profile
--------------------------------------------------------------------------------
10.1.1.100 57000 grpc-tcp
CSR100V#
Installing InfluxDB
Start by installing InfluxDB. This example uses Ubuntu, specifically Ubuntu Server 20.04.2
LTS, but any Linux distribution should have similar install methods.
First, import the influxdata key to the Ubuntu registry:
You’re specifically looking for the port listeners on ports 8086 and 8088.
Next, create an InfluxDB database and database user. Use the influx command to access the
InfluxDB console:
mdtuser@tig-stack:~$ influx
>
Issue the following commands in the influx shell to create a database and user:
name: databases
name
----
_internal
telegraf
user admin
---- -----
telegrafuser false
>
Use exit to leave the influx shell.
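If you would rather script this step than type it interactively, the InfluxDB 1.x HTTP API accepts the same InfluxQL statements. The following hedged Python sketch posts them to the local InfluxDB instance; the database and user names mirror the output above, and the password is a placeholder.
import requests

# InfluxDB 1.x query endpoint on the local host (port 8086 as verified earlier)
INFLUX_QUERY = "http://localhost:8086/query"

# Same InfluxQL statements you would issue in the influx shell; password is a placeholder
statements = [
    "CREATE DATABASE telegraf",
    "CREATE USER telegrafuser WITH PASSWORD 'MyTelegrafPassword'",
]

for statement in statements:
    response = requests.post(INFLUX_QUERY, params={"q": statement})
    print(statement, "->", response.status_code)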
Installing Telegraf
Telegraf comes from the same organization as InfluxDB. Earlier, you imported the influxdata
key and repository definitions, so you can reuse them now for Telegraf. Install the Telegraf
package:
Jun 20 23:20:42 tig-stack systemd[1]: Started The plugin-driven server agent for
reporting metrics into InfluxDB.
Jun 20 23:20:42 tig-stack telegraf[20504]: time="2021-06-20T23:20:42Z" level=error
msg="failed to create cache directory. />
Jun 20 23:20:42 tig-stack telegraf[20504]: time="2021-06-20T23:20:42Z" level=error
msg="failed to open. Ignored. open /etc/>
Jun 20 23:20:42 tig-stack telegraf[20504]: 2021-06-20T23:20:42Z I! Starting Telegraf
1.19.0
Jun 20 23:20:42 tig-stack telegraf[20504]: 2021-06-20T23:20:42Z I! Loaded inputs:
cpu disk diskio kernel mem processes swap>
Jun 20 23:20:42 tig-stack telegraf[20504]: 2021-06-20T23:20:42Z I! Loaded
aggregators:
Jun 20 23:20:42 tig-stack telegraf[20504]: 2021-06-20T23:20:42Z I! Loaded
processors:
Jun 20 23:20:42 tig-stack telegraf[20504]: 2021-06-20T23:20:42Z I! Loaded outputs:
influxdb
Jun 20 23:20:42 tig-stack telegraf[20504]: 2021-06-20T23:20:42Z I! Tags enabled:
host=tig-stack
Jun 20 23:20:42 tig-stack telegraf[20504]: 2021-06-20T23:20:42Z I! [agent] Config:
Interval:10s, Quiet:false, Hostname:"tig>
mdtuser@tig-stack:~$
Next, you need to configure Telegraf. First, you can configure it for standard functionality;
then you can enhance it to handle streaming telemetry functions:
mdtuser@tig-stack:~$ cd /etc/telegraf/
# Input Plugins
[[inputs.cpu]]
percpu = true
totalcpu = true
collect_cpu_time = false
report_active = false
[[inputs.disk]]
ignore_fs = ["tmpfs", "devtmpfs", "devfs"]
[[inputs.diskio]]
[[inputs.mem]]
[[inputs.net]]
[[inputs.system]]
[[inputs.swap]]
[[inputs.netstat]]
[[inputs.processes]]
[[inputs.kernel]]
Add the following to the telegraf.conf file to enable gRPC Dial-Out Telemetry and enable a
listener port:
[[inputs.cisco_telemetry_mdt]]
transport = "grpc-dialout"
service_address = ":57000"
Save the file and then restart the Telegraf service:
mdtuser@tig-stack:~$ ss -plntu
Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port
Process
udp UNCONN 0 0 127.0.0.53%lo:53 0.0.0.0:*
tcp LISTEN 0 4096 127.0.0.53%lo:53 0.0.0.0:*
tcp LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
tcp LISTEN 0 4096 127.0.0.1:8088 0.0.0.0:*
tcp LISTEN 0 4096 *:57000 *:*
tcp LISTEN 0 4096 *:8086 *:*
tcp LISTEN 0 128 [::]:22 [::]:*
tcp LISTEN 0 4096 *:3000 *:*
The last entry in this list shows the port listener, so you can be confident the app is running.
Access Grafana through the Ubuntu console and a local web browser at http://localhost. It
can also be accessed through another local system with the Ubuntu system’s IP address or
registered DNS hostname.
The Grafana login portal appears in the browser, as shown in Figure 12-21; use admin/admin
for the initial login.
In this figure, you see the oper-status enumeration resolves to several predefined enum
types, such as if-oper-state-invalid, if-oper-state-ready, and if-oper-state-dormant.
Indeed, you can update the CSR1000v configuration with a new subscription that captures
this oper-status node and exports to the Telegraf instance. The following configuration
should be pushed to the device:
encoding encode-kvgpb
source-address 10.1.1.10
stream yang-push
mdtuser@tig-stack:~$ influx
Connected to http://localhost:8086 version 1.8.6
InfluxDB shell version: 1.8.6
> use telegraf
Using database telegraf
> SELECT "name", "oper_status" from "Cisco-IOS-XE-interfaces-oper:interfaces/interface" where "name" = 'GigabitEthernet1'
name: Cisco-IOS-XE-interfaces-oper:interfaces/interface
time name oper_status
---- ---- -----------
1624280416736000000 GigabitEthernet1 if-oper-state-ready
1624280426736000000 GigabitEthernet1 if-oper-state-ready
1624280436736000000 GigabitEthernet1 if-oper-state-ready
1624280446736000000 GigabitEthernet1 if-oper-state-ready
1624280456736000000 GigabitEthernet1 if-oper-state-ready
1624280466736000000 GigabitEthernet1 if-oper-state-ready
1624280476736000000 GigabitEthernet1 if-oper-state-ready
1624280486736000000 GigabitEthernet1 if-oper-state-ready
[ . . . a lot more info trimmed . . . ]
As you can see in the output in this example, the GigabitEthernet1 interface has a status of
if-oper-state-ready, and it repeats over and over. Because you configured the subscription for
a push every 10 seconds, you get a lot of the same information repeating with little gain.
As a matter of fact, you’re being inefficient because of so many repeat values. It makes sense
to preserve disk usage. So, how do you do that?
Event-driven telemetry.
Although cadence-driven telemetry is the de facto usage of model-driven telemetry, you can
pivot to an event-driven method where you receive a push only when the value changes. This
way, you preserve disk space and unnecessary processing overhead.
With the Cisco IOS XE platform, you can display the list of YANG models that support on-
change subscriptions, using the show platform software ndbman switch {switch-number |
active | standby} models command.
With Cisco IOS XR platforms, you can determine the YANG models that support event-
driven telemetry where the model is annotated with xr:event-telemetry. For example, Cisco-
IOS-XR-ip-rib-ipv6-oper has this for IPv6 RIB operational monitoring:
leaf v6-nexthop {
type inet:ipv6-address;
description
"V6 nexthop";
}
To configure event-driven telemetry on a subscription, set the update policy definition to
update-policy on-change. See the full configuration below. You can also replace the previous
cadence-based example with a new sensor path that targets the Cisco-IOS-XE-ios-events-
oper YANG model and specifically the /ios-events-ios-xe-oper:interface-state-change xpath.
For good measure, Figure 12-29 shows the YANG Suite representation of that node.
encoding encode-kvgpb
source-address 10.1.1.20
stream yang-notif-native
update-policy on-change
Now if you return to the Ubuntu VM and the InfluxDB shell, you can see the event change
recorded without the cadence-driven pushes. Example 12-34 shows the process.
Notice the timing isn't spaced 10 seconds (or otherwise) between state transitions. The
times correspond to the instance of enabling or disabling the port.
The next option would be to create a new Grafana dashboard panel that shows the last
value of new_state. You might even alias the interface-notif-state-down and -up values
to more friendly versions.
Table 12-7 Comparing Cadence and Event-Based Policies for Metrics/Sensor Selection
Cadence-Based (Policy Periodic) Event-Based (Policy On-change)
Interface statistics (octets, packets) Interface state (admin/oper)
Routing table size Routing adjacencies/peer count
CPU percentage
Memory percentage
CDP/LLDP neighbor adjacencies
Temperature Fan status (running, failure)
Optics power (dbm) Optics status (on, off)
MAC Address / CAM table size
Last config change date/time
A common question when using the cadence-driven method (policy) is “How much disk
space will I use?" This is not an easy question to answer because several levers are at play:
■ Number of devices
References
https://tools.ietf.org/html/rfc3535
https://pc.nanog.org/static/published/meetings//NANOG73/daily/day_2.html#talk_1677
https://www.cisco.com/c/en/us/products/collateral/cloud-systems-management/crosswork-network-automation/datasheet-c78-743287.html
https://github.com/YangModels/yang/tree/master/vendor/cisco
https://yangcatalog.org/
https://github.com/openconfig/public/tree/master/release/models
https://devnetsandbox.cisco.com/RM/Diagram/Index/7b4d4209-a17c-4bc3-9b38-f15184e53a94?diagramType=Topology
https://github.com/YangModels/yang/blob/master/vendor/cisco/xe/1681/Cisco-IOS-XE-interfaces-oper.yang
Open-Source Solutions
■ Differences Between Agent and Agentless Solutions: This section discusses the dif-
ferences between agent and agentless solutions and includes benefits and resource
requirements.
■ Cisco Solutions Enabled for IaC: Finally, this section provides some closing thoughts
on Cisco’s solutions that are enabled for Infrastructure-as-Code projects.
This chapter maps to the Developing Applications Using Cisco Core Platforms and APIs
v1.0 (350-901) Exam Blueprint Section 5.3, “Construct a workflow to configure network
parameters with: Ansible playbook and Puppet manifest.”
Open-source solutions are a popular option for network programmability projects. Although
commercial network management applications and controllers offer an “all batteries
included” approach that may be specialized to a specific vendor or technology, open-source
solutions often provide broad device-type capabilities and function extensibility through
code modification. Open-source solutions that have extensive community support may also
reduce the risk associated with supportability. This chapter addresses several open-source
concepts and solutions. We encourage you to approach open-source projects with discern-
ment and a critical view. Although the notion of “free” is appealing, there are always costs
to any project, even if the source code is freely available. Ideally, before implementing such
projects, you are fully apprised of all functionality and are comfortable with reviewing code
to ensure “fit for purpose” and security intents are followed. Open source works best when
you contribute back to the community. If you are new to open source, even helping to write
documentation and tests is appreciated.
5. Which statements are true about open-source solutions using agent or agentless models?
(Choose three.)
a. Chef is agent-based.
b. Chef is agentless.
c. Puppet supports both agent and agentless modes.
d. Ansible is agent-based.
e. Ansible is agentless.
6. The Puppet Facter tool outputs data in which default data exchange method?
a. ANSI
b. JSON
c. XML
d. YAML
7. What is the configuration file for a Puppet operation called?
a. Configuration file
b. Data Definition Language (DDL)
c. Manifest
d. Playbook
8. With which data exchange format can an Ansible inventory source file be written?
(Choose two.)
a. INI
b. JSON
c. YAML
d. zsh
9. Which Ansible inventory format allows for inline references of Ansible Vault
definitions?
a. INI
b. JSON
c. YAML
d. XML
10. What sequence of commands is used to deploy a Terraform configuration from start
to finish?
a. terraform add config_file.tf, terraform deploy config_file.tf
b. terraform create, terraform deploy config_file.tf
c. terraform init, terraform plan, terraform apply
d. terraform add ., terraform build ., terraform deploy
Foundation Topics
Infrastructure-as-Code (IaC) Concepts
When networks were small, services few, and users more limited, the notion of manual,
interactive provisioning was not resource demanding. Operational expenses (OpEx) were not
regularly considered. However, organizations have all experienced the immense growth of
an ever-evolving network that provides more and more services that ebb and flow on a
regular basis. Users are no longer constrained to an office or specific geography. Equipment
can exist in any location, even in the most inaccessible places where there are no users.
Most network engineers have IT war stories of supporting a device that required hours of
travel to access. How many have needed specialized training to even access the device?
There have been times at CiscoLive conferences when accessing a wireless access point in a
ceiling space 30 feet above the event floor required specialized lift training and certification.
Figure 13-1 suggests only those without a fear of heights should apply.
scenario of configuration and policy management. Chef also has a higher licensing cost than
Puppet.
To align with the DevNet certification blueprint, we focus on Puppet here. However, if you
find that your organization has more resources and investments in Chef, it may be beneficial
to assess expanding that investment into the networking domain.
Again, considering that Puppet is a server-to-agent based solution, the obvious first depen-
dency check is “Does this equipment have an embedded Puppet agent, or can I install it?”
We’ve talked about Puppet being agent-based. And it was, initially. The great news is that, since
2019, the ciscopuppet module enables many NX-OS devices to be supported in agentless mode
(see Figure 13-3). Table 13-2 reflects the latest (summer 2021) support matrix from Puppetlabs.
If you wish to install the Puppet agent on a guestshell or containerized environment, follow
this git repository: https://github.com/cisco/cisco-network-puppet-module/blob/develop/
docs/README-agent-install.md.
However, going forward, we suggest you use agentless as the preferred mode of interaction.
Let’s start with some high business-impact, low device-impact actions, such as extracting
device information, and then move to an activity that could modify a device configuration.
The Puppet framework requires a server; the legacy name was Puppet Master, but now it is
known as the Puppet Server.
Puppet enables you to define the desired end state of your infrastructure devices; it does
not require you to define the sequence of actions necessary to achieve the end state. This
makes it effectively a declarative solution. Puppet supports a variety of devices, servers, and
network components across many operating systems. Puppet code is used to write the infra-
structure code in Puppet’s domain-specific language (DSL). The Puppet primary server then
automates the provisioning of the devices to obtain your defined state. In a legacy agent-
based mode, the Puppet primary server uses the code to direct the Puppet agent on the
device to translate the directives into native commands, arguments, and syntax. Since 2019
the ciscopuppet module supports an agentless mode that leverages the NX-OS NX-API
feature and does not require a Puppet agent. In this mode, the Puppet primary server talks
directly to the switch’s NX-API service. In either mode, the execution of the Puppet process
is called a Puppet run.
Figure 13-4 shows the simplified server-agent architecture.
Let’s assume you’re starting from scratch. You need a puppetserver component installed.
puppetserver is supported on several Linux distributions. The server install includes a Pup-
pet agent for the local system. It’s good to know that agents can be installed on a variety
of systems; more than 30 operating systems are supported! We do not have you install the
agent on a Nexus switch because you’ll use agentless mode. But first, you can start with a
vanilla Ubuntu Server install. For this exercise, you can install Ubuntu 20.04. 13
NOTE Currently, the grpc module required for agentless NX-API use requires a Ruby ver-
sion between 2.0 and 2.6. The latest Puppet 7 installs Ruby version 2.7. Please follow the
GitHub issue at https://github.com/grpc/grpc/issues/21514 to see when the grpc support
catches up to Puppet 7 and Ruby 2.7. Workarounds exist to allow Puppet 7, but for purposes
of a simpler install in this use case with agentless NX-API, we suggest you install the earlier
Puppet 6 release.
Step 1. Download puppet software using wget or curl (see Example 13-1).
Example 13-1 Using wget to Download Puppet Software
puppet6-release-focal.deb 100%[============================================
====================>] 11.48K --.-KB/s in 0s
puppetadmin@puppetserver:~$
Step 3. Update the package information from all configured repositories (see
Example 13-3).
Example 13-3 Performing Repository Updates
done.
done.
Step 5. Start and enable the Puppet Server service for restart and check its status (see
Example 13-5).
Example 13-5 Starting and Enabling the Puppet Server Service
Installing puppetserver also installs the puppet-agent locally for the server.
However, if you need to install the agent separately (or on another server), the
package manager can perform this task, as shown in Example 13-6.
Example 13-6 Using Package Manager to Install the Puppet Agent
In the Ubuntu environment, the package installer puts the contents into /opt/
puppetlabs. The puppet-agent installer also provides a useful shell script to add
the package binaries to the executable path. Run the following command and
consider adding it to the user’s startup shell script: $HOME/.bashrc.
Step 6. Execute the puppet agent profile script (see Example 13-7).
Example 13-7 Enabling the Puppet Environment Variables
Now you can verify the executable path, as shown in Example 13-8.
Example 13-8 Reviewing the Executable Path
Defaults secure_path="/usr/local/sbin:/usr/
local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin:/opt/
puppetlabs/bin"
Next, you can verify the installation with version checks, as in Example 13-9.
puppetadmin@puppetserver:~$ puppetserver -v
puppetserver version: 6.16.1
puppetadmin@puppetserver:~$ puppet -V
6.24.0
puppetadmin@puppetserver:~$ puppet agent -V
6.24.0
The next activity is to configure the server setting that defines which Puppet
server the Puppet client accesses. It is the only mandatory setting. You can set
this option by editing the /etc/puppetlabs/puppet/puppet.conf file directly or
by using the puppet config set command-line interface, as shown in Example
13-10. Because you are configuring the local agent to the local server, you
define the server as the puppet server’s DNS name.
Step 7. Configure the Puppet server location on the local agent (see Example 13-10).
Example 13-10 Configuring the Puppet Server Setting
Then you can verify the setting by looking at the puppet.conf file, as shown in
Example 13-11.
Example 13-11 Reviewing the puppet.conf Configuration File
Step 8. Start and enable the Puppet agent service (see Example 13-12).
An optional task is to install PuppetDB. For the purposes of this chapter, you
can use the embedded database, but you should consider PuppetDB and PostgreSQL
integration for a large, production implementation.
Because Puppet uses certificates and TLS, check and sign for any outstanding
certificates, as shown in Example 13-13.
Example 13-13 Reviewing Puppet Certificates
Next, test the agent connectivity, as shown in Example 13-14; because you’re
running the server and agent on the same system, this test is trivial.
Example 13-14 Testing Puppet Agent Connectivity
Now that you have the Puppet server and agent on the same system, you can do
a low-risk, medium-impact activity: extract information from the environment.
Because only the local agent is registered, you get only the Puppet server’s
“facts.” You will expand to a network device later.
Puppet has a module called facter that deals with collecting facts about the managed
devices, servers, network elements, and so on. So one of the first IaC actions can be to
extract this information for review and reuse. You can start at the command line for familiar-
ity and branch out from there.
The facter command with the p flag loads Puppet libraries, providing Puppet-specific facts.
These facts may or may not be useful in your implementation, so you can assess and decide.
The j flag is suggested because it produces the output in JSON (see Example 13-15). With-
out the j flag, the output appears in a key-value pair form.
Example 13-15 Getting Puppet Node Inventory Information Via facter
puppetadmin@puppetserver:~$ facter -p -j
{
"aio_agent_version": "7.10.0",
"augeas": {
"version": "1.12.0"
},
"disks": {
"sda": {
"model": "Virtual disk",
"size": "40.00 GiB",
"size_bytes": 42949672960,
"type": "hdd",
"vendor": "VMware"
},
"sr0": {
"model": "VMware SATA CD00",
"size": "1.00 GiB",
"size_bytes": 1073741312,
"type": "hdd",
"vendor": "NECVMWar"
}
},
[. . . TRIMMED OUTPUT . . . ]
},
"system_uptime": {
"days": 0,
"hours": 2,
"seconds": 9854,
"uptime": "2:44 hours"
},
"timezone": "EDT",
"virtual": "vmware"
}
puppetadmin@puppetserver:~$
There’s a lot of output to consider. If you have the Linux jq utility installed, you can use it to
identify the major output categories. If you don’t have the jq utility on your system, follow
Example 13-16 to install it.
Example 13-16 Installing the jq Utility
Continue on with the extraction of major output categories, as keys, by using the jq utility.
You can see the results in Example 13-17.
Example 13-17 Using the jq Utility with facter to Extract Keys
"partitions",
"path",
"processors",
"puppetversion",
"ruby",
"ssh",
"system_uptime",
"timezone",
"virtual"
]
The Puppetlabs documentation site has some good references on the facter core facts; see
https://puppet.com/docs/puppet/7/core_facts.html. See also Figure 13-5 for a portion.
"network6": "fe80::",
"primary": "ens160",
"scope6": "link"
}
}
puppetadmin@puppetserver:~$
Likewise, being comfortable using the jq utility and JSON path queries affords benefits
across many programming activities.
So what other ways might be interesting to use Puppet and read-only access to facts? How
about as an availability monitor backup? For instance, you can show the uptime on a device
to be alerted to a reboot, whether planned or unplanned, as in Example 13-19.
Example 13-19 Using Puppet facter to Extract system_uptime
By extracting that seconds key and its value, you can check for devices that have less than
120 seconds of uptime.
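A minimal Python sketch of that check follows. It shells out to the same facter -p -j command shown earlier and flags any node whose uptime is under the 120-second threshold described above; error handling is omitted for brevity.
#!/usr/bin/env python3
import json
import subprocess

# Run facter with Puppet facts (-p) and JSON output (-j), as shown earlier
facts = json.loads(subprocess.check_output(["facter", "-p", "-j"]))

# system_uptime.seconds is one of the core facts returned by facter
uptime_seconds = facts["system_uptime"]["seconds"]

if uptime_seconds < 120:
    print(f"Possible reboot detected: uptime is only {uptime_seconds} seconds")
else:
    print(f"Uptime is {uptime_seconds} seconds")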
Now let’s approach integration from a Python script perspective. Moving forward, let’s use
Puppet against a Cisco Nexus switch. To continue, first install the ciscopuppet module. You
can do it from within Puppet, as in Example 13-20.
Example 13-20 Installing the ciscopuppet Module
Next, edit the Puppet device.conf file to specify device type and configuration file location,
as in Example 13-21.
Example 13-21 Editing the Puppet device.conf File
[sandbox-nxos-1.cisco.com]
type cisco_nexus
url file:////etc/puppetlabs/puppet/devices/sandbox-nxos-1.cisco.com.conf
The next step is to create a device-specific configuration file in the puppet/devices directory.
It is helpful to maintain the device name for quick identification. Call this one sandbox-
nxos-1.cisco.com.conf to align with the DevNet always-on NX-OS device you’re using. Edit
the file using Example 13-23 as a guide.
Example 13-23 Editing the Puppet Device Configuration File
At this point, edit the file to appear as shown in Example 13-24. These credentials map to
the Sandbox documentation defined at https://devnetsandbox.cisco.com/RM/Diagram/
Index/dae38dd8-e8ee-4d7c-a21c-6036bed7a804?diagramType=Topology.
Example 13-24 Modified Puppet Device Configuration File
host: sandbox-nxos-1.cisco.com
user: admin
password: "Admin_1234!"
port: 443
transport: https
Now trigger Puppet to connect to the device by using the puppet device --target command,
as in Example 13-25.
Example 13-25 Triggering Puppet to Connect to a Device
You might need to install the cisco_node_utils Ruby gem. You can use the Puppet-supplied
gem command to install from public repositories. Example 13-27 shows the process.
Example 13-27 Installing the cisco_node_utils Ruby gem
Now let’s pivot to using Puppet with an NX-OS based device. For this exercise, I used the
Nexus 9000v virtual appliance as an OVA image integrated with my VMware environment.
After booting the image and providing an IP address for the management interface, I enabled
the feature nxapi and saved the configuration. I updated the Puppet device.conf and the
device-specific credentials file, as seen previously, using the device IP and credentials I
configured.
To follow along, tell Puppet to attempt a connection to obtain the certificate; then press
Ctrl+C when it starts looping. Example 13-28 reflects the process.
Now tell the Puppet server to retrieve the certificates list, as in Example 13-29.
Example 13-29 Listing Puppet Certificates
Note the outstanding signing request from the new device, n9kv-nxapi-1.cisco.com.
Now, have the Puppet server use its intermediary certificate authority (CA) to sign them.
Signing the certificate is shown in Example 13-30.
Example 13-30 Signing Certificates with the Puppet Server
Next, repeat the connection to the new device and note that a lot more output is generated
with the successful connection. Example 13-31 should reflect your experience in a similar
fashion.
Now, collect Puppet facts from this new remote Nexus device, as in Example 13-32.
Example 13-32 A More Complete Output of Puppet Facts
"packages": {
}
},
"hardware": {
"type": "cisco Nexus9000 C9500v Chassis",
"cpu": "Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz",
"board": "9F3I6MFDAW2",
"last_reset": "",
"reset_reason": "Unknown",
"memory": {
"total": "10211572K",
"used": "5171128K",
"free": "5040444K"
},
"uptime": "1 days, 4 hours, 37 minutes, 35 seconds"
},
"inventory": {
"chassis": {
"descr": "Nexus9000 C9500v Chassis",
"pid": "N9K-C9500v",
"vid": "",
"sn": "9WYQFON66B3"
},
"Slot 1": {
"descr": "Nexus 9000v 64 port Ethernet Module",
"pid": "N9K-X9564v",
"vid": "",
"sn": "904TSSBDH5Q"
},
"Slot 27": {
"descr": "Supervisor Module",
"pid": "N9K-vSUP",
"vid": "",
"sn": "9F3I6MFDAW2"
}
},
"interface_count": 65,
"interface_threshold": 9,
"virtual_service": {
},
"feature_compatible_module_iflist": {
"fabricpath": [
]
}
},
"hostname": "n9kv-nxapi-1",
"operatingsystemrelease": "10.2(1)",
"clientcert": "n9kv-nxapi-1.cisco.com",
"clientversion": "6.24.0",
"clientnoop": false
},
"timestamp": "2021-08-29T19:40:11.902778717-04:00",
"expiration": "2021-08-29T20:10:11.902898379-04:00"
}
puppetadmin@puppetserver:~$
Using that same process with the jq utility, as described previously, you can extract the
device uptime. Example 13-33 provides guidance.
Example 13-33 Using Puppet and the jq Utility to Extract Device Uptime
Unpacking that, you use the puppet device command with options, redirecting standard err
(STDERR) or file handle #2 to /dev/null, effectively ignoring errors. The output of the com-
mand is fed with a pipe (|) into the Linux sed (stream editor) utility to capture any output
that starts with an opening brace ({) and ends with a closing brace (}). This effectively cap-
tures the JSON output of the puppet command, ignoring other output. That output is fed
into the jq utility, which filters for the hierarchy of values, cisco, hardware, and uptime.
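The same post-processing can be done in Python. The hedged sketch below assumes you have redirected the output of the puppet device command to a file (the filename is illustrative); it extracts the brace-delimited JSON just as the sed expression does and then walks the values, cisco, hardware, and uptime keys like the jq filter.
#!/usr/bin/env python3
import json
from pathlib import Path

# Raw output previously captured from the puppet device command (filename is illustrative)
raw = Path("puppet-device-output.txt").read_text()

# Keep only the JSON portion, from the first opening brace to the last closing brace,
# mirroring what the sed expression does in the shell pipeline
facts = json.loads(raw[raw.index("{"): raw.rindex("}") + 1])

# Walk the same key hierarchy the jq filter uses
print(facts["values"]["cisco"]["hardware"]["uptime"])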
Now that you have an idea about how to programmatically access device information or
facts, you can move on to more sophisticated examples, such as programmatically changing
a device configuration.
To set device parameters or configurations, you still use the puppet device command, but
this time you use the --apply option and specify a Puppet manifest that generally ends with
a .pp extension.
What is a Puppet manifest? Manifests are a collection of resource definitions (configurable
items) and their variables/parameters, collected into a single “Puppet policy” file with the
.pp extension. Typically, these files exist on the Puppet server in the /etc/puppetlabs/code/
environments/production/manifests directory. Manifests support the use of variables, loops,
conditionals, and importing of other files and classes from separate Puppet files. These capa-
bilities allow you to make complex provisioning scenarios, if needed.
In Example 13-34, a manifest completes many actions. It configures the Ethernet1/61 port
as an access port and labels it for server use. It provisions three different VLANs and their
matching SVI interfaces, pulling their parameters in from another file. Then it requires a
specific NTP server to not exist, as if you are unconfiguring a decommissioned server. Then it config-
ures a new one. On the Puppet server, this is the /etc/puppetlabs/code/environments/produc-
tion/manifests/site.pp file.
node 'n9kv-nxapi-1.cisco.com' {
  cisco_interface { 'Ethernet1/61' :
    shutdown => false,
    switchport_mode => access,
    description => 'Puppet managed - server port',
    access_vlan => 100,
  }
  include vlan_data
  $vlan_data::vlans.each |$vlanid,$value|
  {
    $vlanname = "${value[name]}"
    $intfName = "Vlan${vlanid}"
    #Create VLAN
    cisco_vlan {"${vlanid}":
      vlan_name => $vlanname,
      ensure => present
    }
    #Create the matching SVI interface and label it with the VLAN name
    cisco_interface {"${intfName}":
      description => $vlanname,
      shutdown => false,
    }
  }
  ntp_server { '1.2.3.4':
    ensure => 'absent',
  }
  ntp_server { '64.100.58.75':
    ensure => 'present',
    prefer => true,
    vrf => 'management',
    minpoll => 4,
    maxpoll => 10,
  }
}
Example 13-35 contains the associated vlan_data.pp file, which is called with the include
vlan_data directive seen in Example 13-34.
class vlan_data {
$vlans = {
100 => { name => "Production" },
200 => { name => "Site_Backup" },
300 => { name => "Development" },
}
}
Example 13-36 shows how you can trigger Puppet to pull down this policy and reconfigure
the device.
Example 13-36 Triggering Puppet to Connect to a Device and Run a Policy
puppetadmin@puppetserver:/etc/puppetlabs/code/environments/production/manifests$
sudo puppet device --verbose --target n9kv-nxapi-1.cisco.com
[sudo] password for puppetadmin:
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: starting applying configuration to n9kv-nxapi-1.cisco.com at
file:////etc/puppetlabs/puppet/devices/n9kv-nxapi-1.cisco.com.conf
Info: Using configured environment 'production'
Info: Caching catalog for n9kv-nxapi-1.cisco.com
Info: Applying configuration version '1630358256'
Info: Puppet::Type::Cisco_interface::ProviderCisco: [prefetch each interface
independently] (threshold: 9)
Notice: /Stage[main]/Main/Node[n9kv-nxapi-1.cisco.com]/Cisco_
interface[Ethernet1/61]/ensure: created (corrective)
Notice: /Stage[main]/Main/Node[n9kv-nxapi-1.cisco.com]/Cisco_vlan[100]/ensure:
created (corrective)
Notice: /Stage[main]/Main/Node[n9kv-nxapi-1.cisco.com]/Cisco_interface[Vlan100]/
ensure: created (corrective)
Notice: /Stage[main]/Main/Node[n9kv-nxapi-1.cisco.com]/Cisco_vlan[200]/ensure:
created (corrective)
Notice: /Stage[main]/Main/Node[n9kv-nxapi-1.cisco.com]/Cisco_interface[Vlan200]/
ensure: created (corrective)
Notice: /Stage[main]/Main/Node[n9kv-nxapi-1.cisco.com]/Cisco_vlan[300]/ensure:
created (corrective)
Notice: /Stage[main]/Main/Node[n9kv-nxapi-1.cisco.com]/Cisco_interface[Vlan300]/
ensure: created (corrective)
Notice: /Stage[main]/Main/Node[n9kv-nxapi-1.cisco.com]/Ntp_server[64.100.58.75]/
ensure: defined 'ensure' as 'present' (corrective)
Notice: ntp_server[64.100.58.75]: Creating: Setting '64.100.58.75' with
{:name=>"64.100.58.75", :ensure=>"present", :maxpoll=>10, :minpoll=>4,
:prefer=>true, :vrf=>"management"}
Notice: ntp_server[64.100.58.75]: Creating: Finished in 0.146842 seconds
Info: Node[n9kv-nxapi-1.cisco.com]: Unscheduling all events on Node[n9kv-nxapi-1.
cisco.com]
Notice: Applied catalog in 18.20 seconds
puppetadmin@puppetserver:/etc/puppetlabs/code/environments/production/manifests$
Now you can look at the configuration changes of the n9kv-nxapi-1.cisco.com node using a
method similar to Example 13-37.
Example 13-37 Puppet-Influenced Configuration Change Results
n9kv-nxapi-1# sh running-config
feature nxapi
feature interface-vlan
no password strength-check
username admin password 5 $5$***w2 role network-admin
ip domain-lookup
copp profile strict
snmp-server user admin network-admin auth md5 207E4F*** priv aes-128 205E0B***
localizedV2key
ntp server 64.100.58.75 prefer use-vrf management maxpoll 10
system default switchport
vlan 1,100,200,300
vlan 100
name Production
vlan 200
name Site_Backup
vlan 300
name Development
interface Vlan1
interface Vlan100
description Production
no shutdown
interface Vlan200
description Site_Backup
no shutdown
interface Vlan300
description Development
no shutdown
interface Ethernet1/1
interface Ethernet1/2
interface Ethernet1/60
interface Ethernet1/61
description Puppet managed - server port
switchport access vlan 100
interface Ethernet1/62
interface Ethernet1/63
interface Ethernet1/64
interface mgmt0
vrf member management
ip address 172.31.0.170/20
icam monitor scale
line console
line vty
no system default switchport shutdown
n9kv-nxapi-1#
How cool is that? By programmatically creating the site.pp and device.conf files, you can
perform highly automated provisioning. This is Infrastructure as Code, or IaC.
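Because manifests and device files are plain text, generating them is straightforward. The following hedged Python sketch renders a minimal site.pp fragment from a VLAN dictionary similar to the vlan_data class shown earlier; the node name, output path, and VLAN data are illustrative.
#!/usr/bin/env python3
from pathlib import Path

# Illustrative data; in practice this might come from an IPAM, a CMDB, or a YAML file
NODE = "n9kv-nxapi-1.cisco.com"
VLANS = {100: "Production", 200: "Site_Backup", 300: "Development"}

resources = []
for vlan_id, vlan_name in VLANS.items():
    resources.append(
        f"  cisco_vlan {{'{vlan_id}':\n"
        f"    vlan_name => '{vlan_name}',\n"
        f"    ensure    => present,\n"
        f"  }}\n"
    )

manifest = f"node '{NODE}' {{\n" + "".join(resources) + "}\n"

# Write the rendered manifest to the current directory (adjust the path for your server)
Path("site.pp").write_text(manifest)
print(manifest)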
NOTE Feel free to use a desktop operating system like macOS or Windows if you have the
Linux subsystem enabled.
You can choose a simple package manager install or a more involved installation using a
Python virtual environment and the Python package installer (pip). Pick your preference
from the next two sections. Note that the package manager install provides a version that
lags the latest. The virtualized Python pip environment installs a more recent version
without the complexity of doing a source-code-based install.
NOTE As of 2021, the Ansible project is in a transitionary period with version numbering.
With Ansible 3.0 and later, the project is separated into two packages:
■ ansible-core (previously known as ansible-base in ansible 3.0), the runtime, and a built-in
Collection (select core modules, plug-ins, and so on)
■ ansible, community-curated Collections
The ansible-core package maintains the existing versioning scheme (like the Linux Kernel),
whereas the ansible package is adopting semantic versioning.
The current Ansible 4.0 comprises two packages: ansible-core 2.11 and ansible 4.0.
That installation is painless. You can check the version installed for your information.
Observe the operation in Example 13-39.
As of summer 2021, the Ansible 2.9.6 version provided by the package manager is getting a
bit dated, so if you want a newer version, you need to forgo the package manager version and
install it in a different manner. Follow the steps in the next section for the most recent version.
You can check which specific version of Python you have installed, as in Example 13-41.
Example 13-41 Checking Python Version
ansibleadmin@ansibleserver:~$ python3.9 -V
Python 3.9.5
ansibleadmin@ansibleserver:~$
Now prepare a project directory for Ansible under /opt, where optional software is usually
installed. This setup can be found in Example 13-42.
Example 13-42 Setting Up a Project Directory
ansibleadmin@ansibleserver:/opt/ansible$ cd /opt/ansible
ansibleadmin@ansibleserver:/opt/ansible$ python3.9 -m venv env
Note that the virtual environment is activated by the inclusion of env at the command-line
prompt.
You can also validate where the Python executable is referenced and which version the envi-
ronment is running by using the steps in Example 13-45.
Example 13-45 Validating Python Path and Version
It is a good idea to ensure that the latest version of pip is available. Example 13-46 iden-
tifies the actions to validate and update.
Example 13-46 Validating and Updating the Python pip Utility
Now you can install the latest Ansible using Python pip. This process is shown in
Example 13-47.
Looking through the output, you can see that at the time of this installation Ansible 4.5.0
with ansible-core package version 2.11.4 was installed.
Another quick confirmation at the command-line interface is shown in Example 13-48.
Example 13-48 Reviewing Ansible Version
A quick verification on the Ansible release summary website confirms that this release is the
latest.
One last thing to do is to install the paramiko module, which is useful for SSH-based proj-
ects. Example 13-49 depicts the pip install process.
Example 13-49 Python pip Installing the paramiko Module
Now that the foundational software is laid, you can start your Ansible IaC journey by creat-
ing an inventory file to define devices and their credentials.
Next, assume you have a device password that you want to encrypt, so the encrypted string
can be included in a version-controlled Ansible playbook, thus reducing your security
exposure. You can choose your own variable name; the standard ansible_password variable
is used in Example 13-51 to represent the password for infrastructure devices in a DMZ
environment.
Example 13-51 Encrypting Strings for Device Passwords in Ansible
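The command would resemble the following sketch; the plaintext password shown is a placeholder, the dmz vault ID and my_vault password file match the options used later, and the emitted !vault block is what appears as ansible_password in Example 13-52:
(env) ansibleadmin@ansibleserver:/opt/ansible$ ansible-vault encrypt_string --vault-id dmz@my_vault 'Sup3rS3cretPassw0rd' --name 'ansible_password'
ansible_password: !vault |
          $ANSIBLE_VAULT;1.2;AES256;dmz
          ...
Encryption successful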
Now you can take that ansible_password variable with the trailing vault data to represent the password in the Ansible playbook without exposing the device password(s).
Let's progress further by creating a basic project-level inventory file and validating the connectivity. If you need a refresher on YAML syntax, these references are helpful:
https://www.w3schools.io/file/yaml-introduction/
https://docs.ansible.com/ansible/latest/reference_appendices/YAMLSyntax.html
Now you can create a file called inventory by using YAML syntax, as shown in Example 13-52.
Example 13-52 Ansible Inventory File in YAML Syntax
---
all:
  vars:
    ansible_connection: ansible.netcommon.network_cli
infra:
  hosts:
    n9kv-nxapi-1:
      ansible_host: 172.31.0.170
    n9kv-nxapi-2:
      ansible_host: 172.31.0.171
  vars:
    ansible_become: yes
    ansible_become_method: enable
    ansible_network_os: cisco.ios.ios
    ansible_user: admin
    ansible_password: !vault |
      $ANSIBLE_VAULT;1.2;AES256;dmz
      36343665373737326533646435363164363336353732393135336636386366663434323636316130
      3535336335646136643338333631353136326661323164630a396330613639346264636531363931
      33373332653064666134653034343433383636393161353839336566326634653730346464623234
      6238343235376265660a396266663339626238346664323562363761666136393337626433616331
      6364
Now that you’ve defined an inventory file, you can validate connectivity from the Ansible
server to the managed devices by using the Ansible ping module. Example 13-53 depicts the
process and results.
Example 13-53 Validating Ansible Connectivity with a Vault-Enabled Device
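A sketch of the command and a successful result, using the device names from the inventory file:
(env) ansibleadmin@ansibleserver:/opt/ansible$ ansible all -i inventory -m ping --vault-id dmz@my_vault
n9kv-nxapi-1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
n9kv-nxapi-2 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}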
What happens here is that you tell Ansible to run the ping module (-m ping) against all hosts
in the inventory file (-i inventory) and to use the DMZ vault ID with the my_vault vault
password file (--vault-id dmz@my_vault). This effectively decrypts the ansible_password
inline vault reference to obtain the correct device password for connectivity. The command
results show each device is sent a ping, and the response of the device is pong, or reachable.
Let’s move on to a nondestructive, read-only type use case and then later do configuration
changes.
---
- name: Create output directory
  hosts: localhost
  tasks:
    - name: Create directory
      file:
        path: ./job-output
        state: directory
To create a directory, you use the file module and specify the path and state. Ansible has many modules; you can find documentation at https://docs.ansible.com/ansible/latest/collections/index_module.html.
The file module is specifically defined as a presupplied built-in at https://docs.ansible.com/ansible/latest/collections/ansible/builtin/file_module.html#ansible-collections-ansible-builtin-file-module.
The next play needs to capture some command output and store it in files. You can capture
the output of show version, show ip route, and show mac address-table. You also can get
the running-config stored in JSON format so that you can use it programmatically later if
you wish.
Continuing to append the earlier playbook, you can add additional content, as described in
Example 13-55.
Example 13-55 Additional Ansible Playbook Content
- name: Capture show output
  hosts: infra
  tasks:
    - name: Execute show commands
      cisco.nxos.nxos_command:
        commands:
          - show version
          - show ip route
          - show mac address-table
      register: showcmds
The name of the play is now Capture show output. The hosts for this play are the infra
device group. If you recall the inventory file, you created an infra group of the two Nexus
9K switches. Next, you can use the cisco.nxos.nxos_command module of Ansible to
execute a list of commands. The nxos_command module is documented at https://docs.ansible.com/ansible/latest/collections/cisco/nxos/nxos_command_module.html.
NOTE Ansible v2.9 is still commonly supplied in many Linux distributions. You may see unqualified module names, such as nxos_command:, in older releases. Ensure you understand the version you are running and reference the documentation for the proper module names.
The module requires you to specify the commands, and you do so as a list. In YAML syntax,
a list is prepended with a dash (-). Finally, you log the output using the register argument,
calling the output variable showcmds, which is referenced later.
The next task is named Save show commands output to local directory and uses the built-in copy module. This module is documented at https://docs.ansible.com/ansible/latest/collections/ansible/builtin/copy_module.html#ansible-collections-ansible-builtin-copy-module.
The content and dest (destination) parameters define the content of the file and where it is
saved. This task is a little more tricky. It uses the standard output (stdout) of the showcmds
variable you previously defined. It passes that output via a pipe (|) to the replace function.
The replace function takes all '\\n' strings and replaces them with '\n', effectively giving you
real newline breaks instead of one long string output. Likewise, the dest parameter defines
the file as going into the job-output directory with the filename of the currently processed
inventory hostname appended with .out.
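Pieced together from that description, the task would look roughly like this sketch, continuing the tasks list from Example 13-55:
    - name: Save show commands output to local directory
      copy:
        content: "{{ showcmds.stdout | replace('\\n', '\n') }}"
        dest: "./job-output/{{ inventory_hostname }}.out"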
The next two tasks are pretty similar—capturing some output and storing in a file—as seen
in Example 13-56.
Example 13-56 Using Ansible to Obtain JSON-Formatted Device Configuration
One notable callout is the show run | json-pretty command. The Nexus 9Ks in the environment support the NX-OS feature of redirecting command output to JSON in a “pretty” format.
This is convenient for programmatic use. The output is stored in a device-hostnamed file
with -config.json appended.
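A sketch of what those two tasks would look like, following the same pattern; the registered variable name jsonconfig is illustrative:
    - name: Get JSON formatted configuration
      cisco.nxos.nxos_command:
        commands:
          - show run | json-pretty
      register: jsonconfig

    - name: Save JSON configuration to local directory
      copy:
        content: "{{ jsonconfig.stdout | replace('\\n', '\n') }}"
        dest: "./job-output/{{ inventory_hostname }}-config.json"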
Now you can execute the playbook, with Example 13-57 as your guide.
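The invocation follows the same pattern as the earlier connectivity test; the playbook filename shown here (capture-show.yaml) is illustrative:
(env) ansibleadmin@ansibleserver:/opt/ansible$ ansible-playbook -i inventory --vault-id dmz@my_vault capture-show.yaml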
You can see the various plays and tasks executing and the recap showing if there were any
device configurations changed or if there were reachability issues and so on.
If you navigate on the Ansible server’s CLI to the project directory and the job-output sub-
directory, you see the files created, as depicted in Example 13-58.
Example 13-58 Reviewing Ansible Output
In a simple activity like the ones shown in Examples 13-55 and 13-56, this syntax is easily understood and used. However, parsing the output often requires more complex logic—potentially loops and conditionals. In these instances, filters using Ansible built-in functionality or Jinja2 are preferred. Red Hat suggests these references for Jinja2:
https://docs.ansible.com/ansible/latest/user_guide/playbooks_templating.html
https://jinja.palletsprojects.com/en/3.0.x/templates/#builtin-filters
Let's try an advanced method that uses the built-in Ansible ansible.netcommon.parse_xml filter, which is documented at https://docs.ansible.com/ansible/latest/user_guide/playbooks_filters.html#network-xml-filters.
This filter requires a path to a YAML-formatted spec file that defines how to parse the
XML output. Example 13-60 shows the syntax.
Example 13-60 ansible.netcommon.parse_xml Syntax
{{ output | ansible.netcommon.parse_xml('path/to/spec') }}
After looking at the show vlan | xml output from an NX-OS based Nexus 9000 series
switch, you can define this nxos-vlan.yml file as shown in Example 13-61.
Example 13-61 Sample nxos-vlan.yml File
---
keys:
  vlans:
    value: "{{ vlanid }}"
    top: nf:rpc-reply/nf:data/show/vlan/__XML__OPT_Cmd_show_vlan___readonly__/__readonly__/TABLE_vlanbrief/ROW_vlanbrief/
    items:
      vlanid: vlanid
      name: name
      state: state
      adminstate: adminstate
      intlist: intlist
vars:
  vlan:
    key: "{{ item.vlanshowbr-vlanid }}"
    values:
      vlanid: "{{ item.vlanshowbr-vlanid }}"
      name: "{{ item.vlanshowbr-vlanname }}"
      state: "{{ item.vlanshowbr-vlanstate }}"
      adminstate: "{{ item.vlanshowbr-shutstate }}"
      intlist: "{{ item.vlanshowplist-ifidx }}"
Now an Ansible playbook with a task that registers show vlan | xml to an output variable, plus another action such as debug or set_fact with {{ output | ansible.netcommon.parse_xml('/path/to/nxos-vlan.yml') }}, renders back structured JSON that you can use in other ways.
To wrap a bow on Ansible, let’s create a new playbook that changes device configurations.
Let’s add some VLANs to these two switches.
Ansible modules are developed by both the core project team and community members. Many modules are built-ins, or part of the core project—over 700! Even more modules are available from the community repository, called Galaxy, at https://galaxy.ansible.com/.
To create a playbook that creates VLANs, you need to find a Cisco IOS or NX-OS module that interacts with the VLAN feature. At the most simplistic level, you can use a module that pushes commands you provide, but that approach might not provide the level of abstraction you're looking for. Many modules handle the translation of abstract functions into the device-native syntax. These kinds of modules are optimal for programmatic use and in diverse device-type environments. You can research them further in the Ansible collections library at https://docs.ansible.com/ansible/latest/collections/ansible/index.html. Searching for cisco nx-os vlan reveals that the cisco.nxos.nxos_vlans module is the optimal fit. It is documented at https://docs.ansible.com/ansible/latest/collections/cisco/nxos/nxos_vlans_module.html.
Let’s create three new VLANs as done for the previous Puppet example, providing the appro-
priate names. To start, create a new Ansible playbook called create-vlans.yaml:
---
- name: Create VLANs
  hosts: infra
  gather_facts: no
  vars:
    ansible_connection: network_cli
    ansible_network_os: nxos
    ansible_become: no
  tasks:
    - name: Merge provided configuration with device configuration.
      cisco.nxos.nxos_vlans:
        config:
          - vlan_id: 100
            name: Production
          - vlan_id: 200
            name: Site_Backup
          - vlan_id: 300
            name: Development
        state: merged
...
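You can run it with the same inventory and vault options used earlier (a sketch; the play output is omitted here):
(env) ansibleadmin@ansibleserver:/opt/ansible$ ansible-playbook -i inventory --vault-id dmz@my_vault create-vlans.yaml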
(env) ansibleadmin@ansibleserver:/opt/ansible$
Inspecting one of the device configurations, you can see the results shown in
Example 13-64.
Example 13-64 Reviewing Ansible Playbook Results
n9kv-nxapi-1# sh vlan
Now let's make a more complicated example that includes conditional logic and loops. In this case, create a playbook that creates two new VLANs if VLAN 200 does not exist; deletes NTP server 10.1.1.100 if it exists; and creates a new message-of-the-day (MOTD) login banner. Call this playbook prep-dev-env.yaml.
You can look up the Ansible documentation for the Cisco NX-OS NTP module from the Ansible collections doc URL provided previously, at https://docs.ansible.com/ansible/latest/collections/cisco/nxos/nxos_ntp_module.html.
Likewise, you can look for the banner/MOTD functionality and find the cisco.nxos.nxos_banner module, documented at https://docs.ansible.com/ansible/latest/collections/cisco/nxos/nxos_banner_module.html.
Start by creating the Ansible playbook:
---
- name: Prep Development Environment
  hosts: infra
  gather_facts: no
  vars:
    ansible_connection: network_cli
    ansible_network_os: nxos
    ansible_become: no
    dev_vlans:
      - { vlan_id: 1000, name: Dev1 }
      - { vlan_id: 1001, name: Dev2 }
  tasks:
    - name: Get Nexus facts
      nxos_facts:
        gather_subset: legacy
      register: data
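The remaining tasks would resemble the following sketch. The conditional relies on the VLAN list returned by nxos_facts; the exact fact key (ansible_net_vlan_list), the string form of the VLAN IDs, and the banner text are assumptions here:
    - name: Add development VLANs when VLAN 200 is absent
      cisco.nxos.nxos_vlans:
        config: "{{ dev_vlans }}"
        state: merged
      when: "'200' not in data.ansible_facts.ansible_net_vlan_list"

    - name: Remove decommissioned NTP server
      cisco.nxos.nxos_ntp:
        server: 10.1.1.100
        state: absent

    - name: Create MOTD login banner
      cisco.nxos.nxos_banner:
        banner: motd
        text: |
          Development environment - authorized use only
        state: present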
Interpreting the output, you can see that n9kv-nxapi-1 does have VLAN 200, so it skips adding the new VLANs. However, n9kv-nxapi-2 does not have VLAN 200, so it adds the new VLANs 1000 and 1001 (the “changed” results). Neither device has the decommissioned NTP server 10.1.1.100—the ok result, not changed. The MOTD banner is created on both devices—the changed results. You can see the summary results of one change for the first device and two changes for the second device.
Now that you’ve experienced some Ansible-style IaC, let’s get into Terraform, another popu-
lar IaC solution.
Terraform Overview
Terraform is another open-source IaC solution that is generally associated with cloud
orchestration. It is a relative newcomer to the IaC fold, initially released mid-2014 by
HashiCorp (see Figure 13-6). As noted previously, it achieved version 1.0.0 status in the
summer of 2021.
Installing Terraform
To install Terraform, you can use the same model you used for Puppet and Ansible. In this
case, start with a virgin Ubuntu 21.04 server and build up a Terraform server from there.
NOTE Terraform supports Windows (using Chocolatey) and OS X (using Homebrew) also.
In this chapter, we show a common Linux example for broader application.
Terraform has a simple client-only architecture that is easily installed. You start by adding
HashiCorp's GNU Privacy Guard (gpg) key by downloading it with the curl utility and passing it to the apt-key utility, adding it to your list of trusted keys. This approach is shown in
Example 13-67.
Example 13-67 Obtaining Hashicorp GPG Keys for Package Management
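The commands resemble HashiCorp's documented install steps of that era (a sketch):
terraformadmin@terraformserver:~$ curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
OK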
Next, you add the software repository to your local list of authorized sources, as in
Example 13-68.
Example 13-68 Adding the Hashicorp Software Repository to the Package Manager
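Adding the repository is a single command sketch; lsb_release fills in the Ubuntu release name:
terraformadmin@terraformserver:~$ sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"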
For good measure, Example 13-69 shows how to update the package information from all
configured sources.
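That refresh is the usual apt update (sketched here):
terraformadmin@terraformserver:~$ sudo apt update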
Then you direct the package installer to install Terraform, as shown in Example 13-70.
Example 13-70 Using the Package Manager to Install Terraform
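The install itself is one command (a sketch):
terraformadmin@terraformserver:~$ sudo apt install terraform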
For good measure, Example 13-71 shows how to verify the version of Terraform installed.
Example 13-71 Checking the Installed Version of Terraform
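A sketch of the check; the version shown is illustrative and depends on when you install:
terraformadmin@terraformserver:~$ terraform version
Terraform v1.0.7
on linux_amd64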
Using Terraform
Let’s use Terraform to do an IaC provisioning use case that deploys new tenants in an ACI
environment. Terraform can download and install any provider defined in a configuration file
as it initializes the project’s working directory.
If you have your own ACI DevTest environment, feel free to use that. If not, the DevNet ACI Always On Sandbox Lab is free to use as a shared, public resource at https://devnetsandbox.cisco.com/RM/Diagram/Index/18a514e8-21d4-4c29-96b2-e3c16b1ee62e?diagramType=Topology.
The first thing to do is create a project directory to maintain configuration files and Terraform state information:
terraformadmin@terraformserver:~$ cd aci-create-tenant/
Next, create the Terraform configuration file; ideally, this would be programmatically created
and maintained in a git repo. Terraform files are named with a .tf extension by convention.
terraformadmin@terraformserver:~/aci-create-tenant$ vi deploy-test-tenant.tf
Ensure the configuration file contents appear as shown in Example 13-72. Replace the ACI
APIC URL and credentials to suit your environment or use the DevNet Always On environ-
ment, as supplied.
Example 13-72 A Terraform File to Create an ACI Tenant and App
terraform {
required_providers {
aci = {
source = "ciscodevnet/aci"
}
}
}
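The remainder of the file declares the provider connection and the three tenant resources. A sketch follows; the APIC URL, username, and password are placeholders that you must replace with your own environment's values (or those of the DevNet Always On sandbox):
provider "aci" {
  username = "admin"
  password = "replace-with-your-password"
  url      = "https://your-apic-or-sandbox-url"
  insecure = true
}

resource "aci_tenant" "devcor-test-tenant" {
  name        = "devcor-test-tenant"
  description = "DEVCOR test tenant"
}

resource "aci_tenant" "devcor-test-tenant2" {
  name        = "devcor-test-tenant2"
  description = "DEVCOR test tenant"
}

resource "aci_tenant" "devcor-test-tenant3" {
  name        = "devcor-test-tenant3"
  description = "DEVCOR test tenant"
}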
To make use of Terraform, you must initialize your current working directory. Initializing the
directory creates a .terraform subdirectory containing provider information. You may also
see a lock file with a .lock.hcl extension. Example 13-73 shows the execution.
Example 13-73 Initializing a Terraform Project
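The initialization is run from the project directory (a sketch; the ciscodevnet/aci provider download happens during this step, and most of the output is omitted here):
terraformadmin@terraformserver:~/aci-create-tenant$ terraform init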
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
Note that the terraform init command also downloads the ciscodevnet/aci provider from the
Terraform registry.
The next activity is to instruct Terraform to assess the IaC configuration you provided. You
use the terraform plan command to show the changes necessary to bring the environment
in line with the configuration you declared. Terraform connects to the endpoints defined in
the configuration file to assess the changes necessary. Example 13-74 shows the execution
of terraform plan.
Example 13-74 Executing terraform plan
─────────────────────────────────────────────────────────────────────────────────────
Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these actions if you run "terraform apply" now.
If you agree with the assessment of creating three new tenants, you can move to the provi-
sioning mode with terraform apply, depicted in Example 13-75.
Example 13-75 Applying the Terraform Plan
Terraform used the selected providers to generate the following execution plan.
Resource actions are indicated with the following symbols:
+ create
aci_tenant.devcor-test-tenant2: Creating...
aci_tenant.devcor-test-tenant3: Creating...
aci_tenant.devcor-test-tenant: Creating...
aci_tenant.devcor-test-tenant: Creation complete after 1s [id=uni/tn-devcor-test-tenant]
aci_tenant.devcor-test-tenant2: Creation complete after 1s [id=uni/tn-devcor-test-tenant2]
aci_tenant.devcor-test-tenant3: Creation complete after 1s [id=uni/tn-devcor-test-tenant3]
To verify the creation, you can go to the ACI APIC controller’s web UI, as shown in Figure 13-7.
You have the flexibility to align your provisioning and orchestration with the same solutions
that are popular in the server/compute domain. Cisco is an active developer in all these com-
munities and has a dedicated developer advocacy team in its DevNet organization. You can
engage with them through the following links:
Twitter: @CiscoDevNet
Facebook: https://www.facebook.com/ciscodevnet/
GitHub: https://github.com/CiscoDevNet
Cisco developer forums: https://community.cisco.com/t5/for-developers/ct-p/4409j-developer-home
Key terms in this chapter include agentless, agent-based, declarative model, imperative model, Chef, Puppet, Ansible, Terraform, manifest, playbook, spec file, and plan.
References
https://github.com/cisco/cisco-network-puppet-module/blob/develop/docs/README-agent-install.md
https://github.com/grpc/grpc/issues/21514
https://puppet.com/docs/puppet/7/core_facts.html
https://devnetsandbox.cisco.com/RM/Diagram/Index/dae38dd8-e8ee-4d7c-a21c-6036bed7a804?diagramType=Topology
https://www.w3schools.io/file/yaml-introduction/
https://docs.ansible.com/ansible/latest/reference_appendices/YAMLSyntax.html
https://docs.ansible.com/ansible/latest/collections/index_module.html
https://docs.ansible.com/ansible/latest/collections/ansible/builtin/file_module.html#ansible-collections-ansible-builtin-file-module
https://docs.ansible.com/ansible/latest/collections/cisco/nxos/nxos_command_module.html
https://docs.ansible.com/ansible/latest/collections/ansible/builtin/copy_module.html#ansible-collections-ansible-builtin-copy-module
https://docs.ansible.com/ansible/latest/user_guide/playbooks_templating.html
https://jinja.palletsprojects.com/en/3.0.x/templates/#builtin-filters
https://docs.ansible.com/ansible/latest/user_guide/playbooks_filters.html#network-xml-filters
https://galaxy.ansible.com/
https://docs.ansible.com/ansible/latest/collections/ansible/index.html
https://docs.ansible.com/ansible/latest/collections/cisco/nxos/nxos_vlans_module.html
https://docs.ansible.com/ansible/latest/collections/cisco/nxos/nxos_ntp_module.html
https://docs.ansible.com/ansible/latest/collections/cisco/nxos/nxos_banner_module.html
https://registry.terraform.io/browse/providers
https://devnetsandbox.cisco.com/RM/Diagram/Index/18a514e8-21d4-4c29-96b2-e3c16b1ee62e?diagramType=Topology
https://github.com/CiscoDevNet/terraform-provider-aci/tree/master/examples
https://www.facebook.com/ciscodevnet/
https://github.com/CiscoDevNet
https://community.cisco.com/t5/for-developers/ct-p/4409j-developer-home
Software Configuration
Management
This chapter maps to the first part of the Developing Applications Using Cisco Core
Platforms and APIs v1.0 (350-901) Exam Blueprint Section 5.0, “Infrastructure and Automa-
tion,” with specific connection to subsections 5.3 and 5.4.
Gathering requirements, building a development plan, and then building an implementation
plan, a test plan, and a support plan are a few small tasks of a huge software development
project. Diligence, speed, and the ability to correct course are also the basis for achieving
success. In this chapter we introduce you to a few important concepts for participating in or leading major software development projects and explain why adequate communication, documentation, and interaction with all parties are essential for flawless execution.
At the beginning of this book we focused on helping you understand the functional and
nonfunctional requirements. You learned what they are, how they influence the architecture,
and what trade-offs you must make to build the best product. But you also learned that
architectures could evolve to meet new requirements, or they may also be changed to speed
up the go-to-market cycle. Time to market is important to business and marketing stakehold-
ers; however, architecture adherence and diligence are important to engineering teams striv-
ing to build high-quality products. Somewhere in between the two lies the perfect world (or
the perfect product, if such a thing even exists).
7. Terraform uses the declarative approach and has the following simple lifecycle:
a. Initialize, execute, reset, repeat
b. Init, plan, apply, destroy
c. Architect, document, init, apply, delete
d. Open, execute, reset, close
8. What is technical debt?
a. A database decision that does not support all quality attributes
b. Short-term decisions that have long-term consequences
c. A test environment that facilitates easier testing but simulates a production
environment
d. All of these answers are correct.
Foundation Topics
Software Configuration Management (SCM)
Chapter 1, “Software Development Essentials,” defined software architecture to be a set
of structures needed to reason about the system. These structures comprise software ele-
ments, relations among them, and properties of both. When you inspected several software
architecture documents, you likely found very little text explaining rationale and a lot more
diagrams with colors and a lot of arrows among the colorful components. You know that
you arrived at that architecture as a result of meetings, questionnaires, and interviews with
business and technical stakeholders. You know that the rationale exists for all these colors
and arrows (on an architectural diagram), but how is it continuously being referenced as
you move toward the development cycle? You know for a fact that many technical deci-
sions are made during the development cycle that are not fully captured in the original documentation or specifications. That chapter also discussed how to solve some of these
issues through continuous collaboration and frequent communication with Agile processes.
However, are you making sure that the proper “feedback loops” exist? Are you updating all aspects or layers of the project?
■ Support systems
■ Source code
■ Libraries
■ Data dictionaries
■ Maintenance documentation
You can add other documentation or items to this list, but what you use to represent the architecture's lifecycle may be project- or organization-dependent.
ANSI and IEEE attempted to put on paper what configuration management is, and there are a few documents that can help you. The simplest and most recent is “The IEEE Standard for Configuration Management in Systems and Software Engineering” (IEEE Std 828). Figure 14-1 sums it up.
[Figure 14-1 depicts configuration management spanning requirements management, project design management, release/transition, integration, and verification.]
The general answer to “Why do you need SCM?” can be summarized as follows:
■ Chef: An excellent SCM and IaC tool with few restrictions. The fact that it uses Ruby
is great if you’re a fan. But you have no choice here. Chef has good interoperability
and integration with major operating systems (Windows included). It’s open source but
is also available as an enterprise offering. Unlike Puppet and Ansible, Chef does not
support Push functionality.
■ Puppet: An open-source SCM tool. Puppet is mostly used for the server side of the
operation; however, it has wider infrastructure capabilities. Puppet is also agent-based
and therefore not the preferred choice as compared to the agentless tools. Puppet is
known for having excellent reporting and compliance tools, alongside its easy-to-use
user interface. You can find more information at https://puppet.com/.
Ansible gets the most attention in this chapter because we have found it to be the most
popular among the development communities we interact with.
Ansible
Ansible is open source, free, and easy to use. What else do you need? It also integrates
with almost all the major infrastructure providers out there. Alongside all the automation
and management capabilities, it also provides for an adequate collaboration platform among
developers, communities, or departments.
The main reasons that Ansible is a favorite include
■ Ease of use (You can get up to speed and get on with your automation journey in a
very short time.)
■ Secure communication between the nodes and the servers (It uses SSH.)
■ Agentless execution
■ Interoperability
■ Control node: Any machine with Ansible and Python installed can be a control node.
The control node is like the engine that runs the automation flow. There can be one or
multiple control nodes.
■ Show run
---
- hosts: cisco-ios
  gather_facts: true
  connection: local
  tasks:
    - name: show run
      ios_command:
        commands: show run
        host: "{{ ansible_host }}"
        username: cisco-ios
        password: cisco-ios
      register: config
Figure 14-3 shows the flow and interaction among all the components of Ansible.
[Figure 14-3 depicts developers building playbooks in YAML; the Ansible control node combines playbooks, modules, inventory, and APIs, and pushes modules out to the managed nodes.]
Example 14-1 shows ansible --version returning the Ansible version as well as the Python version (and the path to all relevant directories).
Example 14-1 An Ansible CLI Command to Display the Version
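Representative (abbreviated) output looks like this sketch; the versions and paths depend on your installation:
$ ansible --version
ansible [core 2.11.4]
  ...
  python version = 3.9.5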
NOTE A great deal of information about Ansible is available online, and it is rapidly growing. We recommend you take a look at the open-source data at https://docs.ansible.com/ansible/latest/index.html as well as the Red Hat sponsored sites at https://www.ansible.com/.
Terraform
Terraform is not an SCM. This is why it is explained in its own section. It’s not unusual for
engineers to speak about Terraform and Ansible in the same context and in direct compari-
son. There is some truth to that, but there are differences in how each tool performs its tasks.
There is no doubt Terraform is the most-used Infrastructure as Code (IaC) tool for managing
data centers, software-defined networks (SDNs), and cloud assets.
Terraform is a secure, efficient, easy-to-use open-source system for building, configuring,
and managing infrastructure. Here are a few interesting features:
■ It enables you to compose and combine infrastructure resources to build and maintain
a desired state.
■ It provides for fast and easy deployment because of the declarative approach.
Declarative means that the user defines the end state and Terraform takes the appro-
priate actions to achieve the desired outcome. The SCMs we discussed earlier take an
imperative or procedural approach, meaning that the user defines a series of steps or
tasks that the system must follow. Ansible uses an imperative approach as a configura-
tion management tool. However, Ansible does have orchestration capabilities using the
declarative approach. For this reason, we like to think of Ansible as a hybrid.
As seen in Figure 14-4, the Core provides the simplicity and the sophistication for the
system. It defines a configuration language (HashiCorp Configuration Language, or HCL),
rules of operations, and plug-ins. Providers, on the other hand, are an abstraction that uses
a plug-in to interact with the cloud service providers. A common example of providers would
be Google Cloud (GCP), Microsoft Azure, or Amazon Web Services (AWS).
[Figure 14-4 shows Terraform Core consuming configuration files and Terraform state (the current state) and interacting, through providers, with Google Cloud, Azure, AWS, and other IaaS platforms, with contributions from the developer community.]
Figure 14-4 High-Level Terraform Architecture
Terraform has a simple lifecycle. As a matter of fact, the core Terraform workflow is “write,
plan, apply.” In practical coding terms, Figure 14-5 through Figure 14-8 explain the lifecycle.
[Figure 14-5 depicts the write, plan, and apply stages of the lifecycle, with errors feeding back into earlier stages.]
Figure 14-5 Terraform Lifecycle of IaC
■ Init: Initialize workflows and build directories of all configuration files to represent
your environments. This is where you write configuration files.
■ Plan: Create the execution plan for desired state(s). During the development cycle, it is
recommended to repeatedly run plan to flush out errors as you go.
■ Resources: This term refers to a block of one or more infrastructure objects (compute
instances, virtual networks, and so on), which are used in configuring and managing
the infrastructure.
■ Output values: These return values of a Terraform module can be used by other con-
figurations.
■ Functions: The configuration language includes built-in functions for performing string manipulations, evaluating math operations, and doing list comprehensions.
NOTE A great deal of information about Terraform is available online, and it is rapidly growing. We recommend you take a look at the open-source data at https://www.terraform.io/.
At the end of the day, you know your organization, the level of collaboration, the level
of adherence to process, the politics within your IT group, and what tool serves you best.
Which one you choose depends!
■ Maintainability
■ Availability
■ Usability
■ Security
■ Performance
■ Modularity
A majority of product failures are due to weak quality attributes. Applications or web apps that fail to meet the performance or security thresholds expected by their customers will fail.
So, what do you do about quality attributes?
Architectural Decisions
The answer to the question about quality attributes is simple: List all quality attributes that
are most important to your stakeholders in the order of priority and prepare for trade-offs.
You will not be able to treat them all the same. For example, you want the highest level of
security? In that case, prepare to give up some performance or compromise on usability.
These are architectural decisions that you must make, and you must clearly document them. This is the only way others can get into the minds of the architects and understand how they arrived at these decisions and what trade-offs were allowed. There are many ways to do this—whether
as an extension of your SCM tooling (see the previous section) or through development plat-
forms where the interpretation of the architecture into working code happens. For example, architectural decision records (ADRs) capture each decision, its context, and its consequences. The https://adr.github.io/ site has a few interesting articles and blogs from software engineering professionals on the topic. It's worth the time to check them out.
Technical Debt
It’s natural to discuss technical debt after architectural decisions. It is not unusual to work
against a go-to-market deadline or against system constraints that require you to make quick
or short architectural decisions that may or may not deviate from the main architecture prin-
ciples. In other words, small or short-term decisions that expedite the development process,
if not corrected at a later time, will turn into big or long-term problems. Simply speaking,
that’s technical debt.
Although the original idea was concerned with the code, rightly so, it was recently expanded
to include other aspects of the software development lifecycle (SDLC). In their book Con-
tinuous Architecture in Practice, Murat Erder, Pierre Pureur, and Eoin Woods categorized
technical debt into three distinct categories:
■ Code: Code written to meet an aggressive timeline, along with a few shortcut decisions made at the early stages, may be difficult to maintain and evolve (that is, to introduce new features).
■ Architecture: This debt comes as a result of architectural decisions made during the software development cycle. This type of technical debt is difficult to measure via tools but usually has a more significant impact on the system than other types of debt. It affects the scalability and maintainability of the architecture.
■ Production infrastructure: This debt relates to the environments and the build, test, and deployment tooling that deliver the software into production.
Figure 14-9 presents the problem and where it manifests itself and what effect it has on the
software quality.
[Figure 14-9 maps technical debt across architecture (architecture smells, structural complexity, pattern violations), code (code complexity, code smells, coding style violations), and production infrastructure (build, test, and deploy issues). New features and additional functionality drive evolution issues (evolvability), while low internal and external quality drive quality issues (maintainability) and defects, with a technological gap in between.]
Figure 14-9 Technical Debt Landscape (Source: M. Erder, P. Pureur, & E. Woods, Continuous Architecture in Practice, Addison-Wesley Professional, 2021)
Technical debt is visible and incurred by choice when you’re adding new features. You want
to integrate new features quickly to meet market demand or to improve the user experience.
Similarly, technical debt is also visible and difficult to deal with when you’re troubleshooting
technical or quality issues. However, it’s almost invisible when you’re working on these three
categories. Complex architecture is interpreted or simplified by programmers and tested or
deployed on an infrastructure also customized for the final product (debt included). This is
how technical debt gets compounded and becomes difficult to handle.
A few interesting terms in Figure 14-9 appear in the middle section marked “Technological Gap.” It clearly shows the bulk of the software development decisions that are made without an immediately visible impact on software quality. Architecture smells and code smells are both symptoms of bad decisions, bad code, or bad design that can solve a short-term problem but introduce software quality issues and technical debt that affect evolvability or maintainability in the long run.
References
https://standards.ieee.org/standard/828-2012.html
https://www.ansible.com/
https://www.terraform.io/
Hosting an Application on a
Network Device
■ Application Container Ideas: This section covers the concept of application con-
tainers and how they may be used along with cautions about where they may not be
optimal.
■ Best Practices for Managing Application Containers: This section covers best prac-
tices for managing your deployed application containers. When you get the container
bug, your next challenge is how to manage your growing inventory.
This chapter maps to the Developing Applications Using Cisco Core Platforms and APIs
v1.0 (350-901) Exam Blueprint Section 5.5, “Describe how to host an application on a net-
work device (including Catalyst 9000 and Cisco IOx-enabled devices).”
Application containers are a newish technology with a legacy tracing back to Sun Solaris Zones and BSD jails. Around 2000, shared-environment hosting providers developed the notion of
FreeBSD jails. They provided partitioning of the operating system into multiple independent
systems, or “jails.” Each independent environment could have its own IP address and separate
configuration for common applications like Apache and mail server. In 2004, Sun Microsys-
tems released Solaris Containers, which also leveraged a common underpinning operating
system with separations called “zones.”
This chapter provides information on containers running on network devices. You may find
suitable use cases in your environment by distributing the computing requirements and data
collection for your applications.
Foundation Topics
Benefits of Edge Computing
Edge computing is a design methodology in which computing resources are shifted from
remote, centralized data centers and the cloud to networked devices closer to users at the
edge of the network. Edge computing use cases include architecture, applications, and meth-
odologies such as 5G, Internet of Things (IoT), and streaming services.
Some applications have strict requirements for low latency. User experience for voice and
video applications suffers when there is lag due to long-distance network paths. Indeed, the longer the distance, the greater the potential for additional hop counts and routing decision points, which makes for a nondeterministic experience across different invocations. Gaming
applications also benefit from low-latency, low-loss networks (according to my son, Every.
Single. Day.).
The promise of new 5G networks with higher bandwidth and lower latency even enables
intelligent car navigation systems that require the best performance for public safety
interests.
The IoT community also benefits from pushing the reception, computation, and analysis of
sensor data closer to the device.
Next, let’s consider situations where polling at the edge of the network is desirable. In most
traditional network management situations, the element management systems (EMSs), plus
performance and fault management systems, are centralized, many times in a data center.
When those systems are responsible for polling or taking alerts in from hundreds, thousands,
or tens of thousands of devices, then extreme amounts of data can be transferred. Addition-
ally, the information is often duplicated or nonurgent. If the data is being collected regularly
for trending or accounting purposes, then the collection of every data point may be neces-
sary. However, if not, then great efficiencies of data transfer and retention can be achieved
by only transferring data that violates a threshold or policy. Distributing the collection, anal-
ysis, and exception-based alerting to the edge can greatly reduce the administrative network
management traffic.
Similarly, businesses can save network backhaul costs if their use case can be handled closer
to the source. If the information is regional in nature, there may be little need to serve and
distribute it from a centralized data store or cloud location.
Not all use cases are practical or economically feasible for edge computing, so careful
consideration must be given to the application requirements and costs before distributing
workloads. Some business decisions may support edge computing where there are higher
concentrations of users who do multimedia streaming, whereas other locations may still be
served from consolidated compute farms in a centralized data center because of lower user
count.
Virtualization Technologies
Edge computing can be implemented on bare-metal compute nodes near the data consum-
ers; however, the use of compute virtualization technologies is more common for flexibility
of service delivery and resource utilization. Several virtualization technologies exist today:
Type-1 and Type-2 hypervisors, Linux Containers (LXC), and Docker containers. We
cover the nuances of each, but for purposes of the DEVCOR exam, the Cisco support of
LXC and Docker container solutions is of key importance. In any of these virtualization
methods, a hosting system shares its finite resources with a guest environment that can
take a portion of those resources—CPU, memory, or disk—for its own application ser-
vicing purposes. The virtualization technologies differ in the level of isolation from the foundational hypervisor or operating system, and in the isolation among the separate virtualized environments.
Type-1 Hypervisors
The architecture of Type-1, or native (or bare-metal), hypervisors typically involves the
hypervisor kernel acting as a shim layer between the underlying hardware serving compute,
memory, network, and storage, from the overlying operating systems. Sample solutions are
Microsoft Hyper-V, Xen, and VMware ESXi. Figure 15-1 depicts the architecture involving a
Type-1 hypervisor.
[Figure 15-1 depicts guest operating systems running over a hypervisor that runs directly on the hardware.]
Type-2 Hypervisors
The architecture of Type-2, or hosted, hypervisors involves running the hypervisor over
the top of a conventional hosted operating system (OS). Other applications, besides the
hypervisor, may also run on the hosted OS as other programs or processes. One or more
guest operating systems run over the hypervisor. Figure 15-2 shows the architecture of a
Type-2 hypervisor.
[Figure 15-2 depicts guest operating systems running over a hypervisor that is hosted by a conventional operating system on the hardware.]
[Figure 15-3 depicts Linux Containers, each with its own binaries/libraries, sharing the underlying host system and hardware.]
Docker Containers
Docker is another type of containerized virtual environment. A Docker runtime is executed
on the hosting operating system and controls the deployment and status of the contain-
ers. Docker containers are lightweight; you don’t have the administrative burden of setting
up virtual machines and environments. Docker containers use the host OS but don’t have a
separate kernel to run the containers; this is shown in Figure 15-4. They use the same resources as the host OS. Docker also uses namespaces and control groups, as you saw with Linux
Containers.
[Figure 15-4 depicts application containers running on the Docker engine over the host and its hardware.]
■ Where are you trying to improve user experience by reducing latency due to network
distance and routing?
As mentioned in the previous section, an example could be monitoring network stats at the
edge, reporting to a central system only when out of norms. This model would relieve net-
work polling requirements, traffic, and data storage from centralized management tools. If
your business model doesn’t require high detail trending of network statistics, this may be a
viable option.
Measuring user experience through availability, latency, packet loss, and responsiveness
ensures visibility into what your customers are seeing. Consider the deployment model used
with the Cisco IP SLAs feature and the Cisco ThousandEyes technology; they use an agent-
based, source-destination deployment. If the monitoring source is as close to the network
edge as possible, then you benefit from the closest possible user experience.
When there are clustered users in a locale, it may make sense to keep their local, regional
data as close to them as possible. If the data being collected and processed involves person-
ally identifiable information (PII), then country data privacy and sovereignty regulations may
apply. In those cases, it makes sense to handle the data in the region without backhauling to
a regional or centralized corporate data center or cloud environment, potentially outside the
country policy.
Even if the localized or regional data is not governed by data privacy or sovereignty, it may
make sense to host the information closer to the interested users. If the “Woolly Worm Fes-
tival” is only interesting to consumers in Banner Elk, North Carolina, why store and serve it
in a data center in Seattle?
Another common use case for application hosting and containerization is with packet cap-
ture applications, such as Wireshark, embedded with a Cisco network switch. Now, no more
dragging out a physical packet sniffer device when needing to do deep packet inspection of
network traffic!
Syslog event messaging is an important monitoring service. Oftentimes it is used for alerting
on failure of a component, but it is also possible to alert on informational events, such as a
port or service being used. Security management is another prominent use case for syslog
event data. With the diversity of network component failures, informational service report-
ing, and security notification, syslog event data is often multiplexed to different consuming
management applications focused on function or domain. Another idea is to host a local
syslog management application in a container to optimize the management traffic and limit
redundancies. Other network devices (or application servers) that generate syslog event mes-
sages can forward to the containerized syslog receiver.
■ The Cisco IC3000 Industrial Compute Gateway running software release 1.2.1 or
higher for native Docker support for industrial IoT use cases.
■ The Cisco IR809 and 829 platforms running Cisco IOS-XE release 15.6(1)T1 or higher
for mobile IoT use cases.
■ The Cisco IR1101 Integrated Services Router (ISR) platform running Cisco IOS-XE
release 17.2.1 or higher for ARM-based IoT gateway use cases.
■ The Cisco IE3400 switch platform running Cisco IOS-XE release 17.2.1 or higher for
ruggedized industrial IoT use cases.
■ The Cisco IE4000 switch platform running Cisco IOS release 15.2(5)E1 or higher for ruggedized industrial IoT use cases. Note that the IE4010 does not support Cisco IOx.
■ The Catalyst 9000 series switch platform running Cisco IOS-XE release 16.2.1 or
higher for diverse LAN access deployment use cases. This platform is most generally
covered in this book and the DEVCOR exam.
■ Catalyst 9404 and 9407 switches with Cisco IOS XE 17.1.1 release
■ Catalyst 9500 High Performance and 9600 series switches with Cisco IOS XE
17.5.1 release
NOTE The new AppGigabitEthernet interface was introduced on the Catalyst 9300/9400
for dedicated application traffic. Catalyst 9500/9600 switches do not support the AppGiga-
bitEthernet interface. Containers that need external network connectivity must use a man-
agement interface through loopback from any front-panel port.
■ The CGR1000 Compute Module for CGR1000 series routers running Cisco IOS
release 15.6(3)M2 for edge compute enablement on Connected Grid Routers
(SmartGrid deployments).
■ The Cisco IW6300 heavy duty series access points managed by Cisco Wireless
Controller (WLC) release 8.10 or higher.
■ The Cisco Nexus series switches running Cisco NX-OS Release 9.2(1) or higher.
[Figure: the IOx application lifecycle actions—install/uninstall, activate/deactivate, and start/stop.]
NOTE If you prefer to go the DevNet Sandbox Lab route, navigate to https://devnetsandbox.cisco.com and use the search bar to find options with “DNA Center,” “IOS XE,” or “Catalyst 9.” The intent is to find an available lab with Catalyst 9300 platforms run-
ning IOS XE 16.12 or higher—17.3, ideally. The Always On labs would be the most imme-
diately available but may not have all the permissions or dependencies needed for these
examples (such as SSD flash drives). Alternatively, reservable labs like the ones for DNA Cen-
ter, including Catalyst 9300s, would provide more access but may also be reserved, requiring
you to schedule a timeslot in the future.
There are several steps to implementing a Docker container on a network switch: validate
prerequisites, enable the application hosting framework, and install/activate/start the app.
Validating Prerequisites
For a Catalyst 9k switch, ensure you are running at least IOS XE release 16.12.1; release
17.3.3 or higher is preferable, in case you wish to deploy a Cisco ThousandEyes agent as a
Docker container image. Example 15-1 provides the process and sample output for validating
the running IOS XE release.
NAME: "Switch 1 - Power Supply B", DESCR: "Switch 1 - Power Supply B"
PID: PWR-C1-350WAC-P , VID: V01 , SN: ART********
NAME: "Switch 1 FRU Uplink Module 1", DESCR: "8x10G Uplink Module"
PID: C9300-NM-8X , VID: V02 , SN: FJZ********
At this point, ensure you have the DNA-Advantage subscription licensing enabled:
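One quick way to confirm is the license summary (the command shown here is a sketch, and its output is abbreviated):
cat9k# show license summary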
License Usage:
-----------------------------------------------------------------
Next, enable the IOx application hosting framework from configuration mode:
cat9k(config)# iox
cat9k(config)#
Verify the application hosting services are running by using the IOS XE CLI command
shown in Example 15-3.
Example 15-3 Verifying the IOx Application Hosting Infrastructure Services
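A representative check resembles the following sketch; the exact service list and versions vary by IOS XE release:
cat9k# show iox-service
IOx Infrastructure Summary:
---------------------------
IOx service (CAF)         : Running
IOx service (HA)          : Running
IOx service (IOxman)      : Running
IOx service (Sec storage) : Running
Libvirtd                  : Running
Dockerd                   : Running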
If the highlighted services do not show as Running, toggle the IOx configuration in config
mode with no iox; iox to restart the services.
Be patient when you run the following IOS XE CLI command to review the status and list of
hosted applications. It may take several minutes for the application hosting environment to
be fully ready, even if reporting that no applications are registered.
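The command in question is show app-hosting list; with nothing deployed yet, it returns only the message that follows:
cat9k# show app-hosting list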
No App found
Now that you have an environment ready to host a Docker container, you must decide what
application to host. Several Cisco-validated open-source options are available at https://developer.cisco.com/app-hosting/opensource/.
You are free to pick other Docker container options, but for this exercise, choose iPerf
because it is a useful utility. iPerf3 is an open-source, cross-platform, and CLI-based utility
that performs active measurements between iPerf endpoints to gauge bandwidth and loss
statistics. It supports IPv4 and IPv6 endpoints.
The first action is to download the iPerf Docker container code to the usbflash1: file-system.
We also suggest you familiarize yourself with the project notes from https://hub.docker.com/r/mlabbe/iperf3.
Pulling the Docker image directly from Docker hub to an IOx-enabled Cisco platform is not
currently supported. However, you can pull the Docker image down to another system that
already has the Docker utilities—a PC, laptop, or Linux VM—and save the Docker image
as a tar archive. Example 15-4 shows the process of using a separate computer with Docker
installed to pull the iPerf image and save it as a tar file for later transfer to the network
device.
Example 15-4 Using Docker Commands to Pull and Save the iPerf Container Image
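On a workstation that already has Docker, the sequence would resemble this sketch; the resulting iperf3.tar is the archive copied to the switch later in the chapter:
user@dockerhost:~$ docker pull mlabbe/iperf3
user@dockerhost:~$ docker save -o iperf3.tar mlabbe/iperf3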
You now have several deployment options. You can use Cisco DNA Center, the Cisco IOx
Local Manager, or you can do CLI-based actions. We cover each option in the following
sections.
Step 1. Check that device prerequisites are complete. Navigate to the main DNA Center
menu panel to Provision, into Services and App Hosting for Switches, as seen
in Figure 15-6.
If devices are Not Ready, you can hover over the status to get a summary of
what needs to be fixed. The summary does not list all discrepancies, so you
may need to take an iterative approach to fixing each issue until the device
becomes fully Ready. The prerequisites are also documented on this portal on
the Click Here link in the Information banner. Example 15-5 shows the content
of the prerequisites.
Example 15-5 DNA Center Guidance on Application Hosting Prerequisites
Prerequisites
To enable application hosting on a Cisco Catalyst 9000 device, the following prereq-
uisites must be fulfilled.
1. Configure a secure HTTP server on the switch where the applications will be
hosted.
2. Configure local or AAA based authentication server for the HTTPS user on the
switch. You must configure the username and the password with privilege level 15.
3. Ensure Cisco Catalyst 9300 Series switches are running Cisco IOS XE 16.12.x or
later version and Cisco Catalyst 9400 Series switches are running Cisco IOS XE
17.1.x or later version.
4. Ensure that the device has an external USB SSD pluggable storage. (Only for the
switches of 9300 family)
To verify that the configuration on the switch is correct, open the WebUI on the
switch and ensure that you can log in as the HTTPS user.
6. On Cisco DNA Center, configure the HTTPS credentials while manually adding the
device. The HTTPS username, password, and port number are mandatory for application
hosting. The default port number is 443. You can also edit the device credentials.
If you edit a device that is already managed, resynchronize that device in the
inventory before it is used for application hosting-related actions.
Step 2. Upload the container image into the DNA Center App Hosting software
repository.
Navigate to the main DNA Center menu panel to Provision, into Services and
App Hosting for Switches, as seen in Figure 15-6, resulting in the software
repository seen in Figure 15-8. There may or may not be existing app packages
based on prior use.
Figure 15-10 Cisco DNA Center Performing App Hosting Image Import
When the import into the DNA Center App Hosting repository is complete, the
portal updates to show the new app. In this situation, Figure 15-11 depicts this
as Mlabbe/Iperf3, the first app in the list.
Figure 15-11 Cisco DNA Center App Hosting After iPerf3 Import
Figure 15-13 Selecting Devices for App Hosting Deployment in Cisco DNA Center
Step 5. Configure host-specific deployment options—Network Settings.
The next portal to appear is the Configure App, where Network Settings can be
defined. If the app doesn’t expose network services or if your environment is
appropriately configured to serve DHCP IP addresses, then this step is trivial. In
scenarios using static IP addresses, there are several more steps. First, you must
click the Export button to download a CSV file defining the network specifica-
tions. Figures 15-14 and 15-15 show these steps.
Figure 15-17 Provisioning iPerf Application Hosting Instance
The workflow continues to show a provisioning status screen, as seen in
Figure 15-18. The amount of time to complete depends on the number of sites, buildings, and switches, and on the size of the container apps that must be uploaded to
each host switch.
shows this portal, and Figure 15-20 shows the selection of a specific hosting
switch for its individual app details and status.
At this point, you can interact with the container app. Navigate down to the
“Interacting with App Hosted iPerf3” section if you prefer. The following
sections cover a similar process of getting the container app into the host-
ing switch using alternative IOx Local Manager or command-line interface
methods.
Figure 15-23 Using Cisco IOx Local Manager to Import iPerf3 App
After you click the OK button, the system starts importing the Docker app, as seen in
Figure 15-24, resulting in the message shown in Figure 15-25.
Figure 15-26 Activating the iPerf3 Image in Cisco IOx Local Manager
Figure 15-27 Updating iPerf3 Image Network and Docker Options in Cisco IOx Local
Manager
Now that the Docker container app has the desired guest parameters, you can fully activate
it by clicking the Activate App button in the upper-right corner, as seen in Figure 15-28. This
shows a processing pop-up, like the one in Figure 15-29.
Figure 15-28 Activating the iPerf3 App in Cisco IOx Local Manager
Figure 15-29 Activation Progress of iPerf3 App in Cisco IOx Local Manager
With an activated application, you can now start it by clicking the Start button, as shown in
Figure 15-30. You then see a processing pop-up, like the one in Figure 15-31.
Figure 15-30 Starting the iPerf3 App in Cisco IOx Local Manager
Figure 15-31 Startup Progress of iPerf3 App in Cisco IOx Local Manager
Finally, the IOx Local Manager shows a fully running Docker container app, as in
Figure 15-32.
Figure 15-32 Final Startup Status of iPerf3 App in Cisco IOx Local Manager
Jump ahead to the “Interacting with App Hosted iPerf3” section to interact with the running
Docker container app.
Step 1. Copy the tar file to the switch usbflash1: filesystem using available file trans-
fer protocols, such as TFTP, SCP, FTP, RCP, HTTP, or HTTPS. If you need a
refresher, reference this document for Catalyst 9300 and the minimum IOS XE release 16.12 at http://cs.co/9004Jryo2#concept_lqt_ltk_l1b.
Step 2. Configure the AppGigabitEthernet1/0/1 interface.
Application hosting in IOS XE on Catalyst 9300/9400 includes a new App-
GigabitEthernet interface, specifically, AppGigabitEthernet1/0/1, which acts as
the bridge between the physical switch networking and the virtualized Docker
application.
Depending on your use case, you can trunk all switch VLAN traffic, a subset,
or just a single VLAN. For this educational use case, you can trunk but allow
just one VLAN:
!
interface AppGigabitEthernet1/0/1
switchport trunk allowed vlan 100
switchport mode trunk
end
Step 3. Map any Docker application virtual network interfaces (vNICs). For this iPerf
example, there is only one, but there could be other scenarios where your
Docker app may have a “management” interface that is separate from a “sniffer”
or “data-traffic” interface. Pay attention to any VLAN, guest IP address, gate-
way, and Docker resource dependencies.
app-hosting appid iPerf3
app-vnic AppGigabitEthernet trunk
vlan 100 guest-interface 0
guest-ipaddress 10.10.20.101 netmask 255.255.255.0
app-default-gateway 10.10.20.254 guest-interface 0
app-resource docker
run-opts 1 "-p 5201:5201/tcp -p 5201:5201/udp"
Unpacking this set of commands a bit, you can see that it gives the Docker container app an application ID (appid) of iPerf3. You're attaching the application's virtual network interface card (vNIC) to the AppGigabitEthernet trunk. Correspondingly, it assigns the application's guest interface, traditionally Ethernet0, as instance 0 mapped to VLAN 100, matching the earlier interface AppGigabitEthernet1/0/1 settings. In this use case, the intention is to give the guest app a static IP address, so the address, netmask, and default gateway are supplied. DNS is optional, and DHCP-served parameters are also supported. Finally, the Docker resource options port forward TCP and UDP port 5201 on the hosting switch to port 5201 inside the guest app, which is the default iPerf3 listening port.
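Once the app is installed and started in the following steps, a small Python sketch can confirm that the forwarded TCP port is reachable from a nearby host. The guest address 10.10.20.101 comes from the configuration above; adjust it (and the port) for your environment.

import socket

GUEST_IP = "10.10.20.101"   # guest address from the app-hosting configuration above
PORT = 5201                 # default iPerf3 listener port

# Attempt a TCP connection to the forwarded port; connect_ex() returns 0 on success
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.settimeout(3)
    result = s.connect_ex((GUEST_IP, PORT))

print("iPerf3 port reachable" if result == 0 else f"connect failed (code {result})")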
Step 4. Install the iPerf3 docker app by referencing its flash file-system location. You
can follow the installation with commands to verify the status:
cat9k# app-hosting install appid iPerf3 package
usbflash1:iperf3.tar
Installing package 'usbflash1:iperf3.tar' for 'iPerf3'.
Use 'show app-hosting list' for progress.
cat9k#
Again, remember the show app-hosting list command may take some time to
process.
Step 5. Activate the docker app, referencing it by application ID (appid).
cat9k# app-hosting activate appid iPerf3
iPerf3 activated successfully
Current state is: ACTIVATED
cat9k#
Step 6. Finally, run the iPerf3 docker app with the app-hosting start directive:
cat9k# app-hosting start appid iPerf3
iPerf3 started successfully
Current state is: RUNNING
Again, you can double-check the status with other app-hosting commands.
Example 15-6 shows a couple of commands to verify status.
Example 15-6 Reviewing Container Status with show app-hosting Commands
Application
Type : docker
Name : mlabbe/iperf3
Version : latest
Description :
Path : usbflash1:iperf3.tar
URL Path :
Activated profile name : custom
Resource reservation
Memory : 1024 MB
Disk : 10 MB
CPU : 3700 units
VCPU : 1
Attached devices
Type Name Alias
---------------------------------------------
serial/shell iox_console_shell serial0
serial/aux iox_console_aux serial1
serial/syslog iox_syslog serial2
serial/trace iox_trace serial3
Network interfaces
---------------------------------------
eth0:
MAC address : 52:54:dd:56:a:40
IPv4 address : 10.10.20.101
Network name : mgmt-bridge100
Docker
------
Run-time information
Command :
Entry-point : iperf3 -s
Run options in use : -p 5201:5201/tcp -p 5201:5201/udp
Package run options :
Application health information
Status : 0
Last probe error :
Last probe output :
Now that you have a fully running iPerf3 Docker application, you can proceed to interacting
with the app either as a server endpoint or as an embedded client.
Figure 15-33 shows the topology use case options:
■ A local laptop, server, or virtual machine with the iPerf3 utility as a client, targeting a local Cisco Catalyst 9300 running the iPerf3 docker-app in IOx Application Hosting
■ A local laptop, server, or virtual machine with the iPerf3 utility as a client and the OpenConnect VPN software, targeting a Cisco Sandbox Lab environment with a Cisco Catalyst 9300 running the iPerf3 docker-app in IOx Application Hosting
Figure 15-33 Testing iPerf3 Environment Options with a Catalyst Switch as the Server
Example 15-7 Running the iPerf3 Utility as a Client on the Local System
iperf Done.
Now let’s go from the embedded Docker app running on the switch using iPerf3 out to
another iPerf3 system running in server (-s) mode. The iPerf project page documents some
publicly accessible iPerf servers if you don’t have any other systems available (see https://
iperf.fr/iperf-servers.php).
Figure 15-34 shows the topology use case options:
■ A local laptop, server, or virtual machine with the iPerf3 utility as a server, receiving client traffic from a local Cisco Catalyst 9300 running the iPerf3 docker-app in IOx Application Hosting
■ A local Cisco Catalyst 9300 running the iPerf3 docker-app in IOx Application Hosting as a client, targeting Internet-based iPerf3 servers
■ A local laptop, server, or virtual machine with the iPerf3 utility as a server and the OpenConnect VPN software, receiving client traffic from a Cisco Sandbox Lab environment with a Cisco Catalyst 9300 running the iPerf3 docker-app in IOx Application Hosting as a client
NOTE A Cisco Sandbox environment sending traffic out the general Internet is not
displayed in this example because it is not supported.
Figure 15-34 Testing iPerf3 Environment Options with a Catalyst Switch as the Client
Example 15-8 shows the steps you need to run from the Catalyst 9300 to access the iPerf3
container-app and run the iPerf3 utility. In this example, the server would be the local PC,
server, or virtual machine running in server mode with iperf3 -s and a local Catalyst 9300
targeting that local system. Sending traffic to Internet-based iPerf servers is possible if your
system allows traffic to the Internet.
Example 15-8 Accessing the Catalyst Switches’ iPerf3 Container-App and Running as
a Client
iperf Done.
/ $
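If you want to script these traffic tests instead of typing them interactively, the client run can be wrapped in a short Python sketch. This is a minimal example under stated assumptions: the iperf3 binary is installed on the local laptop, server, or VM, and the target is the container app's address from the earlier example (10.10.20.101); adjust both for your environment.

import json
import subprocess

TARGET = "10.10.20.101"   # assumed container-app IP from the earlier example
PORT = "5201"             # default iPerf3 listener port

# Run iperf3 as a client with JSON output (-J) so the results are parseable
proc = subprocess.run(
    ["iperf3", "-c", TARGET, "-p", PORT, "-t", "10", "-J"],
    capture_output=True, text=True, check=True
)

result = json.loads(proc.stdout)
sent_bps = result["end"]["sum_sent"]["bits_per_second"]
recv_bps = result["end"]["sum_received"]["bits_per_second"]
print(f"Sent: {sent_bps / 1e6:.1f} Mbps, Received: {recv_bps / 1e6:.1f} Mbps")

Running a sketch like this on a schedule and logging the throughput numbers gives you a rudimentary, repeatable traffic test toward the hosting switch.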
Now that you have a taste for using traffic testing functions, what other workloads can you
think of? Would it be useful to run a local syslog receiver with more advanced filtering and
forwarding as a local Docker container app? How about using the robust ThousandEyes Enterprise agent to gain much richer observability?
As one more example of this exciting container functionality, let’s build a container in which
you run a local syslog receiver in the Catalyst 9300. This container-app will run the popular,
feature-rich Syslog-NG utility and do better event message handling, filtering, and forward-
ing than the native IOS XE can do. You can forward the hosting Catalyst 9300’s syslog
messages to the container. It is possible to have other devices near the hosting system also
forward their messages to the container. You can provide a Syslog-NG configuration file that
■ Sends security-focused ACL violations to the security management tools, like Cisco
Secure Network Analytics (formerly Stealthwatch)
This model would be effective for regional collection, filtering, and follow-on forwarding to
centralized systems.
First, you need to access a system that has a Docker engine running on it to build a container
app. You can use something familiar like Ubuntu, Red Hat Enterprise Linux, or CentOS to
build the container. Then create a project directory and, inside it, a Dockerfile for the image; Example 15-9 shows its contents (the ADD, HEALTHCHECK, and ENTRYPOINT lines are reconstructed here to match the description that follows).
Example 15-9 Dockerfile for the Syslog-NG Container Image
FROM balabit/syslog-ng:latest
ADD syslog-ng.conf /etc/syslog-ng/syslog-ng.conf
EXPOSE 514/udp
EXPOSE 601/tcp
HEALTHCHECK CMD /usr/sbin/syslog-ng-ctl stats || exit 1
ENTRYPOINT ["/usr/sbin/syslog-ng", "-F"]
In Example 15-9 the FROM directive pulls the balabit/syslog-ng:latest image from Docker Hub. You use ADD to copy a syslog-ng.conf file, which you define in the next step, to the container's /etc/syslog-ng/syslog-ng.conf location. You then use EXPOSE to
expose the traditional Syslog UDP/514 port, which is typical for the network equipment,
but you also use EXPOSE to expose and listen to TCP/601 for any reliable syslog event
messages coming from other sources. Then you define a HEALTHCHECK function that
periodically checks the operation of Syslog-NG. Finally, you use ENTRYPOINT to execute
the Syslog-NG utility.
The Dockerfile is complete and can be saved, but you need to define the syslog-ng.conf file
that was referenced in the Dockerfile before you can build the image. Example 15-10 depicts
a suggested syslog-ng.conf file. You can use an editor to create this syslog-ng.conf file in the
same directory as the Dockerfile.
Example 15-10 syslog-ng.conf File
@version: 3.35
@include "scl.conf"

# Network source (reconstructed to match the description that follows):
# default drivers listening on UDP/514, TCP/601, and TCP/6514
source s_net {
    default-network-drivers();
};

filter f_ACL-violation {
    match("SEC-6-IPACCESSLOG" value("MESSAGE")) or
    match("SYS-5-PRIV_AUTH_FAIL" value("MESSAGE"))
};

# Destinations (reconstructed): replace the < > entries for your environment
destination d_securityapp { network("<security tool IP>" transport("udp") port(514)); };
destination d_DNAC { network("<DNA Center IP>" transport("udp") port(514)); };
destination d_LOGArchive { network("<archive Syslog-NG server IP>" transport("tcp") port(601)); };
destination d_localfile { file("/var/log/${YEAR}.${MONTH}.${DAY}/messages" create-dirs(yes)); };

log {
    source(s_net);
    filter(f_ACL-violation);
    destination(d_securityapp);
};

log {
    source(s_net);
    destination(d_DNAC);
    destination(d_LOGArchive);
    destination(d_localfile);
};
The first lines of the syslog-ng.conf file are administrative but necessary. If you are pulling a different version of the Syslog-NG utility, then specify that version; if it is newer than v3.35, you can leave the line as is. The @include "scl.conf" line is necessary for the network drivers source seen next.
The source s_net definition pulls in default network drivers that listen on several ports, such as UDP/514, TCP/601, and TCP/6514 (for syslog over TLS). It also includes a standard Cisco parser. You can define other ports to suit your needs.
The filter f_ACL-violation definition includes a pattern match for a typical login and ACL
violation message with SYS-5-PRIV_AUTH_FAIL and SEC-6-IPACCESSLOG. You can
define even more by using additional or conditions.
The destination d_securityapp definition specifies the target syslog receiver for security
messages. Change the entry, including the < >s, to reflect your environment.
The destination d_DNAC definition specifies a target syslog receiver for Cisco DNA Center.
Change the entry, including the < >s, to reflect your environment.
The destination d_LOGArchive definition specifies a target syslog receiver for another
Syslog-NG server for archival purposes. Change the entry, including the < >s, to reflect your
environment.
The destination d_localfile definition specifies a target file, which is in the switch’s Docker
container in the /var/log/ directory and with subdirectories by Year, Month, and Date.
Dynamic directories are created, such as /var/log/2022.01.01/messages, for all incoming
syslog event messages received on January 1, 2022.
The first log definition takes in messages from the network source and matches any with the
ACL-violation filter and then sends to the security app destination.
The second log definition takes in messages from the network source and sends all messages
to the DNAC, LOGArchive, and localfile destinations. No filter is defined, so no messages
are excluded.
Now, you should save the syslog-ng.conf file in the same directory as the Dockerfile.
Example 15-11 shows the process of building the container image.
Example 15-11 Building the Syslog-NG Container Image
Use the docker save command previously described to create a .tar image of the container
on the local system. Then copy the .tar file to the switch and install, activate, and start it with the same app-hosting steps shown earlier for iPerf3. You can verify that it is running with the show app-hosting list command:
App id State
---------------------------------------------------------
syslog_ng RUNNING
Then you can access the Syslog-NG container-app with the app-hosting connect command:
# cat /var/log/2021.10.26/messages
#
Finally, remember to forward the syslog event messages from your hosting system back to the assigned IP address of the Docker container (using the IOS XE logging host command).
It is a best practice to maintain a centralized software repository to manage the .tar and .tar.gz files and facilitate version control. The centralized software repository can also serve to host the images via TFTP or SCP, if desired. It is also suggested that you maintain a database or spreadsheet of which applications and versions are being deployed to which switches.
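As a lightweight starting point for that inventory, the record keeping can be as simple as a CSV file maintained by a script. The following minimal Python sketch uses hypothetical switch, app, and file names purely for illustration and appends one row per deployment:

import csv
from datetime import date

# Hypothetical deployment record: which app/version went to which switch
record = {
    "switch": "cat9k-branch-01",
    "app_id": "iPerf3",
    "image": "mlabbe/iperf3:latest",
    "deployed_on": date.today().isoformat(),
}

# Append the record to a simple CSV inventory file
with open("container_app_inventory.csv", "a", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=record.keys())
    if f.tell() == 0:          # write a header only for a brand-new file
        writer.writeheader()
    writer.writerow(record)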
For containerized apps that store files, such as syslog event messages, binary software images, database components, and so on, it is also a best practice to routinely monitor the usage of the switch's usbflash1: filesystem. Monitoring flash disk usage per container-app is best, but at a minimum, you should retrieve the overall flash disk usage to know when "spring cleaning" is necessary.
Availability monitoring of the container-apps is trivial if the apps expose an IP address to
ping or if they have APIs accessible for polling. If you are in control of the development and
deployment of the container-apps, consider building API endpoints that expose the app
name, version, resource deployment, uptime, CPU/memory/interface utilization, and disk usage.
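For example, if your container-app exposes a simple health endpoint (the /health URL and port here are hypothetical; IOx does not provide one for you), a small poller like the following Python sketch can verify availability and basic metadata:

import requests

APP_URL = "http://10.10.20.101:8080/health"   # hypothetical health endpoint

try:
    resp = requests.get(APP_URL, timeout=5)
    resp.raise_for_status()
    health = resp.json()    # e.g., {"app": "...", "version": "...", "uptime": ...}
    print(f"{health.get('app')} {health.get('version')} up {health.get('uptime')}s")
except requests.exceptions.RequestException as exc:
    print(f"Container app unreachable or unhealthy: {exc}")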
Performance monitoring of the container-apps depends on the embedded instrumentation
and telemetry. If the container-app has a lightweight Linux foundation with shell access,
then a simple Ansible script that accesses it via SSH to run relevant performance commands
would be appropriate. If you are in control of the development and deployment of the con-
tainer-apps, consider using a gRPC or AMQP message bus approach to stream telemetry and
messages from the remote app to a centralized collection and monitoring system.
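As one hedged illustration of the message-bus approach, the following Python sketch uses the pika library (an assumption; it is not part of this chapter's toolchain) to publish a simple telemetry sample from the container-app to a central AMQP (RabbitMQ-style) collector. The broker address and queue name are placeholders.

import json
import os
import time

import pika

BROKER = "10.10.20.30"          # placeholder AMQP broker/collector address
QUEUE = "container-telemetry"   # placeholder queue name

connection = pika.BlockingConnection(pika.ConnectionParameters(host=BROKER))
channel = connection.channel()
channel.queue_declare(queue=QUEUE, durable=True)

# Publish a simple telemetry sample; the 1-minute load average is a stand-in metric
sample = {"app": "syslog_ng", "ts": time.time(), "load1": os.getloadavg()[0]}
channel.basic_publish(exchange="", routing_key=QUEUE, body=json.dumps(sample))

connection.close()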
Fault monitoring of the container-apps is like the performance monitoring aspect: it depends
on the embedded instrumentation and telemetry. If the container-app is generating log
messages, then ensure a function like syslog-ng or logger is creating outbound copies for
collection, archiving, and analysis.
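If the container-app is your own Python code, one straightforward way to create those outbound copies is the standard library's SysLogHandler; the collector address below is a placeholder:

import logging
import logging.handlers

logger = logging.getLogger("container-app")
logger.setLevel(logging.INFO)

# Send a copy of every log message to a remote syslog collector (placeholder IP)
syslog = logging.handlers.SysLogHandler(address=("10.10.20.20", 514))
syslog.setFormatter(logging.Formatter("container-app: %(levelname)s %(message)s"))
logger.addHandler(syslog)

logger.warning("disk usage above 80 percent")   # example event message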
Security management of the container-apps varies on the services deployed in the container.
Remember that containers are not full-blown operating systems. The docker daemon and
Cisco IOx hosting services provide some protections. If your container-app provides shell access, then you should use basic practices of good password management, prioritizing pub-
lic key authentication over passwords. Be mindful of the ports and services exposed from
the container-app to the hosting device. In a containerized database scenario, you might con-
sider the application-level configuration protections: encryption, RBAC, and table/schema
partitioning. If the container-app is running a more feature-rich base like Ubuntu Linux and
the app data is sensitive or proprietary, it may make sense to implement host-based firewalls
like iptables, firewalld, or UFW. Note, however, that Docker does manipulate and control
many networking interactions, so you need to review documentation associated with the
version of dockerd running on the host. Nginx, HAProxy, and other proxies commonly used with containers prevent the container-app from seeing the remote client IP address, so dynamic tracking and blocking tools (such as fail2ban) are not effective without using Layer-2/bridging
connectivity. Finally, it might be warranted to run a container-app firewall appliance and
chain the other container-app traffic through it. Keep in mind the best practices learned with traditional, hardware-based architectures, because many of them can be implemented with virtualized software.
As mentioned earlier, inventory and asset management are important. Keeping container-app versions current with the newest features, bug fixes, and patches is an additional administrative task. Routine asset scans to validate the asset tracking system or CMDB are desirable.
References
URL QR Code
https://devnetsandbox.cisco.com/
https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/prog/
configuration/173/b_173_programmability_cg/guest_shell.html
https://developer.cisco.com/app-hosting/opensource/
https://hub.docker.com/r/mlabbe/iperf3
https://developer.cisco.com/docs/iox/#!package-format/
iox-application-package
http://cs.co/9004Jryo2#concept_lqt_ltk_l1b
https://iperf.fr/iperf-servers.php
Cisco Platforms
■ Firepower: This section covers practical examples of using the Firepower Manage-
ment Center APIs.
■ Meraki: This section covers practical examples of using the Meraki API and SDKs.
■ Intersight: This section covers practical examples of using the Intersight API and
SDKs.
■ UCS Manager: This section covers practical examples of using the UCS Manager API
and SDKs.
■ DNA Center: This section covers practical examples of using the DNA Center API
and SDKs.
■ AppDynamics: This section covers practical examples of using the AppDynamics API.
This chapter does not map to a specific section of the Developing Applications Using
Cisco Core Platforms and APIs v1.0 (350-901) Exam Blueprint. However, the examples in
this chapter may be instructional for other components of the exam. Specifically, it is ben-
eficial for you to understand the common product application programming interfaces, how
they authenticate, and some basic function methods. This chapter provides those examples
that should augment your continuous learning process.
5. Which one of the following would be a correct authentication body for the Meraki
Dashboard v1 API?
a.
{
}
b.
{
"Authentication": "Bearer <MERAKI_DASHBOARD_API_KEY>"
}
c.
{
}
d.
{
}
6. Which of the following is the correct method invocation for the Meraki Python SDK
to obtain device information from a specific organization?
a. api.organizations.getOrganizationDevices(organization_id)
b. api.devices.getDevices(organization_id)
c. api.getOrganizationDevices()
d. api.organizations.<orgid>.getOrganizationDevices()
7. What pieces of information must be present to successfully authenticate to the Inter-
sight REST API?
a. Username and password
b. API key and password
c. MD5-encoded bearer token
d. API key and private key
8. What tools or SDKs exist to interact with the Intersight APIs?
a. REST APIs
b. Python SDK, PowerShell SDK
c. Ansible, Terraform
d. All of these answers are correct.
9. What data format(s) of request and response are supported by the UCS Manager API?
a. JSON only
b. XML only
c. JSON and XML
d. RESTCONF
10. Which statement about the UCS PowerTool is correct?
a. It is supported only on Windows operating systems.
b. It requires a manually collected API key to connect to the UCS Manager instance.
c. It allows output to be filtered to only specific object values.
d. It displays output data in native XML format by default.
11. Direct REST API usage with DNA Center requires which type of header?
a. An authorization header with a basic authentication string
b. An authentication header with a basic authentication string
c. A bearer token authentication header
d. An OAuth token authorization header
12. What is the name of the community-supported Python SDK that should be pip
installed and imported into your Python script for DNA Center SDK usage?
a. ciscodnacsdk
b. dnacsdk
c. ciscodnac
d. dnacentersdk
13. In AppDynamics authorization, what is the difference in usage between client secrets
and temporary access tokens? (Choose two.)
a. The client secret is generated through the API.
b. A client secret is used to generate a short-lived access token via the API.
c. The temporary access token is generated in the WebUI and is generally a longer-
term assignment.
d. The client secret can be defined by the user.
e. The temporary access token can be defined by the user.
14. In what format is the AppDynamics API default output encoded?
a. JSON
b. SAML
c. XML
d. YAML
Foundation Topics
Webex
Webex by Cisco is the industry’s leading collaboration platform, providing users with a
secured unified experience of meeting, calling, and messaging in one app. Webex brings
virtual meetings to life by allowing users to stay connected with video conferencing that’s
engaging, intelligent, and inclusive. Webex Calling provides a fully integrated, carrier grade
cloud phone system that keeps users connected from anywhere. Webex Messaging keeps the
work flowing between meetings by bringing everyone together with secure and intelligent
messaging organized by workstreams. The Webex Events portfolio delivers solutions for any
occasion, any size audience, anywhere—from town halls and customer webinars to multises-
sion conferences and networking events.
Webex provides an extensive set of REST APIs and SDKs to build Webex-embedded apps,
bots, integrations, widgets, or guest issuer apps. These APIs and SDKs provide your applica-
tions with direct access to the Cisco Webex platform, which includes a suite of features and
applications for performing administrative tasks, meetings, calling, messaging, device con-
figuration, and Webex Assistant. Webex provides SDKs for iOS, Android, Web Apps, Node.
js, and Java. Several other community-developed SDKs are available for other languages.
Step 1. You need a Webex account backed by Cisco Webex Common Identity (CI).
You can sign up for a free Webex account at https://www.webex.com/.
Step 2. Using your Webex credentials, access the developer dashboard at https://developer.webex.com/ (see Figure 16-1).
When your Webex bot or integration is ready, you can publish it on Webex App Hub, where
existing Webex users can browse the App Hub and find your apps (see Figure 16-6).
Figure 16-6 Existing Webex Apps (App Hub)
You can access the Webex App Hub submission process at https://developer.webex.com/
docs/app-hub-submission-process.
API Examples
Using the Webex REST API (documented at https://developer.webex.com/docs/getting-
started) or through Postman collections (available at https://github.com/CiscoDevNet/
postman-webex) is simple. The first activity is to understand the Webex API authentication
process. When you make requests to the Webex REST API, an authentication HTTP header
is used to identify you as the requesting user. The header must include an access token—
whether a personal access token, bot token, or OAuth token—as described previously. The
token is supplied as bearer authentication, sometimes called token authentication.
{
  "Authorization": "Bearer <ACCESS_TOKEN>"
}
It is important to note that personal access tokens are time-limited to 12 hours after you log
in to the site. You can extract your PAT from the getting-started URL at https://developer.
webex.com/docs/getting-started.
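As a quick sanity check that your token is still valid, you can call the Webex people/me resource. This short sketch assumes the token was exported beforehand as a WEBEX_ACCESS_TOKEN environment variable (an assumption; use whatever secure storage you prefer):

import os
import requests

token = os.environ.get("WEBEX_ACCESS_TOKEN", "")
resp = requests.get(
    "https://webexapis.com/v1/people/me",
    headers={"Authorization": f"Bearer {token}"},
)

if resp.status_code == 200:
    print("Token OK for:", resp.json().get("displayName"))
else:
    print("Token rejected:", resp.status_code)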
As a simple example, let’s assume that you want to determine what Webex rooms a bot is
part of. You can find the Webex REST API for Messaging and Rooms directly at https://
developer.webex.com/docs/api/v1/rooms (see Figure 16-7).
Example 16-1 Python Script to List Webex Rooms
import requests
import os
import json
import pprint

pp = pprint.PrettyPrinter(indent=4)

# Read the access token from an environment variable (an assumption; export
# your personal access token, bot token, or OAuth token before running)
ACCESS_TOKEN = os.environ.get("WEBEX_ACCESS_TOKEN", "")

url = "https://webexapis.com/v1/rooms"

payload = {}
headers = {
    'Authorization': 'Bearer ' + ACCESS_TOKEN
}

response = requests.request("GET", url, headers=headers, data=payload)
json_response = json.loads(response.text)
pp.pprint(json_response)
By running this script, you obtain the output shown in Example 16-2.
Example 16-2 Output of Running the Script Shown in Example 16-1
SDK Examples
The Webex software development kit is available for iOS, Android, Web Apps, Node.js, and
Java, making it easy for you to integrate Webex functionalities within your own mobile and
web applications (see Figure 16-8). In addition to these, you can access community-
developed SDKs for other language applications on GitHub at https://github.com/
CiscoDevNet/awesome-webex-client-sdk.
Step 3. Create a new file, Podfile, with the content shown in Example 16-3 in your
MyWebexApp project directory.
Example 16-3 Webex iOS SDK Integration
source 'https://github.com/CocoaPods/Specs.git'
use_frameworks!
target 'MyWebexApp' do
platform :ios, '13'
pod 'WebexSDK'
end
target 'MyWebexAppBroadcastExtension' do
platform :ios, '13'
pod 'WebexBroadcastExtensionKit'
end
Step 4. Install the Webex iOS SDK from your MyWebexApp project directory:
pod install
Step 5. To your app’s Info.plist, add the GroupIdentifier entry with the value as your
app’s GroupIdentifier. This step is required so that you can get a path to store
the local data warehouse.
Step 6. If you plan to use the WebexBroadcastExtensionKit, you also need to add a
GroupIdentifier entry with the value as your app’s GroupIdentifier to your
Broadcast Extension target. This step is required so that you can communicate
with the main app for screen sharing.
Step 7. Modify the Signing & Capabilities section in your Xcode project, as shown in
Figure 16-9.
The following examples show how to use the iOS SDK in your iOS app:
Step 1. Create the Webex instance using Webex ID authentication (OAuth-based), as
shown in Example 16-4.
Example 16-4 Creating an Instance Using Webex ID
webex.initialize { isLoggedIn in
if isLoggedIn {
print("User is authorized")
} else {
authenticator.authorize(parentViewController: self) { result in
if result == .success {
print("Login successful")
} else {
print("Login failed")
}
}
// ...
Firepower
Firepower Threat Defense (FTD) is the Cisco next-generation firewall (NGFW) solution,
providing not only a traditional L3/L4 security policy, but also L7 application inspection
and a comprehensive IDS/IPS, leveraging the Cisco Snort acquisition. These devices can be
managed individually, using Firepower Device Management (FDM), or using a central man-
agement platform called Firepower Management Center (FMC). Although both tools can be
used, they cannot be used concurrently (the use of FDM removes the ability to manage the
device using FMC and vice versa). Additionally, when opting to use FDM rather than FMC, the management functions (and the APIs supporting them) are reduced in scope and number. As such, this discussion focuses on the FMC APIs.
The REST APIs provided within FMC cover the complete operations and management of the
devices under FMC, the deployed security policy, and configuration objects.
Step 1. Log in to the FMC WebUI using your username and password.
Step 2. Navigate through System > Configuration > REST API Preferences > Enable
REST API.
Step 3. Click the Enable REST API check box, as shown in Figure 16-10.
The DevNet team offers several labs focused on exploring and utilizing the FMC REST
API. There is also a reservable FMC Sandbox available for learning and testing the API. This
Sandbox provides a view into the FMC UI as well as the APIs, while ensuring security and
preventing the Sandbox from being sabotaged.
Another helpful resource is the Postman collection, which allows you to import all public
API resources into a common utility that is popular with software developers. The Postman
collection can be accessed and imported into your Postman environment at https://
www.postman.com/ciscodevnet/workspace/cisco-devnet-s-public-workspace/collection/
8697084-bf06287b-a7f3-4572-a4d5-84f1c652109a?ctx=documentation (or following the
QR code in the “References” section at the end of this chapter) and clicking the Cisco Secure
Firewall Management collection, as seen in Figure 16-12. You can then fork this collection
into your personal collection for editing and use.
You should set up a Python virtual environment (even though the only library required for accessing the FMC is requests). Example 16-8 demonstrates the steps required to create this virtual environment.
Example 16-8 Creating a Python Virtual Environment to Access the FMC REST API
After creating the virtual environment, you can create the code to gather the authorization
token for the FMC, as shown in Example 16-9.
Example 16-9 Python Code for FMC API Authorization
import argparse
import json
import requests


def get_token(ip, path, user, password):
    # The FMC returns the auth tokens (and the domain UUID) in the
    # response headers of the generatetoken request
    try:
        r = requests.post(f"https://{ip}/{path}", auth=(f"{user}",
                          f"{password}"), verify=False)
    except requests.exceptions.HTTPError as e:
        raise SystemExit(e)
    except requests.exceptions.RequestException as e:
        raise SystemExit(e)
    wanted = ("X-auth-access-token", "X-auth-refresh-token", "DOMAIN_UUID")
    return {key: r.headers.get(key) for key in wanted}


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        formatter_class=argparse.RawDescriptionHelpFormatter)
    parser.add_argument("user", type=str, help="Valid FMC Username")
    parser.add_argument("password", type=str, help="Valid FMC Password")
    parser.add_argument("ip_address", type=str, help="IP of FMC")
    args = parser.parse_args()
    user = args.user
    password = args.password
    ip = args.ip_address

    token_path = "/api/fmc_platform/v1/auth/generatetoken"

    header = get_token(ip, token_path, user, password)
This code gathers the appropriate headers from the response and stores them as a header
variable for use in subsequent requests against the API. You need to use this header value
to gather additional information against the FMC API. You do so by appending the sample
code shown in Example 16-10 to the request in Example 16-9.
Example 16-10 Gathering FMC Version Using the REST API and Python
version_path = "/api/fmc_platform/v1/info/serverversion"

try:
    r = requests.get(f"https://{ip}/{version_path}", headers=header,
                     verify=False)
except requests.exceptions.HTTPError as e:
    raise SystemExit(e)
except requests.exceptions.RequestException as e:
    raise SystemExit(e)

try:
    print(json.dumps(r.json(), indent=2))
except Exception as e:
    raise SystemExit(e)
When this code is combined and run, output like that shown in Example 16-11 is received.
Keep in mind that the exact output will vary depending on the version of code running on
the FMC within your environment.
Example 16-11 Output from Gathering the FMC Version Using the REST API
{
"links": {
"self": "https://198.19.10.120/api/fmc_platform/v1/info/
serverversion?offset=0&limit=1"
},
"items": [
{
"serverVersion": "6.7.0 (build 65)",
"geoVersion": "2021-01-25-002",
"vdbVersion": "build 340 ( 2020-12-16 00:13:46 )",
"sruVersion": "2021-01-20-001-vrt",
"type": "ServerVersion"
}
],
"paging": {
"offset": 0,
"limit": 1,
"count": 1,
"pages": 1
}
}
Using the header variable passed in with each GET request does work (otherwise, the code shown in Example 16-10 would not have returned the FMC version shown in Example 16-11). However, it is a generally accepted practice to leverage the .session() method within the requests library to maintain a persistent HTTP session instead of creating a new request with each new URI. With a session, you define the authentication headers once and reuse them, which removes the need to send the headers with each request, improves performance, and reduces duplicated code within the script.
Example 16-12 Using the .session() Method to Query Multiple API Endpoints on the
FMC Using Python
import argparse
import json
import requests


def get_token(fmcIP, path, user, password):
    # The FMC returns the auth tokens and the domain UUID in the
    # response headers of the generatetoken request
    try:
        r = requests.post(f"https://{fmcIP}/{path}", auth=(f"{user}",
                          f"{password}"), verify=False)
    except requests.exceptions.HTTPError as e:
        raise SystemExit(e)
    except requests.exceptions.RequestException as e:
        raise SystemExit(e)
    wanted = ("X-auth-access-token", "X-auth-refresh-token", "DOMAIN_UUID")
    return {key: r.headers.get(key) for key in wanted}


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        formatter_class=argparse.RawDescriptionHelpFormatter)
    parser.add_argument("user", type=str, help="Valid FMC Username")
    parser.add_argument("password", type=str, help="Valid FMC Password")
    parser.add_argument("ip_address", type=str, help="IP of FMC")
    args = parser.parse_args()
    user = args.user
    password = args.password
    ip = args.ip_address

    token_path = "/api/fmc_platform/v1/auth/generatetoken"
    version_path = "/api/fmc_platform/v1/info/serverversion"

    header = get_token(ip, token_path, user, password)
    UUID = header["DOMAIN_UUID"]
    device_path = (f"/api/fmc_config/v1/domain/{UUID}/devices/"
                   "devicerecords?expanded=True")

    sess = requests.Session()
    sess.headers.update({'X-auth-access-token': header["X-auth-access-token"],
                         'X-auth-refresh-token': header["X-auth-refresh-token"]})

    try:
        resp1 = sess.get(f"https://{ip}/{version_path}", verify=False)
        resp2 = sess.get(f"https://{ip}/{device_path}", verify=False)
    except requests.exceptions.HTTPError as e:
        raise SystemExit(e)
    except requests.exceptions.RequestException as e:
        raise SystemExit(e)

    try:
        print(json.dumps(resp1.json(), indent=2))
        print()
        print("*************")
        print()
        print(json.dumps(resp2.json(), indent=2))
    except Exception as e:
        raise SystemExit(e)
By adding the session object and updating its headers, you are able to define the headers to be sent toward the API endpoint only once. These values are then used for all subsequent requests made through that requests.Session() instance, without declaring explicit header
values each time. Multiple requests can use the same instance and store the response within
different variables to be accessed later. The results of the script look like Example 16-13
(depending on devices connected to the FMC).
Example 16-13 Responses from Querying the FMC Version and Managed Device APIs
{
"links": {
"self": "https://198.19.10.120/api/fmc_platform/v1/info/
serverversion?offset=0&limit=1"
},
"items": [
{
"serverVersion": "6.7.0 (build 65)",
"geoVersion": "2021-01-25-002",
"vdbVersion": "build 340 ( 2020-12-16 00:13:46 )",
"sruVersion": "2021-01-20-001-vrt",
"type": "ServerVersion"
}
],
"paging": {
"offset": 0,
"limit": 1,
"count": 1,
"pages": 1
}
}
*************
{
"links": {
"self": "https://198.19.10.120/api/fmc_config/v1/domain/
e276abec-e0f2-11e3-8169-6d9ed49b625f/devices/devicerecords?o
ffset=0&limit=3&expanded=True"
},
"items": [
{
"id": "15c6c338-4f6e-11eb-845c-b89f5b28ebea",
"type": "Device",
"links": {
"self": "https://198.19.10.120/api/fmc_config/v1/domain/
e276abec-e0f2-11e3-8169-6d9ed49b625f/devices/devicerecor
ds/15c6c338-4f6e-11eb-845c-b89f5b28ebea"
},
"name": "NGFWBR1",
"description": "NOT SUPPORTED",
"model": "Cisco Firepower Threat Defense for VMWare",
"modelId": "A",
"modelNumber": "75",
"modelType": "Sensor",
"healthStatus": "green",
"sw_version": "6.7.0",
"healthPolicy": { 16
"id": "c253737c-2b73-11eb-960a-bf3ad063ddd4",
"type": "HealthPolicy",
"name": "Initial_Health_Policy 2020-11-20 21:02:54"
},
"accessPolicy": {
"name": "Branch Access Control Policy",
"id": "00505697-87b7-0ed3-0000-034359738978",
"type": "AccessPolicy"
},
"advanced": {
"enableOGS": false
},
"hostName": "ngfwbr1.dcloud.local",
"license_caps": [
"THREAT",
"MALWARE",
"URLFilter"
],
"keepLocalEvents": false,
"prohibitPacketTransfer": false,
"ftdMode": "ROUTED",
"metadata": {
"readOnly": {
"state": false
},
"inventoryData": {
"cpuCores": "1 CPU (4 cores)",
"cpuType": "CPU Pentium II 2700 MHz",
"memoryInMB": "8192"
},
"deviceSerialNumber": "9A1V2FD3L7A",
"domain": {
"name": "Global",
"id": "e276abec-e0f2-11e3-8169-6d9ed49b625f",
"type": "Domain"
},
"isPartOfContainer": false,
"isMultiInstance": false,
"snortVersion": "2.9.17 (Build 200 - daq12)",
"sruVersion": "2021-01-20-001-vrt",
"vdbVersion": "Build 340 - 2020-12-16 00:13:46"
}
},
<...output omitted for brevity...>
For further exploration of the FMC API, Cisco DevNet provides several learning modules
dedicated to Firepower APIs, including Threat Defense, Device Manager, Management
Center, and others. You can find these modules at https://developer.cisco.com/learning/
modules?keywords=firepower.
Meraki
Cisco’s acquisition of Meraki in 2012 has proven to be popular with those desiring a cloud-
managed solution of software-defined WAN devices, switches, wireless access points,
security, and IoT solutions. Even with simplified management and monitoring, the Meraki
solutions have a robust set of APIs and SDKs that are appealing to companies looking to
extract even more functionality from their cloud-managed assets.
The APIs span the gamut of provisioning functions, monitoring and performance values, and inventory, whether your perspective is at an organization, network, or device level. The API also provides an architecture that reflects product types: Appliance, Camera, Cellular Gateway, Switch, Wireless, Insight (traffic analytics), and Sm (Systems Manager, a mobile device manager).
The APIs and Python SDK are useful for custom application development, guest Wi-Fi
insights, location services, and sensor management.
Step 2. Navigate through Organization > Settings down to the Dashboard API access
section.
Step 3. Ensure the Enable Access to the Cisco Meraki Dashboard API option is
selected, as shown in Figure 16-14.
Within each category, further Configure and Monitor subdivisions allow for specific func-
tions related to provisioning and management.
Now the environment is ready for the follow-on SDK examples. Note the authorization
methods available in the next section.
Meraki Authorization
The Meraki Dashboard API requires a REST request header parameter, X-Cisco-Meraki-
API-Key, for authorization in each request.
The JSON representation of this parameter is
{
  "X-Cisco-Meraki-API-Key": "<MERAKI_DASHBOARD_API_KEY>"
}
If you use the curl command-line tool, it would appear as
curl https://api.meraki.com/api/v1/<request_resource> \
-H 'X-Cisco-Meraki-API-Key: <MERAKI_DASHBOARD_API_KEY>'
The Python SDK method, without follow-on processing, would appear as
import meraki
Dashboard = meraki.DashboardAPI()
The newer and current Dashboard API v1 supports bearer authentication using the authori-
zation header parameter. This is generally preferred over using bespoke header parameters.
The JSON representation of this parameter is
{
  "Authorization": "Bearer <MERAKI_DASHBOARD_API_KEY>"
}
If you use the curl command-line tool, it would appear as
curl https://api.meraki.com/api/v1/<request_resource> \
-H 'Authorization: Bearer <MERAKI_DASHBOARD_API_KEY>'
To find your organization ID, you can poll the organizations resource:
GET https://api.meraki.com/api/v1/organizations
This poll should result in output like that shown in Figure 16-19 using Postman.
From this output, you can see the organization name DevNet Sandbox has the ID 549236.
Knowing and retaining your specific organization ID helps make Meraki polling more efficient and specific.
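If you prefer the Python SDK for this lookup, a minimal sketch (assuming the MERAKI_DASHBOARD_API_KEY environment variable is exported, as shown next) lists the organizations visible to your key so you can note the ID:

import meraki

# DashboardAPI() reads MERAKI_DASHBOARD_API_KEY from the environment
dashboard = meraki.DashboardAPI()

for org in dashboard.organizations.getOrganizations():
    print(org["id"], org["name"])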
The following simple script extracts the Meraki devices in a cloud-managed environment. You start by placing the Meraki API key in a shell environment variable, which is more secure than storing it in a Python script variable.
export MERAKI_DASHBOARD_API_KEY=1234567890…
Then you create a new Python script in a virtual environment:
vi GetMerakiDevices.py
Next, you create the contents of the file as shown in Example 16-15.
Example 16-15 Python Script to Extract Meraki Devices in the Organization
import meraki
import pprint
pp = pprint.PrettyPrinter(indent=4)
organization_id = '549236'
dashboard = meraki.DashboardAPI()
my_devices = dashboard.organizations.getOrganizationDevices(organization_id,
total_pages='all')
pp.pprint(my_devices)
Executing the script provides a structured output view of devices, as shown in Example 16-16.
Example 16-16 Execution and Output of GetMerakiDevices.py
{ 'address': '',
'configurationUpdatedAt': '2021-11-29T17:57:48Z',
'firmware': 'Not running configured version',
'lanIp': None,
'lat': 37.****951010362,
'lng': -122.****31723022,
'mac': 'e0:55:3d:10:**:**',
'model': 'MR84',
'name': '',
'networkId': 'L_64682949648*****',
'notes': '',
'productType': 'wireless',
'serial': 'Q2EK-****-****',
'tags': [],
'url': 'https://n149.meraki.com/DNSMB1-wireless/n/XhipOdvc/manage/nodes/
new_list/246656701****'},
{ 'address': '',
'configurationUpdatedAt': '2021-11-28T16:21:51Z',
'firmware': 'Not running configured version',
'lanIp': None,
'lat': 34.****4579899,
'lng': -117.****867527,
'mac': '34:56:fe:a2:**:**',
'model': 'MV12WE',
'name': '',
'networkId': 'L_64682949648*****',
'notes': '',
'productType': 'camera',
'serial': 'Q2FV-****-****',
'tags': [],
'url': 'https://n149.meraki.com/DNENT1-vxxxogmai/n/o4tztdvc/manage/nodes/
new_list/575482438****'
/*** Abbreviated output ***/
]
(.venv) myserver MyProject %
For other SDK (and API) insights, you can find the interactive Meraki API documentation at
https://developer.cisco.com/meraki/api-v1/#!get-device.
It serves as a great resource for planning your Python development strategy on Meraki
equipment.
If you’re looking to get deep into the guts of the SDK, the project is maintained on GitHub
at https://github.com/meraki/dashboard-api-python.
Specifically, this directory shows a lot of detail on the API methods: https://github.com/
meraki/dashboard-api-python/tree/master/meraki/api.
Intersight
Intersight is the Cisco cloud-enabled management platform for on-premises compute. Rather
than relying on local connectivity to the server management portal (UCSM or CIMC), Inter-
sight enables remote, API-enabled capabilities via secured connections between devices
and the Intersight cloud. Although Intersight provides a powerful management platform for
server infrastructure, enabling ease of logging, proactive TAC/RMA cases, and hardware
compatibility lists based on the OS/hypervisor running on the server, it also allows adminis-
trators and operators to utilize purpose-built applications powered by the cloud to manage
and operate their server infrastructure in a faster, orchestrated manner.
Because Intersight is API-driven, you can use a robust set of REST APIs in lieu of the
graphical web interface. You can use these APIs to interact with not only the Cisco hard-
ware controlled by Intersight but also the management of Kubernetes clusters, workflow
orchestration, and third-party hardware platforms connected to Intersight. Additionally, the
Intersight APIs are designed using the Open API Specification version 3 (OASv3) standard,
enabling them to be easily translated into SDKs for various languages, as well as modules and
providers for third-party automation and orchestration tools, such as Ansible and Terraform.
Step 3. Click the API Keys menu item toward the bottom of the account management
window. This brings up the API key management screen, where you can create a
new API key by clicking the Generate API Key button.
Step 4. Within the resulting dialog box, you provide a name for the API key pair, as
well as which version of key you want to generate. At the time of this writing,
the OASv2 schema will suffice for most users, though you can create both key
types within the account (see Figure 16-22). The "legacy" version uses RSA private keys, whereas the newer version uses elliptic curve private keys.
Figure 16-21 shows the generation of the API keys, and Figure 16-22 shows all
generated API keys for that user account.
Figure 16-22 Viewing Generated Intersight API Keys for a User Account
If a previous API key and secret have been generated, there is no way to recover
them if they are not documented. In this instance, a new API key and secret
need to be created prior to using the API or SDK.
Step 5. If you generated a new key, remember to store it in your secure password reposi-
tory. If you do a lot of development with API keys, it's inefficient to regener-
ate keys and impact several applications because of unrecorded credentials, so
develop a discipline of secure credential storage. Ensure that both the API key and
secret are kept and labeled together because API keys and secrets are generated in
pairs and cannot be mismatched with other generated key/secret pairs.
Figure 16-23 Adding api-key and secret-key Values to the Postman Collection for Intersight
Intersight must have devices added in order for you to manage them. If you’re reserving the
DevNet Sandbox for Intersight, which contains a pair of UCS Platform Emulators (UCSPEs),
these devices should be claimed in order to receive data into Intersight, which requires the
device’s DeviceID and claim code. You can find them within the UCSPE by clicking the
Admin menu on the left side of the screen and then Device Connector at the bottom of the
submenu screen, as shown in Figure 16-24. The time-sensitive claim code is refreshed every
five minutes.
Figure 16-25 Claiming UCS Manager Instances Using the Postman Collection for Intersight
The rest of the Postman collection is focused on various tasks, including activities around
server profiles, vMedia policies, boot policies, and NTP policies. The collection is not
exhaustive, especially because the capabilities within Intersight have been under constant
development and some features require additional licensing, but it should suffice when you’re
getting familiar with the REST API of the platform.
Intersight Authorization
Authorization requirements using the Python SDK are like those required for using the
Postman collection: everything is derived from the API key and secret key. However, you cannot authenticate by simply passing the key and secret as credentials; each request must be signed (as evidenced by the Pre-request Script within the Intersight Postman collection). Both the v2 and v3 API key formats can be used to authenticate to the Intersight cloud.
the CiscoDevNet organization (https://github.com/ciscodevnet/intersight-python) has a
sample script to authenticate to the Intersight API. This script can be added to Python code
as a function or created as a standalone Python file that can be imported into a subsequent
Python script. Example 16-18 provides sample Intersight authorization code that can handle
both types of API keys generated from the Intersight portal.
Example 16-18 Intersight Authorization Python Code
import intersight
import re


def get_api_client(api_key_id, api_secret_file,
                   endpoint="https://intersight.com"):
    # Detect whether the private key is a v2 (RSA) or v3 (EC) API key and
    # select the matching signing options
    with open(api_secret_file, 'r') as f:
        api_key = f.read()

    if re.search('BEGIN RSA PRIVATE KEY', api_key):
        # API key v2 format
        signing_algorithm = intersight.signing.ALGORITHM_RSASSA_PKCS1v15
        signing_scheme = intersight.signing.SCHEME_RSA_SHA256
        hash_algorithm = intersight.signing.HASH_SHA256
    elif re.search('BEGIN EC PRIVATE KEY', api_key):
        # API key v3 format
        signing_algorithm = intersight.signing.ALGORITHM_ECDSA_MODE_DETERMINISTIC_RFC6979
        signing_scheme = intersight.signing.SCHEME_HS2019
        hash_algorithm = intersight.signing.HASH_SHA256

    configuration = intersight.Configuration(
        host=endpoint,
        signing_info=intersight.signing.HttpSigningConfiguration(
            key_id=api_key_id,
            private_key_path=api_secret_file,
            signing_scheme=signing_scheme,
            signing_algorithm=signing_algorithm,
            hash_algorithm=hash_algorithm,
            signed_headers=[
                intersight.signing.HEADER_REQUEST_TARGET,
                intersight.signing.HEADER_HOST,
                intersight.signing.HEADER_DATE,
                intersight.signing.HEADER_DIGEST,
            ]
        )
    )
    return intersight.ApiClient(configuration)
This code handles the computations that occurred in the Pre-request Script within the Post-
man collection, showing the value of an SDK because the SDK can abstract API complexity
away from a user, allowing the user to focus on the end result of the code. Adding additional
code to apply or retrieve configuration becomes somewhat trivial, given how the SDK is
organized. Referencing the API documentation on intersight.com and the samples within the
GitHub repository, you can make some assumptions within the code to achieve the desired
outcomes. Assuming that you want to receive all alarms within Intersight, you can start by
looking at the API docs. You can see that this falls inside the cond/Alarms API path, shown
in Figure 16-26.
Example 16-19 Gathering Intersight Alarm Status Using the Intersight Python SDK
import intersight
import os
import re
from intersight.api import cond_api
from pprint import pprint
# This block belongs inside the get_api_client() authorization helper shown in
# Example 16-18, which also selects the signing options based on the key type
configuration = intersight.Configuration(
host=endpoint,
signing_info=intersight.signing.HttpSigningConfiguration(
key_id=api_key_id,
private_key_path=api_secret_file,
signing_scheme=signing_scheme,
signing_algorithm=signing_algorithm,
hash_algorithm=hash_algorithm,
signed_headers=[
intersight.signing.HEADER_REQUEST_TARGET,
intersight.signing.HEADER_HOST,
intersight.signing.HEADER_DATE,
intersight.signing.HEADER_DIGEST,
]
)
)
return intersight.ApiClient(configuration)
api_key = os.environ.get('INTERSIGHT_KEY')
api_key_file = "/Users/qsnyder/Downloads/SecretKey.txt.old.txt"
api_client = get_api_client(api_key, api_key_file)  # authorization helper from Example 16-18
api_instance = cond_api.CondApi(api_client)
try:
api_response = api_instance.get_cond_alarm_list()
pprint(api_response)
except intersight.ApiException as e:
print("Exception calling alarm list: %s\n" % e)
The output of this code is lengthy and includes a lot of superfluous information regarding
the alarms seen by Intersight that may or may not be appropriate for the given application.
Looking through the examples within the GitHub README, you can see that selectors
are supported, allowing you to parse the resulting query without needing to run through
the machinations of JSON parsing using keys and values. By changing the code to include
query selectors, you can filter the information returned as part of the response, as shown in
Example 16-20.
Example 16-20 Using Query Selectors Within the Intersight SDK
import intersight
import re
from intersight.api import cond_api
from pprint import pprint
# This block belongs inside the get_api_client() authorization helper shown in
# Example 16-18, which also selects the signing options based on the key type
configuration = intersight.Configuration(
host=endpoint,
signing_info=intersight.signing.HttpSigningConfiguration(
key_id=api_key_id,
private_key_path=api_secret_file,
signing_scheme=signing_scheme,
signing_algorithm=signing_algorithm,
hash_algorithm=hash_algorithm,
signed_headers=[
intersight.signing.HEADER_REQUEST_TARGET,
intersight.signing.HEADER_HOST,
intersight.signing.HEADER_DATE,
intersight.signing.HEADER_DIGEST,
]
)
)
return intersight.ApiClient(configuration)
api_key = "5c1c24e273766a3634d3f8d0/5c1c242e73766a3634d3f02e/61a927537564612d33b9d
2be"
api_key_file = "/Users/qsnyder/Downloads/SecretKey.txt.old.txt"
api_client = get_api_client(api_key, api_key_file)  # authorization helper from Example 16-18
api_instance = cond_api.CondApi(api_client)
query_selector="CreationTime,Description"
try:
api_response = api_instance.get_cond_alarm_list(select=query_selector)
pprint(api_response)
except intersight.ApiException as e:
print("Exception calling alarm list: %s\n" % e)
The results are filtered to include only the creation time and the description of the alarm.
This data is still incredibly raw, as you can see in Example 16-21, and can benefit from for-
matting the date, for instance, but it illustrates the query selection available within the SDK.
Example 16-21 Raw Returned Response from the Use of Query Selectors
{'class_id': 'cond.Alarm',
'creation_time': datetime.datetime(2021, 12, 2, 17, 21, 8, 115000, tzinfo=tzutc()),
'description': 'Connection to Adapter 1 eth interface 1 in '
'server 4 missing',
'moid': '61a9a2a265696e2d3216210a',
'object_type': 'cond.Alarm'},
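To make that raw output more readable, for example by formatting the date as the preceding paragraph suggests, you can post-process the returned objects. This short sketch assumes api_response holds the object returned by get_cond_alarm_list() in Example 16-20 and that each result exposes the creation_time and description attributes shown above:

# Post-process the alarm list returned by the SDK call in Example 16-20
for alarm in api_response.results:
    stamp = alarm.creation_time.strftime("%Y-%m-%d %H:%M:%S %Z")
    print(f"{stamp}  {alarm.description}")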
UCS Manager
Cisco Unified Computing System Manager (UCS Manager or UCSM) provides a single point
of management for all UCS form factors, including blades and rack-mount servers. By con-
necting the devices to the UCS Fabric Interconnects, you can centralize device profiles, net-
work configuration, and systems management through a single plane. UCSM configuration
is provided through a well-documented model, known as the Management Information Tree
(MIT). This model exposes all configuration objects available within the UCSM UI through
API endpoints.
While the UI and capabilities have been expanded since its initial release in 2009, UCSM’s
APIs were designed during a time in which “unspoken standards” hadn’t been fully decided,
so if you have used OAS-compliant APIs, using UCSM may feel a bit clunky because the
data returned is only available via XML. However, a fully documented API explorer tool,
called Visore, is included on each UCSM instance. Additionally, several different tools
available to interact with the UCSM API don’t require complex parsing of XML, but sim-
ply knowledge of Python or PowerShell. Ansible modules are also available for UCSM via
Ansible Galaxy.
Visore provides a browsable view into the MIT, the model upon which all UCS interactions (via API or UI) are built. Figure 16-27 shows the Visore landing page when you first access the tool.
Each object page in Visore includes hierarchy tables, which document what the parent MO of this object is and which MOs are children of this object, both of which are provided in a tree-style format. This information allows you to understand the distinguished name (DN) path to access each configuration item within the UCS MIT. Below the hierarchy tables is the properties table, which is what is output when this specific MO is queried. This page is shown in Figure 16-29.
As a result, you can quickly understand and develop queries to glean the desired information
from the API. Figure 16-30 illustrates the XML required to generate a query for the
topSystem MO.
This request generates the same data as seen in the Visore viewer, but with XML tag
delineation:
<configResolveClass cookie="999f4f8e-d32a-442c-b108-
2084603ea17c" response="yes" classId="topSystem"> <outCon-
figs> <topSystem address="192.168.111.7" currentTime="2021-12-
07T04:18:08.080" descr="" dn="sys" ipv6Addr="::" mode="cluster"
name="UCSPE-192-168-111-7" owner="" site="" systemUp-
Time="00:09:20:19"/> </outConfigs> </configResolveClass>
The XML can be difficult to read at times, especially when it is all placed together on a
single line and especially if the MO query includes many child objects and classes of the
parent. However, when you use command-line tools, you are able to format the resulting
response without having to either save it locally or copy/paste to another tool. By piping the
output of the response to xmllint, you can print formatted output directly to stdout, similar
to what is shown in Example 16-22.
Example 16-22 Formatting UCSM XML Responses Using xmllint
This same process can be followed for all classes and MOs available within the UCSM API.
It is possible to translate the cURL requests from the CLI to be used in Postman and have
Postman store the cookie as an environment variable to be used for subsequent requests. To
do so, follow these steps:
Figure 16-31 Placeholder Variable Created Within Postman for the UCSM Cookie
Step 3. Create a new POST request, pointing to the API entrypoint address (http://UCSM-IP/nuova). Add the same XML payload to include the username and
password of the UCSM, in XML format, as in Figure 16-32.
postman.setEnvironmentVariable("cookie", withoutTime);
Figure 16-33 Creating the Test to Gather the UCSM API Cookie from the Response Payload
Step 5. After you save the request, you can send it toward the UCSM. When the
response is received, the full XML body is displayed within the response win-
dow, but the cookie value is stored within the environment (you can confirm
this by clicking the Environment Quick Look button next to the environment
name in the upper right of the Postman window).
Step 6. The final step is to transpose the topSystem request to Postman, similar to what is
shown in Figure 16-34. You do this by creating a new POST request to the same
URL as the login (remember, UCSM does not have unique entrypoints within the
API; the returned data is determined by the class and classID within the XML
payload). The value for cookie is then replaced with the newly created cookie vari-
able within the UCS Postman environment. The response from the request is for-
matted similarly to the xmllint formatted request from the cURL command line.
Figure 16-34 Using the Newly Gathered UCSM API Cookie to Query the topSystem MO
The DevNet team offers several different Learning Labs that enable you to learn and use the
UCSM API and SDKs that accompany the same Sandbox used for Intersight exploration.
Using this singular Sandbox, you can explore both the original UCSM platform and the
next-generation compute management system, Intersight.
Using a recent version of Python (3.9.7 in this example), you can set up an environment to
support the UCSM SDK. If you use a Python virtual environment, the process would be like
that shown in Example 16-23.
Example 16-23 Creating a Python Virtual Environment for Accessing the UCSM API
When the SDK is installed, it may be beneficial to explore the SDK using the Python REPL
(or bpython package) to understand how objects are created and accessed. You can enter the
code to connect to a UCSM as shown in Example 16-24.
Example 16-24 Accessing the UCSM SDK from bpython
This code connects the SDK to the UCSM instance. When it is connected, you can begin to
explore simple queries of MOs within the MIT. Carrying forward the topSystem example,
you can explore several ways to access the information from this MO. The SDK supports
accessing objects by classID, like you did using the XML queries. You do this using the
query_classid() method, as shown in Example 16-25.
Example 16-25 Querying the topSystem MO Using the UCSM SDK in bpython
>>> print(classObject[0].name)
UCSPE-10-10-20-40
Note that the object created is of class list and needs to be indexed to correctly print the
output. When the root of the object is accessed, the full table of information is presented in
a much easier-to-digest form than the XML payload from the cURL/Postman calls. Additionally, any individual value can be accessed through the attribute named for the corresponding MO property (such as .name to gather hostname information).
The UCSM SDK also supports accessing MOs by the distinguished name (DN) path within
the MIT. Recall Figure 16-28, which is a snapshot of the Visore page for the topSystem
classID, or the output from Example 16-25. Within this table, there is a reference for the
DN, the unique path from the top level of the MIT to the specific MO. If you choose to
access the topSystem object by DN using the SDK, the code would look like that shown in
Example 16-26.
>>> print(dnObject.name)
UCSPE-10-10-20-40
This example looks similar to queries done by classID, but the resulting output object does
not need to be indexed; it is ready to be accessed in full or through a method to gather the
specific values needed. To carry this example one level deeper, you can query a specific
computeRackUnit by DN. You can utilize Visore to gather the DN and query it, with results
similar to Example 16-27.
Example 16-27 Querying a Specific UCS Server by DN Using the UCSM SDK
availability :available
available_memory :49152
check_point :discovered
child_action :None
conn_path :A,B
conn_status :A,B
descr :
discovery :complete
discovery_status :
dn :sys/rack-unit-10
enclosure_id :0
fan_speed_config_status :
fan_speed_policy_fault :no
flt_aggr :2
fsm_descr :
fsm_flags :
fsm_prev :DiscoverSuccess
fsm_progr :100
fsm_rmt_inv_err_code :none
fsm_rmt_inv_err_descr :
fsm_rmt_inv_rslt :
fsm_stage_descr :
fsm_stamp :2021-12-02T17:24:55.696
fsm_status :nop
fsm_try :0
id :10
int_id :86468
kmip_fault :no
kmip_fault_description :
lc :discovered
lc_ts :1970-01-01T00:00:00.000
local_id :
low_voltage_memory :not-applicable
managing_inst :A
memory_speed :not-applicable
mfg_time :not-applicable
model :HX220C-M5SX
name :
num_of40_g_adaptors_with_old_fw :0
num_of40_g_adaptors_with_unknown_fw:0
num_of_adaptors :2
num_of_cores :16
num_of_cores_enabled :16
num_of_cpus :2
num_of_eth_host_ifs :0
num_of_fc_host_ifs :0
num_of_threads :16
oper_power :off
oper_pwr_trans_src :unknown
oper_qualifier :
oper_qualifier_reason :N/A
oper_state :unassociated
operability :operable
original_uuid :1b4e28ba-2fa1-11d2-e00a-b9a761bde3fb
part_number :
physical_security :chassis-open
policy_level :0
policy_owner :local
presence :equipped
revision :0
rn :rack-unit-10
sacl :None
serial :RK93
server_id :10
slot_id :0
status :None
storage_oper_qualifier :unknown
total_memory :49152
usr_lbl :
uuid :1b4e28ba-2fa1-11d2-e00a-b9a761bde3fb
vendor :Cisco Systems Inc
version_holder :no
veth_status :A,B
vid :0
You can see that this method of accessing the API provides a much cleaner interface for
gathering information and selecting specific pieces of information from the API. Finally,
because the SDK opens a persistent session connection to the UCSM, it is good program-
matic practice to close the connection before exiting the REPL:
>>> ucs_session.logout()
True
>>> exit()
Before you can use the UCS PowerTool, PowerShell itself must be installed. Microsoft documents
the installation process at the following URLs for your system OS. Each URL is also available
in QR code format in the “References” section at the end of this chapter:
■ Linux: https://docs.microsoft.com/en-us/powershell/scripting/install/installing-
powershell-on-linux?view=powershell-7.2
■ macOS: https://docs.microsoft.com/en-us/powershell/scripting/install/installing-
powershell-on-macos?view=powershell-7.2
■ Windows: https://docs.microsoft.com/en-us/powershell/scripting/install/installing-
powershell-on-windows?view=powershell-7.2
It is also possible to run PowerShell as a container within the Docker runtime. This capability
provides portability across systems without worrying about installation permissions (assum-
ing Docker is installed). Although this method is demonstrated in the following examples,
the steps performed after the container is running apply to any operating system.
You can download the (Linux-based) PowerShell core container from Microsoft’s DockerHub
page and run it in interactive mode by using the process shown in Example 16-28.
Example 16-28 Pulling the Docker PowerShell Container
~ » docker pull mcr.microsoft.com/powershell
Using default tag: latest
latest: Pulling from powershell
7b1a6ab2e44d: Pull complete
f738ed9de711: Pull complete
68dfbdc8ea02: Pull complete
501a0230e302: Pull complete
Digest: sha256:bbf28e97eb6ecfcaa8b1e80bdc2700b443713f7dfac3cd648bfd3254007995e2
Status: Downloaded newer image for mcr.microsoft.com/powershell:latest
mcr.microsoft.com/powershell:latest
~ » docker run -it mcr.microsoft.com/powershell
https://aka.ms/powershell
Type 'help' to get help.
PS />
When the container is running in interactive mode, the process to install the UCS PowerTool
is the same within the container as it would be in a bare-metal OS, as seen in Example 16-29.
Example 16-29 Installing the UCS PowerTool into the Running Docker PowerShell
Container
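Assuming the UCS PowerTool module name Cisco.UCSManager in the PowerShell Gallery, the installation command looks like the following, which then produces the repository prompt shown next:
PS /> Install-Module -Name Cisco.UCSManager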
Untrusted repository
You are installing the modules from an untrusted repository. If you trust this
repository, change its InstallationPolicy value by running the Set-PSRepository
cmdlet. Are you sure you want to install the modules from 'PSGallery'?
[Y] Yes [A] Yes to All [N] No [L] No to All [S] Suspend [?] Help (default is
"N"): y
After the prompt to install the module, you may be prompted to accept several different
license agreements. After you accept them, you can use the UCS PowerTool to work with
the UCSM API. First, you must import the module, and then you must establish a connec-
tion to the UCSM, as shown in Example 16-30.
Example 16-30 Connecting to a UCSM Instance Using UCS PowerTool
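A minimal sketch of the connection, assuming the Cisco.UCSManager module and the Connect-Ucs cmdlet with interactively supplied credentials, looks like this:
PS /> Import-Module Cisco.UCSManager
PS /> $cred = Get-Credential
PS /> Connect-Ucs -Name 10.10.20.40 -Credential $cred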
NumPendingConfigs : 0
Ucs : UCSPE-10-10-20-40
Cookie : 1638934478/e08f56e8-f796-49c8-8496-e4760882c6d0
Domains : org-root
LastUpdateTime : 12/8/2021 3:35:28 AM
Name : 10.10.20.40
NoSsl : False
NumWatchers : 0
Port : 443
Priv : {aaa, admin, ext-lan-config, ext-lan-policy…}
PromptOnCompleteTransaction : False
Proxy :
RefreshPeriod : 600
SessionId :
TransactionInProgress : False
Uri : https://10.10.20.40
UserName : ucspe
Version : 4.0(4c)
VirtualIpv4Address : 10.10.20.40
WatchThreadStatus : None
PS />
When you’re connected to the UCS, you can perform all standard CRUD operations.
However, to keep the examples comparable, continue gathering topSystem data. Gathering the
same information is simple because the PowerTool provides a cmdlet noun for most classIDs,
as shown in Example 16-31.
Example 16-31 Querying the topSystem MO Using UCS PowerTool
PS /> Get-UcsTopSystem
Address : 10.10.20.40
CurrentTime : 2021-12-08T03:37:04.720
Descr :
Ipv6Addr : ::
Mode : cluster
Name : UCSPE-10-10-20-40
Owner :
Sacl :
Site :
SystemUpTime : 05:10:29:44
Ucs : UCSPE-10-10-20-40
Dn : sys
Rn : sys
Status :
XtraProperty : {}
It’s also possible to extract the raw XML data, in addition to the formatted table of data, by
using the -Xml switch, with results similar to those shown in Example 16-32.
Example 16-32 topSystem Output in XML Format
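Based on the -Xml switch described in the text, the command would look like this:
PS /> Get-UcsTopSystem -Xml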
Address : 10.10.20.40
CurrentTime : 2021-12-08T03:38:58.207
Descr :
Ipv6Addr : ::
Mode : cluster
Name : UCSPE-10-10-20-40
Owner :
Sacl :
Site :
SystemUpTime : 05:10:31:38
Ucs : UCSPE-10-10-20-40
Dn : sys
Rn : sys
Status :
XtraProperty : {}
By piping the command output to the Select-Object cmdlet, you can print only the desired
fields of information (similar to accessing the specific attributes within the Python SDK):
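A sketch of the pipeline described in the text, with the property name assumed from the output that follows, looks like this:
PS /> Get-UcsTopSystem | Select-Object Name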
Name
----
UCSPE-10-10-20-40
It is also possible to query specific paths within the API. All the filters and query parameters
are accepted, and you can also filter by DN, much as you would to find a specific
compute rack unit within the overall computeRackUnit classID, as shown in Example 16-33.
Example 16-33 Accessing a Specific Server Instance Using the UCS PowerTool
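One way to retrieve a single MO by its DN (an assumption, not necessarily the exact cmdlet used in the original example) is the generic Get-UcsManagedObject cmdlet:
PS /> Get-UcsManagedObject -Dn sys/rack-unit-10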
AdminPower : policy
AdminState : in-service
AssetTag :
AssignedToDn :
Association : none
Availability : available
AvailableMemory : 49152
CheckPoint : discovered
ConnPath : {A, B}
ConnStatus : {A, B}
Descr :
Discovery : complete
DiscoveryStatus :
EnclosureId : 0
FanSpeedConfigStatus :
FanSpeedPolicyFault : no
Id : 10
KmipFault : no
KmipFaultDescription :
Lc : discovered
LcTs : 1970-01-01T00:00:00.000
LocalId :
LowVoltageMemory : not-applicable
ManagingInst : A
MemorySpeed : not-applicable
MfgTime : not-applicable
Model : HX220C-M5SX
Name :
NumOf40GAdaptorsWithOldFw : 0
NumOf40GAdaptorsWithUnknownFw : 0
NumOfAdaptors : 2
NumOfCores : 16
NumOfCoresEnabled : 16
NumOfCpus : 2
NumOfEthHostIfs : 0
NumOfFcHostIfs : 0
NumOfThreads : 16
OperPower : off
OperPwrTransSrc : unknown
OperQualifier :
OperQualifierReason : N/A
OperState : unassociated
Operability : operable
OriginalUuid : 1b4e28ba-2fa1-11d2-e00a-b9a761bde3fb
PartNumber :
PhysicalSecurity : chassis-open
PolicyLevel : 0
PolicyOwner : local
Presence : equipped
Revision : 0
Sacl :
Serial : RK93
ServerId : 10
SlotId : 0
StorageOperQualifier : unknown
TotalMemory : 49152
UsrLbl :
Uuid : 1b4e28ba-2fa1-11d2-e00a-b9a761bde3fb
Vendor : Cisco Systems Inc
VersionHolder : no
VethStatus : A,B
Vid : 0
Ucs : UCSPE-10-10-20-40
Dn : sys/rack-unit-10
Rn : rack-unit-10
Status :
XtraProperty : {}
Finally, when the interaction is complete, it is good programmatic practice to close the con-
nection with the UCSM:
PS /> Disconnect-Ucs
Ucs : UCSPE-10-10-20-40
InCookie : 1638937107/e815357b-1047-49ab-8620-27b6e78a41c7
Name : 10.10.20.40
OutStatus : success
SessionId :
Uri : https://10.10.20.40
Version : 4.0(4c)
DNA Center
The Cisco DNA Center solution is the controller for enterprise fabric deployments. It serves
as the follow-on management solution from Prime Infrastructure. DNA Center provides con-
figuration management, software image management, performance and fault monitoring, and
inventory tracking. Traditional route/switch and wireless inventory can be managed. DNA
Center serves as an on-premises management solution with open and extensible APIs that
complement your intent-based networking (IBN) requirements.
The DNA Center APIs span provisioning, monitoring, performance, and inventory functions
and are generally broken into Intent and Integration APIs, Event and Notification Webhooks,
and a multivendor SDK, as shown in Figure 16-35.
■ Path Trace: Provides flow analysis between two endpoints in the network
■ Tasks: Enables you to initiate activities via an API request, including status and com-
pletion results
QWxhZGRpbjpvcGVuIHNlc2FtZQ==
Therefore, the authorization HTTP header you use is
Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
Figure 16-42 Cisco DNA Center Postman Collection Inside Postman App
Several Postman environment variables are used. They need to be reconciled for the collec-
tion to be used. To do so, you create a new environment by clicking the Environments menu
option from the left navigation panel. Click the plus (+) icon in that fly-out menu option to
create a new environment. Fill in the environment settings as shown in Figure 16-43.
Figure 16-43 Postman Environment Settings for Cisco DNA Center on DevNet
Sandbox Lab
The settings in Figure 16-43 are for the DevNet Always On DNAC 2.2 Sandbox. Feel free to
use it for your education or create another environment with settings relevant to your DNA
Center. As you hover over the {{variable}} entries, you should note that Postman shows a pop-
up with the variable resolved.
If you return to the Authentication request and click the Send button (on the right), note
the completion of a request for an API token. The Tests tab has been filled with simple
JavaScript code to extract the API response and create a new token environment variable.
If you navigate back to the Environment section, you can see the new entry.
A nice feature of Postman is Translate to Code, represented by the </> icon along the right
navigation column. Click this icon to create a code snippet. A Python-Requests version is
suggested and results in the code shown in Example 16-34.
Example 16-34 Default Postman Translate to Code Output
import requests
url = "https://sandboxdnac.cisco.com/api/system/v1/auth/token"
payload = ""
headers = {
'Authorization': 'Basic ZGV2bmV0dXNlcjpDaXNjbzEyMyE='
}
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
Because we prefer more secure coding practices, we suggest that you do not put creden-
tials and authorization codes inside your script. The possibility of posting this into an open
GitHub repository is a risk too great! So, we suggest the following method: put your user-
name and password in shell environment variables and tweak the script to read them. Here is
an example providing similar functionality to the Postman generic code snippet but with a
touch better security:
$ export DNAC_USER=devnetuser
$ export DNAC_USERPASSWORD=Cisco123!
import requests
import os
import base64
url = "https://sandboxdnac.cisco.com/api/system/v1/auth/token"
payload = ""
# Read credentials from the shell environment and Base64-encode them
dnac_user = os.environ.get('DNAC_USER')
dnac_password = os.environ.get('DNAC_USERPASSWORD')
credentials = base64.b64encode(f'{dnac_user}:{dnac_password}'.encode()).decode()
headers = {
  'Authorization': 'Basic ' + credentials
}
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
{"Token":"eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9..."}
You now have a template for a script to do other DNA Center REST API work.
https://pypi.org/project/dnacentersdk/
https://github.com/cisco-en-programmability/dnacentersdk
A recent version of Python 3 is recommended, but if you have a current environment at
version 3.6 or higher, that is sufficient. Also, consider using a virtual environment (venv)
when setting up your project.
The quick install method uses the PyPi project to obtain the dnacentersdk package, as
shown in Example 16-36.
Example 16-36 Creating a Cisco DNA Center SDK Project
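A minimal sketch, assuming a fresh virtual environment and the dnacentersdk package from PyPI, looks like this:
$ python3 -m venv dnac-venv
$ source dnac-venv/bin/activate
(dnac-venv) $ pip install dnacentersdk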
Now your environment is ready for the follow-on SDK examples. The next section addresses
authorization. If you haven’t already covered the DNA Center REST API enablement steps,
go back to the “Enabling API/SDK Access to DNA Center” section. You must ensure that the
feature is enabled and that you have a user (with password) defined.
SDK Authorization
The DNA Center SDK enables you to obtain the username and password credentials via envi-
ronment variables, which is recommended over putting your credentials inside your Python
script. For example, if you’re using the DevNet Always On Sandbox Lab for DNA Center,
these would be the applicable environment variables:
export DNA_CENTER_USERNAME=devnetuser
export DNA_CENTER_PASSWORD=Cisco123!
export DNA_CENTER_BASE_URL=https://sandboxdnac.cisco.com:443
Now you can build a sample Python script that uses the SDK. You can reference the API
docs from the community portal to see the various functions and options. At the time of
this writing, this link refers to the latest API information:
https://dnacentersdk.readthedocs.io/en/latest/api/api.html#dnacenterapi-v2-2-3-3
Navigating down to the Devices section would be a safe approach and would result in
practical information you might want to obtain for putting into an asset management sys-
tem. Specifically, you can use the devices.get_device_list() method for this first approach.
Now, create the Python script in Example 16-37 as getDNACdevicelist.py in your virtual
environment.
Example 16-37 Python Script to Extract Cisco DNA Center Devices
from dnacentersdk import DNACenterAPI
import pprint

pp = pprint.PrettyPrinter(indent=4)
api = DNACenterAPI()
my_devices = api.devices.get_device_list()
for device in my_devices.response:
pp.pprint(device)
This simple script imports the necessary libraries: the DNA Center SDK and pretty print. It
configures the pretty print feature to indent four spaces. You create the DNA Center session,
passing authentication credentials via the environment variables previously set. You then call
the devices group get_device_list() and store the list into a my_devices variable. Finally, you
pretty print out the list device by device.
Now you can execute the script shown in Example 16-38.
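With the environment variables exported and the virtual environment active, the invocation would simply be the following, producing the (truncated) device output shown next:
$ python getDNACdevicelist.py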
'waasDeviceMode': None}
{ 'apEthernetMacAddress': None,
'apManagerInterfaceIp': '',
'associatedWlcIp': '',
'bootDateTime': '2021-10-28 18:10:20',
'collectionInterval': 'Global Default',
'collectionStatus': 'Managed',
'description': 'Cisco IOS Software [Amsterdam], Catalyst L3 Switch '
'Software (CAT9K_IOSXE), Version 17.3.3, RELEASE SOFTWARE '
'(fc7) Technical Support: http://www.cisco.com/techsupport '
'Copyright (c) 1986-2021 by Cisco Systems, Inc. Compiled '
'Thu 04-Mar-21 12:32 by mcpre',
'deviceSupportLevel': 'Supported',
'errorCode': None,
'errorDescription': None,
'family': 'Switches and Hubs',
'hostname': 'leaf1.abc.inc',
'id': 'aa0a5258-3e6f-422f-9c4e-9c196db*****',
'instanceTenantId': '5e8e896e4d4add00ca2*****',
'instanceUuid': 'aa0a5258-3e6f-422f-9c4e-9c196db*****',
'interfaceCount': '0',
'inventoryStatusDetail': '<status><general code="SUCCESS"/></status>',
'lastUpdateTime': 1639144220380,
'lastUpdated': '2021-12-10 13:50:20',
'lineCardCount': '0',
So, you can see that the SDK simplified a lot of actions behind the scenes: authentication,
data encoding, error handling, and rate limiting. You could, of course, go directly to the REST
API, but you would then need to perform these functions yourself. The SDK reduces the code
requirements for developers like you.
AppDynamics
AppDynamics (a.k.a. AppD) is a cloud-enabling, full-stack application performance monitor-
ing (APM) solution. It is used in a multicloud world to improve performance through visibil-
ity and detection of issues via real-time monitoring. AppD provides end-to-end visibility of
application traffic, enabling developers and architects to deploy and manage solutions with
assurance. It also uses machine learning in executing automated root-cause analysis. AppD
agents in your environment discover and track individual transactions with servers, data-
bases, and other application dependencies. The user experience is tracked and can be attrib-
uted to specific code blocks, database calls, name resolution, or other dependencies. The
insight can be mapped to business metrics, providing deep insight into impact and severity.
As a rich APM solution, AppD has an extensive API. We cover some of the basics for
authentication and basic information gathering, but a whole book could be written about
the depth of its API. The AppDynamics APIs are documented at https://docs.appdynamics.com/21.12/en/extend-appdynamics/appdynamics-apis.
The system provides for short-term, API-generated access tokens that expire within five
minutes by default or for longer-term, UI-generated temporary access tokens that can be
defined by the administrator for hour, day, or year expirations. You can define your API
clients by logging in to your AppDynamics portal and accessing the gear icon in the upper-
right corner. From the drop-down navigation panel, select Administration. Then on the left
tab option, select API Clients. Next, click the Create (or +) button and define your client
name with an optional description. Click the Generate Secret button and the copy gadget
to the left of it. Make sure to retain this client secret in a secure password manager. Further-
more, click the Add (+) button under Roles and select the appropriate roles for the API user.
Finally, click Save in the upper-right corner. Figure 16-44 depicts the steps just described.
■ URL: https://<CONTROLLER_HOST>:8090/controller/api/oauth/access_token
■ Body (x-www-form-urlencoded):
grant_type=client_credentials
client_id=<APIUserName>@<CustomerName>
client_secret=<ClientSecret>
appserver[centos]$ curl -X POST "https://<CONTROLLER_HOST>:8090/controller/
api/oauth/access_token" -d 'grant_type=client_
credentials&client_id=<APIUserName>@<CustomerName>&client_
secret=<ClientSecret>'
{"access_token": "eyJraWQiOiIyY2Q3YjY0YS0zYTAzLTRiNzMtYmN
lMS02MjgwZTVmMzAyMTMiLCJhbGciOiJIUzI1NiJ9.****M",
"expires_in": 300}
appserver[centos]$
A Postman utility equivalent is provided in Figure 16-45.
Additionally, this Python script would be a helpful baseline for progressing to other API
methods. You can set the following environment variables to perform a more secure authen-
tication method:
$ export APPD_CONTROLLER=10.10.20.2:8090
$ export APPD_API_USER=devcore
$ export APPD_CUSTOMER=customer1
$ export APPD_CLIENT_SECRET=fa42bc81-f3a8-4cbd-9ac7-cf9766bcd93d
The full script is shown in Example 16-39.
Example 16-39 Python Script to Generate an AppDynamics Access Token
import os
import requests
# Read the controller and client details from the environment variables set above
APPD_CONTROLLER = os.environ.get('APPD_CONTROLLER')
APPD_API_USER = os.environ.get('APPD_API_USER')
APPD_CUSTOMER = os.environ.get('APPD_CUSTOMER')
APPD_CLIENT_SECRET = os.environ.get('APPD_CLIENT_SECRET')
url = f'http://{APPD_CONTROLLER}/controller/api/oauth/access_token'
payload = f'grant_type=client_credentials&client_id={APPD_API_USER}%40{APPD_CUSTOMER}&client_secret={APPD_CLIENT_SECRET}'
headers = {
    'Content-Type': 'application/x-www-form-urlencoded'
}
response = requests.request("POST", url, headers=headers, data=payload)
data = response.json()
access_token = data['access_token']
print(access_token)
eyJraWQiOiIyY2Q3YjY0YS0zYTAzLTRiNzMtYmNlMS02MjgwZTVmMzAyMTMiLCJhbG-
ciOiJIUzI1NiJ9.****
appserver[centos]$
With a slight addition to the authentication template, as shown in Example 16-40, you can
move on to getting all the web applications that AppDynamics is currently monitoring.
import os
import requests
# Same environment-variable handling as the authentication template
APPD_CONTROLLER = os.environ.get('APPD_CONTROLLER')
APPD_API_USER = os.environ.get('APPD_API_USER')
APPD_CUSTOMER = os.environ.get('APPD_CUSTOMER')
APPD_CLIENT_SECRET = os.environ.get('APPD_CLIENT_SECRET')
url = f'http://{APPD_CONTROLLER}/controller/api/oauth/access_token'
payload = f'grant_type=client_credentials&client_id={APPD_API_USER}%40{APPD_CUSTOMER}&client_secret={APPD_CLIENT_SECRET}'
headers = {
    'Content-Type': 'application/x-www-form-urlencoded'
}
response = requests.request("POST", url, headers=headers, data=payload)
data = response.json()
access_token = data['access_token']
url2 = f'http://{APPD_CONTROLLER}/controller/rest/applications'
headers2 = { 'Authorization': 'Bearer ' + access_token }
response2 = requests.request("GET", url2, headers=headers2)
print(response2.text)
<applications><application>
<id>6</id>
<name>Supercar-Trader</name>
<accountGuid>2cd7b64a-3a03-4b73-bce1-6280e5f30213</accountGuid>
</application>
</applications>
Note the API’s default output is XML. If you prefer JSON, you can append the URL with
?output=JSON.
You can make additional enhancements to this script by using the API to extract the applica-
tions involved and the nodes of those applications with detailed information, as shown in
Example 16-41. This enhancement may be useful for documentation purposes.
Example 16-41 Python Script to Get AppDynamics Application Nodes
import os
import requests
import xml.etree.ElementTree as ET
# Same environment-variable handling and token request as the earlier examples
APPD_CONTROLLER = os.environ.get('APPD_CONTROLLER')
APPD_API_USER = os.environ.get('APPD_API_USER')
APPD_CUSTOMER = os.environ.get('APPD_CUSTOMER')
APPD_CLIENT_SECRET = os.environ.get('APPD_CLIENT_SECRET')
url = f'http://{APPD_CONTROLLER}/controller/api/oauth/access_token'
payload = f'grant_type=client_credentials&client_id={APPD_API_USER}%40{APPD_CUSTOMER}&client_secret={APPD_CLIENT_SECRET}'
headers = {
    'Content-Type': 'application/x-www-form-urlencoded'
}
response = requests.request("POST", url, headers=headers, data=payload)
data = response.json()
access_token = data['access_token']
url2 = f'http://{APPD_CONTROLLER}/controller/rest/applications'
headers2 = { 'Authorization': 'Bearer ' + access_token }
response2 = requests.request("GET", url2, headers=headers2)
print(response2.text)
root = ET.ElementTree(ET.fromstring(response2.text))
for app in root.findall('application'):
    appname = app.find('name').text
    print(appname)
    url3 = f'http://{APPD_CONTROLLER}/controller/rest/applications/{appname}/nodes'
    response3 = requests.request("GET", url3, headers=headers2)
    print(response3.text)
<tierName>Enquiry-Services</tierName>
<machineId>301</machineId>
<machineName>appserver.localdomain</machineName>
<machineOSType>Linux</machineOSType>
<machineAgentPresent>false</machineAgentPresent>
<appAgentPresent>true</appAgentPresent>
<appAgentVersion>Server Agent #21.11.0.33247 v21.11.0 GA compatible with 4.4.1.0
r7f7d2e9ac67aabc2b2b39b7b6fcf9b071104bf79 release/21.11.0</appAgentVersion>
<agentType>APP_AGENT</agentType>
</node>
In this output, you can see a lot of helpful information related to server name, operat-
ing system type, whether the AppD agent is installed, and what version, along with other
characteristics.
We encourage you to continue to investigate the AppD API for extracting information that
may be beneficial to your projects.
References
URL QR Code
https://www.postman.com/ciscodevnet/workspace/cisco-
dna-center/collection/382059-634a2adf-f673-42eb-8c36-
290f38e37971?ctx=documentation
https://www.postman.com/ciscodevnet/workspace/cisco-
devnet-s-public-workspace/collection/8697084-bf06287b-
a7f3-4572-a4d5-84f1c652109a?ctx=documentation
https://documenter.getpostman.com/view/30210/
SVfWN6Yc
https://docs.microsoft.com/en-us/powershell/scripting/
install/installing-powershell-on-linux?view=powershell-7.2
https://docs.microsoft.com/en-us/powershell/scripting/
install/installing-powershell-on-macos?view=powershell-7.2
https://docs.microsoft.com/en-us/
powershell/scripting/install/
installing-powershell-on-windows?view=powershell-7.2
Final Preparation
The first 16 chapters of this book cover the technologies, protocols, design concepts, and
considerations required to be prepared to pass the Developing Applications Using Cisco
Core Platforms and APIs v1.0 DevNet Professional 350-901 DEVCOR Exam. Although these
chapters supply the detailed information, most people need more preparation than simply
reading the first 16 chapters of this book. This chapter details a set of tools and a study plan
to help you complete your preparation for the exams.
This short chapter has two main sections. The first section lists the exam preparation tools
useful at this point in the study process. The second section lists a suggested study plan now
that you have completed all the earlier chapters in this book.
Getting Ready
Here are some important tips to keep in mind to ensure you are ready for this rewarding exam!
■ Build and use a study tracker: Consider taking the exam objectives shown in each
chapter and building yourself a study tracker! This will help ensure you have not
missed anything and that you are confident for your exam! As a matter of fact, this
book offers a sample Study Planner as a website supplement.
■ Think about your time budget for questions in the exam: When you do the math,
you realize that on average you have one minute per question. Although this does not
sound like enough time, realize that many of the questions are very straightforward,
and you will take 15 to 30 seconds on those. This builds time for other questions as
you take your exam.
■ Watch the clock: Check in on the time remaining periodically as you are taking the
exam. You might even find that you can slow down pretty dramatically as you have
built up a nice block of extra time.
■ Get some ear plugs: The testing center might provide them, but get some just in case
and bring them along. There might be other test takers in the center with you, and you
do not want to be distracted by their screams. Some people have no issue blocking out
the sounds around them, so they never worry about this, but it is an issue for some.
■ Plan your travel time: Give yourself extra time to find the center and get checked in.
Be sure to arrive early. As you test more at that center, you can certainly start cutting
it closer time-wise.
■ Get rest: Most students report success with getting plenty of rest the night before the
exam. All-night cram sessions are not typically successful.
■ Bring in valuables but get ready to lock them up: The testing center will take your
phone, your smart watch, your wallet, and other such items. They will provide a secure
place for them.
Step 1. Go to http://www.PearsonTestPrep.com.
Step 2. Select Pearson IT Certification as your product group.
Step 3. Enter your email/password for your account. If you don’t have an account on
PearsonITCertification.com or CiscoPress.com, you need to establish one by
going to PearsonITCertification.com/join.
Step 4. In the My Products tab, click the Activate New Product button.
Step 5. Enter the access code printed on the insert card in the back of your book to
activate your product.
Step 6. When the product is listed in your My Products page, click the Exams button
to launch the exam settings screen and start your exam.
http://www.pearsonitcertification.com/content/downloads/pcpt/engine.zip
To access the book’s companion website and the software, simply follow these steps:
Note that the offline and online versions synch together, so saved exams and grade results
recorded on one version are available to you on the other as well.
■ Study Mode
■ Practice Exam Mode
■ Flash Card Mode
Study Mode enables you to fully customize your exams and review answers as you are tak-
ing the exam. This is typically the mode you would use first to assess your knowledge and
identify information gaps. Practice Exam Mode locks certain customization options, as it is
presenting a realistic exam experience. Use this mode when you are preparing to test your
exam-readiness. Flash Card Mode strips out the answers and presents you with only the
question stem. This mode is great for late-stage preparation when you really want to chal-
lenge yourself to provide answers without the benefit of seeing multiple-choice options.
This mode does not provide the detailed score reports that the other two modes do, so you
should not use it if you are trying to identify knowledge gaps.
In addition to these three modes, you can select the source of your questions. You can
choose to take exams that cover all of the chapters, or you can narrow your selection to
just a single chapter or the chapters that make up specific parts in the book. All chapters are
selected by default. If you want to narrow your focus to individual chapters, simply deselect
all the chapters and then select only those on which you wish to focus in the Objectives
area.
You can also select the exam banks on which to focus. Each exam bank comes complete
with a full exam of questions that cover topics in every chapter. The two exams printed in
the book are available to you as well as two additional exams of unique questions. You can
have the test engine serve up exams from all four banks or just from one individual bank by
selecting the desired banks in the exam bank area.
There are several other customizations you can make to your exam from the exam settings
screen, such as the time of the exam, the number of questions served up, whether to ran-
domize questions and answers, whether to show the number of correct answers for multiple-
answer questions, or whether to serve up only specific types of questions. You can also
create custom test banks by selecting only questions that you have marked or questions on
which you have added notes.
Premium Edition
In addition to the free practice exam provided on the website, you can purchase additional
exams with expanded functionality directly from Pearson IT Certification. The Premium
Edition of this title contains an additional two full practice exams and an eBook (in both
PDF and ePub format). In addition, the Premium Edition title also has remediation for each
question to the specific part of the eBook that relates to that question.
Because you have purchased the print version of this title, you can purchase the Premium
Edition at a deep discount. There is a coupon code in the book sleeve that contains a one-
time-use code and instructions for where you can purchase the Premium Edition.
To view the premium edition product page, go to www.informit.com/title/9780137370344.
Step 1. Review key topics and DIKTA? questions: You can use the table that lists the
key topics in each chapter, or just flip the pages looking for key topics. Also,
reviewing the DIKTA? questions from the beginning of the chapter can be
helpful.
Step 2. Use the Pearson Cert Practice Test engine to practice: The Pearson Cert
Practice Test engine can be used to study using a bank of unique exam-realistic
questions available only with this book.
Summary
The tools and suggestions listed in this chapter have been designed with one goal in mind: to
help you develop the skills required to pass the Developing Applications Using Cisco Core
Platforms and APIs v1.0 DevNet Professional 350-901 DEVCOR Exam. This book has been
developed from the beginning to not just tell you the facts but to also help you learn how
to apply the facts. No matter what your experience level leading up to when you take the
exams, it is our hope that the broad range of preparation tools, and even the structure of the
book, helps you pass the exam with ease. We hope you do well on the exam!
Chapter 1
1. D. Software architecture includes functional and business requirements and non-
functional requirements, which include building blocks and structures. All of these
answers relate to this concept.
2. B. Although the synchronous data link control is a real communication protocol cre-
ated in the 1970s for IBM Mainframe computer connectivity, the correct response
here is software development lifecycle. At the time of this writing, social data lan-
guage controller does not exist.
3. A. Functional requirements define the functionality and business purpose of an
application. Answer B is incorrect because high availability is an attribute of the
application and is considered part of the nonfunctional requirements. How to
develop an application and what programming language to use are not part of the
functional requirements.
4. D. Scalability, modularity, and high availability are part of a long list of nonfunctional
requirements briefly discussed in this chapter and in more detail in Chapter 2, “Soft-
ware Quality Attributes.”
5. B. Flexibility, speed, and collaboration among the various independent teams are the
main characteristics of Agile development.
6. D. The Lean process relies on rapid development and a lot less planning at the early
stages of the development cycle. Answers A, B, and C are clear characteristics of the
Lean development model.
7. D. DevOps has four metrics for assessing performance. Three are listed here, and the
fourth one is mean time to response.
8. C. White-box testing is different from black-box testing in that full knowledge of the
system under testing is needed; that’s also a requirement of unit testing.
9. D. Code reviewers should be looking at a variety of aspects or concepts related to
high-quality code, including functionality, complexity, naming, testability of code,
style, and the presence of proper comments and documentation.
10. B. Patterns ensure consistency of coding practices and the approach or context of
the solution. They are meant to reduce complexity (not increase it). There are various
patterns to choose from, and you should choose one that works well with the con-
text of the architecture you’re working on.
Chapter 3
1. D. Maintainability of software depends on various factors, including proper com-
ments, documentation, and modular design. A modular design without adequate
documentation is useless.
2. D. Structured programming is the traditional way of using a set of functions and is
not a characteristic of object-oriented programming.
3. C. SOLID is software design principles for building maintainable code. The five
SOLID principles are discussed in detail in the “Maintainability Design and Imple-
mentation” section.
4. B. The text of the question is a simple definition of the open-closed principle (or
OCP).
5. E. The dependency inversion principle states that high-level modules should not
depend on low-level modules. Both should depend on abstractions, and abstractions
should not depend on details. Details should depend on abstractions.
6. D. Latency, round-trip time, and throughput are all indicators of performance. Moni-
toring these parameters may also be an indicator of system- or network-related issues
that need attention or further optimization.
7. B. Network oversubscription is the safest answer. You can also argue that oversub-
scribed wireless systems lead to the same effect.
8. A and B. Both caching and rate limiting are ways to manage system load and prevent
performance degradations and outages. Echo replies can measure latency but are not
related to managing load in this question. Open Shortest Path First (OSPF) is a rout-
ing protocol.
9. D. Observability is a combination of logging, metrics, and tracing. They are all part
of the three answers A, B, and C.
10. C. The three pillars of observability are logging, metrics, and tracing. The other
answers are not related.
11. B and C. Relational databases store data in rows and columns like spreadsheets,
whereas nonrelational databases have four different types: one of them is document-
oriented stores designed for storing, managing, and retrieving document-oriented
information. The other types of nonrelational databases are key-value stores, wide-
column stores, and graph stores.
Chapter 4
1. D. Although the primary use of version control is to track code changes, version
control, in practice, is often used to store configuration, release, and deployment
information.
2. B. GitHub adds development workflow functionality associated with Git. This func-
tionality includes pull requests, review/approvals, searchability, and tags. GitHub
is a repository hosting service toolset that includes access control and collabora-
tion features. GitLab is a repository hosting manager toolset developed by GitLab
Incorporated.
3. D. A version control system is used for managing the code base, streamlining the
development process, and tracking all changes. It also tracks who made the changes.
4. B. When a file is deleted, it can no longer be modified. There are many ways conflicts
can be generated. When two programmers commit changes to the same line, for
example, a merge conflict can be generated. Two developers opening the same file
does not cause a conflict; however, two developers saving changes to the same area
of a file may cause a conflict.
5. B and C. A branching strategy needs to define the types of branches and all the rules
governing usage by the development team. In addition, every developer has to agree
to the strategy. These are basic branching and branching strategy concepts, and no
successful branching can be achieved without them. All developers need to under-
stand branching strategies; otherwise, conflicts will continually arise.
Chapter 5
1. A. You can use the PATCH request to update partial resources or to modify an
existing resource. Answers B and D are incorrect because a GET request retrieves
resource details and DELETE removes resources. There is no CANCEL HTTP method,
which means answer C is incorrect.
2. B. The PUT method is idempotent; it is not safe because changes can be made,
resulting in misconfiguration and errors if the wrong details are sent to a resource. It
replaces whatever exists at the target URL with something else. Answers A and C are
incorrect because they are not safe and idempotent. Answer D is incorrect because
an HTTP PUT method will modify but not remove (delete) resources.
3. A. With application/x-www-form-urlencoded, the keys and values are encoded in
key-value tuples separated by an ampersand (&), with an equal sign (=) between the
key and the value. Answer B is incorrect because multipart/form-data is a special
type of body whereby each value is sent as a block of data. Answers C and D are
incorrect because the request body format is not indicated as XML.
4. C. Swagger is a set of open-source tools for writing REST-based APIs. It enables you
to describe the structure of your APIs so that machines can read them. Answers A,
B, and D are incorrect because multiple languages can be used to document APIs
with Swagger UI, and it can support more than browser/Postman testing.
5. A and C. REST represents Representational State Transfer; it is a relatively new aspect
of writing web APIs. Answers B and D are incorrect because REST/RESTful APIs are
not sets of rules that developers follow when they create their APIs.
RESTful refers to web services written by applying REST architectural concepts; they
are called RESTful services. They focus on system resources and how the state of a
resource should be transported over the HTTP protocol to different clients written in
different languages. In a RESTful web service, HTTP methods like GET, POST, PUT,
and DELETE can be used to perform CRUD operations.
6. A. In the context of transaction processing, the acronym ACID refers to the four key
properties of a transaction: atomicity, consistency, isolation, and durability.
Chapter 6
1. B. A software development kit (SDK) helps speed up and improve the adoption and
developer experience of an API. Answer A is incorrect because asynchronous API
performance refers to overall throughput. Answers C and D could be partially cor-
rect; B has the correct two main benefits of an API SDK.
2. C. The OpenAPI Specification (OAS) defines a standard, programming language–
agnostic interface to RESTful APIs, which allows both humans and computers to dis-
cover and understand the capabilities of the service without access to source code,
documentation, or network traffic inspection. Answers A, B, and D are incorrect
descriptions of the OpenAPI Specification (OAS) for RESTful APIs.
3. A and B. An API client abstracts and encapsulates all the necessary REST API
endpoint calls, including those used for authenticating. Other resources might include
code samples, testing and analytics, and documentation. Answers C and D are incorrect
because both headers and parameters are additional or optional options.
4. D. The bearer token is an opaque string, which is generated by the server response to
a developer login request, and not intended to have any meaning to clients using it.
Answers A, B, and C are incorrect because OAuth uses a single string, which has no
meaning and may be of varying lengths.
5. A. True. A WebSocket uses a bidirectional protocol, in which there are no predefined
message patterns such as request/response. The client and the server can send mes-
sages to each other.
6. A. The back-end infrastructure already exists within an organization and is used as
a foundation for defining the API. This means that answers B and C are incorrect.
Answer D is incorrect because a mock server is not a back-end infrastructure but is
rather a fake server.
7. A and C. Outside-in API design allows for a lot more elasticity than inside-out API
design, covering more use cases in single API calls, thus leading to fewer APIs being
made by the consumer and less traffic on the wire and back-end systems for the pro-
vider. Answers B and C are incorrect because outside-in does not increase API adop-
tion rate, and in an outside-in approach to API design the UI is often the first to be
built.
8. A. True. The HTTP endpoints indicate how you access the resource, whereas the
HTTP method indicates the allowed interactions (such as GET, POST, or DELETE)
with the resource. HTTP methods should not be included in endpoints.
9. A and B. With page-based pagination, the developer can choose to view the required
number of pages and items on the page. This could be a single page or a preset range
of pages. An API could allow the developer to select a starting page—for example,
page 10 through page 15. This is referred to as offset pagination. Answers C and D
are incorrect because page-based is often used when the list of items is of a fixed
and predetermined length.
10. E. Rate limiting protects an API from inadvertent or malicious overuse by limiting
how often each developer or account can call the API. Rate limiting is also used for
scalability, performance, monetization, authentication, and availability.
Chapter 7
1. C. The idea is that DevOps is a cultural shift to enable cross-functional teams
experienced in a variety of areas to develop and operate an application in produc-
tion. Answers A, B, and D are aspects of DevOps but need not occur if DevOps is
adopted (they are somewhat optional).
2. B. DevOps and SRE share a common outcome, to deliver higher uptimes and reli-
ability to an application or service—and the use of tooling/observability/training/
etc. to help enable that uptime and resilience. DevOps encourages cross-functional
teams, whereas SRE encourages a separate team to liaise between development and
operations. Answer A is incorrect because it is generally looked at as part of SRE as
a new team is instituted. Answer D is generally looked at as part of DevOps culture.
Answer C is incorrect because the two overlap.
3. D. CI is meant to build/compile the code, perform standard unit testing against the
code, and then ensure there are no security holes against the dependencies or devel-
oped code. Answer A refers to deployment of an application. Answers B and C refer
to the act of application development and merging of code branches (which need not
have a CI component).
4. B. CI/CD pipeline definition files are generally written in YAML. Answers A and C
can represent the same information as YAML, but it is not used. Answer D is a pro-
gramming language and is not used to define the pipelines.
5. C. Pilot covers the specific methods in which applications can be deployed via CD.
Answers A, B, and D are valid ways to release applications or incremental updates to
applications to an environment (production, test, etc.).
6. B. This question specifically discusses what the name serverless means as it pertains
to the focus on the underlying infrastructure needed to deploy an app. Answer A is
incorrect because we still need to use servers. Answer C is incorrect because there
does need to be packaging.
7. A and D. There is a discussion around the levels of abstraction and the benefits pro-
vided versus the trade-offs. As we abstract away management of the underlying infra
in the cloud, we don’t have to worry about the details of the platforms, but we must
conform to the cloud provider in what is allowed and exposed from the abstraction,
which is the opposite of answers B and C.
8. D. The idea of the 12-factor app is to create a set of standards and recommendations
to be followed when designing an application meant to be deployed across multiple
clouds. Answer A is incorrect because they are not specific “tests” that can be run.
Answer B is incorrect because the 12 aspects don’t directly deal with user experi-
ence. These rules don’t just apply to a CI/CD pipeline, so answer C is also incorrect.
Chapter 8
1. B. Information system security can be summed up in three fundamental components:
confidentiality, integrity, and availability.
2. A and D. The data is either at rest (for example, stored on a hard drive or a data-
base) or in motion (for example, flowing or in transit though the network between
two nodes). You may encounter some references that use a third state called in use,
indicating that data is being processed by a CPU or a system. As for encryption and
clear text, data can be in motion and encrypted, or clear text while it’s in motion or
at rest.
3. D. Data at rest is stored on a hard drive, tape, database, or in other ways.
4. A. PII refers to personally identifiable information. The other answers do not apply
or relate to the topic.
5. B. GDPR gives European Union (EU) citizens control over their own personal data.
It gives citizens the right to withdraw consent, to have easier access to their data, to
understand where their data is stored, and to know if their data has been compro-
mised by a cyberattack within 72 hours, depending on the relevance of the attack.
6. D. IT secrets are used as “keys” to unlock protected applications or application data.
Passwords, account information, and API keys are examples. A VIN is not considered
a secret because it is always displayed on every car’s windshield.
7. B. The main responsibility of a certificate authority (CA) is issuing and signing cer-
tificates. Certificates are used to protect account and personal information.
8. D. Injection attacks are common and are one of the main issues that OWASP warns
about. Injection attacks can come in different formats, such as database, LDAP, or OS
command injections.
9. D. A PoH (or Proof of History) attack is not a cryptographic attack. It is closely
related to Blockchain and crypto currency. Brute-force, statistical, and implementa-
tion attacks are types of cryptographic attacks.
10. C. There is no such thing as a four-legged authorization. The most common authori-
zation flows are two-legged and three-legged. The three-legged OAuth flow requires
four parties: the authorization server, client application, resource server, and resource
owner (end user).
Chapter 9
1. C. Option C has the common categories of the PDIOO model in the correct order.
The other options are contrived or out of order.
2. A. SNMPv3 uses the MD5 or SHA authentication method. Option B is incorrect for
SNMPv3 but is correct for SNMPv1 and 2c. Option C is incorrect; there is no key-
based authentication. Option D is incorrect; AES-128 is a function of data encryp-
tion, not authentication.
3. B. Option B is the proper expansion of the acronym, as defined by Google, which
coined the term. The other options are contrived.
4. C. OOB stands for out-of-band networking, which is a way to partition management/
administrative traffic from other traffic. Segment routing and VLANs are technolo-
gies used to segment or partition traffic, but they are generally or more commonly
used with user traffic. Answer A is incorrect because FWM (firewall management)
firewalls are useful for restricting traffic, but firewall management is not a type of
network for separating administrative traffic from other production traffic.
5. B. PnP is a common and recommended process for zero-touch deployments. Options
A and D are contrived answers. Option C is a user-initiated feature common to SMB
products.
6. A. Intent-based networking is aligned to business requirements that are easily trans-
lated into policies reflecting native device configurations. All the other options are
incorrect; the software image of a device does not make it intent-based; the rout-
ing protocols do not make a network intent-based, even if the routing protocols are
defined similarly in an IBN; and AI does not intuit business requirements. It may
inform you of what is more predominant or what is out of the norm.
7. D. NETCONF is an IETF working group specification and protocol standardized
to normalize configuration management across multiple vendors using XML sche-
mas and YANG models. Option A is incorrect; it is a contrived answer. Option B is
incorrect; XML was the first (and consistent) data encoding method for NETCONF.
Option C is incorrect; it is a contrived answer. Python may be used to implement
NETCONF-based management, but it is only an option.
Chapter 10
1. B. In environments where devices perform different functions using a model to define
a device’s configuration and desired state, model-driven management is highly effi-
cient and effective. Answers A, C, and D are incorrect. Templates can be effective
but are less flexible. They also require more routine maintenance, especially as soft-
ware images change. Atomic-driven management is a contrived answer, and generally,
managing devices atomically (one by one) leads to a higher degree of maintenance.
Distributed EMSs also is a contrived answer. Having multiple element management
systems can be less efficient, especially when they are not integrated or used.
2. B. Software, infrastructure, and operations are the focus areas defined by Google,
which catalyzed this role. Answers A, C, and D are incorrect. Firmware is not a focus
(although software image management may be addressed). While close, answer C puts
more focus on network engineering than SRE promotes. Likewise, traffic and SecOps
are not general focus areas for SRE (although security can be a component of it).
3. D. Agile promotes requirements gathering, adaptive (and collaborative) planning,
quick delivery, and a CI/CD approach. Answers A, B, and C are incorrect. The
“defined process” part implies a rigorous structure that is inflexible; this is not Agile.
Although Agile does provide flexibility, it does not give developers a “free pass”
without accountability. Answer C reflects more of a test-driven methodology.
4. A. Kanban provides a more visual, graphical approach to software development;
many people attribute the “board” and “cards” approach to identifying work and prog-
ress. Answers B, C, and D are incorrect. Neither Agile nor Waterfall is specifically a
visual, graphical-based approach to software development. Answer D is contrived.
Although the term illustrative does imply a visual, graphical approach, it is the
Kanban methodology that is the accepted industry term.
5. A. Concurrency provides the ability to do lots of tasks at once, and parallelism is
defined as working with lots of tasks at once. Answers B, C, and D are incorrect.
Exchanging implies swapping away from tasks; switching also implies changing focus
among tasks. The threading part of answer C is accurate, but the sequencing part is
incorrect. Answer D is the opposite of answer A.
6. A. OpenAPI was previously known as Swagger. Answers B, C, and D are incorrect.
REST was never known as the CLI, but it’s funny. SDN was catalyzed in the Clean
Slate project, but it is broader than RESTful web services. OpenWeb and CORBA are
not specifications for RESTful web services.
7. C. Basic authentication takes a concatenation of username and password with a
colon (:) separating them, and Base64 encodes the string. The openssl utility is help-
ful for performing that function. Answers A, B, and D are incorrect. Basic authentica-
tion does not use the md5 hash function. There is no specification of what encoding
method to use with the openssl utility. Also, basic authentication does not use the
ampersand (&) to join the username and password.
8. B. XML is the Extensible Markup Language. Answers A, C, and D are incorrect. It is
not extendable or machine. Nor is it extreme or machine or learning—but it sounds
fun. Likewise, it is not extraneous or modeling.
9. D. JSON records and objects are denoted with curly braces. Answers A, B, and C
are incorrect. Angle braces are seen in XML, not JSON. Square brackets note lists in
JSON, not records. Simple quotes note strings in JSON, not records.
10. B. HEAD and GET methods in REST operations are both idempotent (able to be
run multiple times receiving the same result). Answers A, C, and D are incorrect. In
A and C, POST is not idempotent (both need to be for the correct answer). In D,
PATCH is not idempotent, but HEAD is idempotent (both need to be for the correct
answer).
11. A and C. REST and JDBC are APIs. Answers B and D are incorrect. RMON is a
legacy management protocol. SSH is a useful protocol, but it is predominately used
for human interaction with a device; it is not optimal for programmatic use as an API
would be.
Chapter 11
1. B. The Internet Engineering Task Force (IETF) created the NETCONF working
group in May 2003 to tackle the configuration management problem across vendors.
Option A is incorrect because the ANSI organization is a private nonprofit organiza-
tion that oversees the development of voluntary consensus standards for products,
services, processes, systems, and personnel in the United States. It did not initiate the
work for NETCONF. Option C is incorrect because the ITU is a specialized agency
of the United Nations responsible for all matters related to information and com-
munication technologies, but it did not initiate the work for NETCONF. Option D
is incorrect because the TM Forum is a global industry association for service pro-
viders and their suppliers in the telecommunications industry; it did not initiate the
work for NETCONF.
2. D. The IETF RFC defining NETCONF initially was RFC 4741 in December 2006.
RFC 6241 provided updates to the base protocol in June 2011. Option A is incorrect
because they are related to SNMP. Option B is incorrect because they are related to
SNMP. Option C is incorrect because they are related to RESTCONF.
3. A. Model-driven configuration management allows for the definition of relationship
and desired configuration distinctives across networks/devices/services. Answers B,
C, and D are incorrect. A model describes network/device/service relationships and
Chapter 12
1. D. SNMPv3 uses MD5/SHA for authentication and DES/3DES/AES for encryp-
tion and data security. MDT gRPC has SSL/TLS integration and promotes its use
to authenticate the server and encrypt the data exchanged between client and
server. These capabilities afford similar data security. Answer A is incorrect because
SNMPv1 uses community string authorization and no encryption, which affords
little security. MDT can provide TLS. Answer B is incorrect because SNMPv2c uses
community string authorization and no encryption, which affords little security.
Answer C is incorrect because answers A and B are.
2. B. MDT uses the HTTP/2 application protocol. Answer A is incorrect because MDT
uses the more advanced HTTP/2 application protocol rather than HTTP. Answer C is
incorrect because SPDY was an original experimental protocol, but HTTP/2 became
the final model. Answer D is incorrect because MDT does not use secure copy
(SCP). Answer E is incorrect because MDT does not use secure FTP (sFTP).
3. A. In MDT, the device is the authoritative source of the telemetry. In dial-out mode,
the device pushes telemetry to the receiver/collector. Answer B is incorrect because
it describes dial-in mode with an inaccurate representation of the device being pas-
sive. Answer C is incorrect because it describes a dial-in mode model. Answer D is
incorrect because in dial-out mode the telemetry receiver is still considered a server.
Answer E is incorrect because it describes dial-in mode where the receiver dials in to
the network device and subscribes dynamically to one or more sensor paths.
4. C. When building a telemetry dial-out configuration, you must create a destination
group, sensor group, and subscription. Answer C is not a major task because flow
spec is not necessary and is a contrived detractor reminiscent of OpenFlow. Answers
A, B, and D are incorrect because they are major tasks needed to configure telemetry
dial-out.
5. C. Google Protocol Buffers (or protobufs) is the correct answer. Answers A, B, and D
are incorrect because they are contrived answers. Answer E is incorrect because it is
a contrived answer of a legacy protocol, Gopher, with other unaffiliated terms.
6. D. A sensor path is a combination of YANG data model for the feature of interest
with the specific metric node (leaf, leaf-list, container, and so on). Answer A is incor-
rect because a sensor path is not related to SNMP. Answer B is incorrect because
a sensor path is more than a hierarchy to an interface/module/interface. Answer C
is incorrect because a sensor path does not define the receiving telemetry collector
settings.
7. B and D. YANG data models are published on GitHub by de facto convention across
several vendors and standards bodies. They are also published and searchable on
yangcatalog.org, an IETF project since 2015’s IETF 92 YANG Model Coordination
Group and Hackathon. Answer A is incorrect because YANG models are not pub-
lished to stackoverflow. Answer C is incorrect because YANG models are not
published to yang.org.
8. D. The TIG stack is Telegraf, InfluxDB, and Grafana. Answers A, B, and C are incor-
rect because none of the references map to TIG stack actual definitions.
9. C. The on-change MDT policy provides for event-driven telemetry, which pushes
metrics only on change. A periodic policy pushes telemetry at a predefined interval
and can repeat nonchanging values, which leads to poor disk usage and processing
overhead. Answer A is incorrect because periodic frequency is not a frugal deploy-
ment policy. Answer B is incorrect because it is a reference to a database export
process. Answer D is incorrect because MDT does not use gzip for compression or
policy.
10. B. The most accurate estimation of disk and processing utilization is gleaned from
baselining the traffic volume and receiver CPU after the subscriptions are configured
and then extrapolating over time. Answers A and C are incorrect because the data
size and payload can vary; there is no static equation. Answer D is incorrect because
there is no such calculator.
Chapter 13
1. C. IaC solutions come as both declarative and imperative modeled solutions. Answer
A is incorrect because IaC solutions can be declaratively modeled. Answer B is
incorrect because IaC solutions can be imperatively modeled. Answer D is incorrect
because IaC solutions can be declaratively or imperatively modeled.
2. E. An imperative model is a programming paradigm that uses explicit commands
and flow directives that change a device or environment's state. Answer A is correct
but not sufficient by itself, because answer C is also correct. Answer B is incorrect;
imperative model solutions do describe control flow. Answer C is correct but not
sufficient by itself, because answer A is also correct. Answer D is incorrect because it
contains answer B, which is incorrect.
3. D. A declarative model is a programming paradigm that expresses the desired state
of a device or environment but does not describe the control flow. Answer A is correct
but not sufficient by itself, because answer B is also correct. Answer B is correct but not
sufficient by itself, because answer A is also correct. Answer C is incorrect because it
describes an imperative model solution. Answer E is incorrect because, although it
includes answer A, it also includes answer C, which is incorrect.
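To make the contrast in answers 2 and 3 concrete, here is a minimal Python sketch; the device_cli and reconcile objects are hypothetical stand-ins, not real library calls.

```python
# Hypothetical sketch contrasting the two paradigms; device_cli and reconcile
# are illustrative stand-ins, not real library calls.

# Imperative: explicit, ordered steps that change state.
def configure_vlan_imperative(device_cli, vlan_id, name):
    device_cli.send("configure terminal")   # enter configuration mode
    device_cli.send(f"vlan {vlan_id}")      # create the VLAN
    device_cli.send(f"name {name}")         # set its name
    device_cli.send("end")                  # exit configuration mode

# Declarative: describe the desired end state; the tooling works out the steps.
desired_state = {
    "vlans": [
        {"id": 100, "name": "USERS"},
        {"id": 200, "name": "VOICE"},
    ]
}

def configure_vlan_declarative(reconcile, desired):
    # reconcile() compares desired vs. actual state and applies only the delta.
    reconcile(desired)
```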
4. B. This response provides the correct definition of both items. Provisioning gener-
ally refers to the action of performing device changes, whereas configuration man-
agement refers to the function or concept. Answer A is incorrect because the ZTD
function is zero touch deployment, not zero task deployment. Answer C is incorrect
because ZTD does not implement XModem for file transfer. Answer D is incor-
rect because configuration management does not require Gateway Load Balancing
Protocol.
5. A, C, and E. Chef is agent-based, Puppet supports both modes, and Ansible is agent-
less, depending on the SSH feature of a device. Answer B is incorrect because Chef
requires an agent. Answer D is incorrect because Ansible does not require an agent;
it is agentless.
6. B. The Puppet Facter tool’s default output is to JSON. Answer A is incorrect because
Puppet Facter output defaults to JSON, not ANSI. Answer C is incorrect because
Puppet Facter’s output defaults to JSON, not XML. Answer D is incorrect because
Puppet output defaults to JSON, not YAML.
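As a quick illustration of that default, the following sketch runs Facter from Python and parses its JSON output; it assumes the facter binary is installed and that your version supports the --json flag.

```python
import json
import subprocess

# Run Facter and request JSON output; assumes facter is on the PATH.
raw = subprocess.check_output(["facter", "--json"], text=True)

facts = json.loads(raw)                          # parse the JSON document
print(facts.get("os", {}))                       # e.g., operating system facts
print(facts.get("networking", {}).get("fqdn"))   # e.g., the node's FQDN
```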
7. C. A Puppet configuration or definition file is called a manifest. Answer A is incor-
rect because the file used to define Puppet operations/definitions/parameters is
called a manifest. Answer B is incorrect because the Puppet file for operations/defini-
tions/parameters is not a DDL; it is a manifest. Answer D is incorrect because it is the
operations/definition/parameters file for Ansible, not Puppet.
8. A and C. Ansible playbooks support YAML and INI data exchange format. The
YAML style allows for greater functionality with key-value definitions. Answers B
and D are incorrect because Ansible playbooks support YAML and INI, not JSON,
and zsh is a type of shell.
9. C. The YAML style of Ansible playbooks allows for inline references to Ansible
Vault parameters. Answer A is incorrect because INI format does not allow for inline
references of Ansible Vault definitions. Answers B and D are incorrect because they
are not valid data exchange formats for Ansible inventory files.
10. C. You use terraform init to initialize a project working directory. The terraform
plan command reads the configuration file and compares it to the current state, and
the terraform apply command deploys the desired state. Answers A, B, and D are
incorrect because they are not valid Terraform commands.
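The lifecycle described in answer 10 can also be driven from a script. This is a minimal sketch that shells out to the Terraform CLI from Python, assuming terraform is installed and the working directory already contains your configuration files.

```python
import subprocess

def run(cmd):
    # Run a Terraform CLI command in the current working directory.
    print(f"$ {' '.join(cmd)}")
    subprocess.run(cmd, check=True)

run(["terraform", "init"])                   # initialize providers/backend
run(["terraform", "plan", "-out=tfplan"])    # preview changes against state
run(["terraform", "apply", "tfplan"])        # apply the saved plan
# run(["terraform", "destroy", "-auto-approve"])  # tear down when finished
```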
Chapter 14
1. D. Tracking changes, along with who made them, and housing the data repositories
are all functions of the SCM.
2. C. Almost everything related to the software product is tracked by the SCM. The
market research data related to the marketability of your product may not be tracked
by the SCM.
3. C. Chef, Ansible, and SaltStack are examples of an SCM. Muppet is not an SCM and
should not be confused with Puppet, which is an SCM.
4. B. Ansible has five main components: the control node, inventory files, playbooks,
modules, and managed nodes.
5. C. The CLI command ansible --version displays both software versions. The other
commands do not work. We recommend that you familiarize yourself with the help
command: ansible -h.
6. A. Terraform is an example of a declarative Infrastructure as Code (IaC) automation
and orchestration tool. Ansible is an example of an imperative one.
7. B. One of the strengths of Terraform is its simplicity and straightforward lifecycle.
The lifecycle is init, plan, apply, and destroy.
8. D. Technical debt consists of decisions made to satisfy short-term goals but that have
long-term consequences. The first three answers are examples of that, such as a bad
database decision or a test environment that is not representative of the production
environments.
Chapter 15
1. A and D. Microsoft Hyper-V and VMware ESXi are Type-1 hypervisors. They are
characterized as bare-metal hypervisors acting as a software layer between the under-
lying hardware and the overlying virtual machines (with self-contained operating
systems and applications). Answer B is incorrect because Oracle VM VirtualBox is a
Type-2 hypervisor running over the top of another operating system, such as Micro-
soft Windows, macOS, or Linux. Answer C is incorrect because Parallels Desktop is
a Type-2 hypervisor running over the top of macOS.
2. B and C. Oracle VirtualBox and VMware Workstation are Type-2 hypervisors. They
are characterized as being installed over a foundational operating system and provid-
ing virtualized operating system service to overlying guest virtual machines. Answers
A and D are incorrect because QEMU with KVM and Citrix XenServer are
considered Type-1 hypervisors.
3. B. False. Docker aligns more to application containerization. LXC aligns more to
operating system containerization.
4. B and D. Moving workloads to the network edge with application hosting/contain-
erization can help with data sovereignty requirements and reduce costs where WAN
traffic is metered and cost-prohibitive. Answer A is incorrect because application
hosting/containerization can be used where low latency is desirable. Answer C is
incorrect because application hosting/containerization is more suited to distributed
solutions.
5. B. Cisco IOS-XE release 16.2.1 is the minimum version supporting application host-
ing and Docker containers. Answers A, C, and D are incorrect because application
hosting is supported on Catalyst 9300 at a minimum of Cisco IOS-XE 16.2.1.
6. C. Cisco NX-OS release 9.2.1 is the minimum version supporting Docker contain-
ers. Answers A, B, and D are incorrect because Docker containers are supported on
NX-OS–based switches starting on release 9.2.1.
7. A, B, and D. Docker containers can be deployed on IOx-supported Catalyst 9000
series switches with Cisco DNA Center, the command-line interface, or the IOx
Local Manager. Answer C is incorrect because there is no such Docker Deployer
deployment option. Answer E is incorrect because there is no such Prime KVM
deployment option.
8. D. The docker save command is used to create an image that can be copied to
another system for import (for example, docker load in traditional environments).
This process is necessary to prepare an image from a local Docker system for import
into Cisco DNA Center, IOx Local Manager, or a network device CLI. Answer A
is incorrect because there is no docker archive command. Answer B is incorrect
because the docker create command creates a writeable container layer over the
specified image and prepares it for running the specified command. Answer C is
incorrect because the docker export command exports a container’s filesystem, not
the image, as a tar archive.
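A minimal sketch of that save/load workflow, driving the Docker CLI from Python; the image name and archive path are illustrative.

```python
import subprocess

IMAGE = "iperf3-agent:1.0"        # hypothetical local image to package
ARCHIVE = "iperf3-agent.tar"      # tar archive to transfer and import elsewhere

# docker save writes the image (all layers and metadata) to a tar archive...
subprocess.run(["docker", "save", "-o", ARCHIVE, IMAGE], check=True)

# ...which can later be imported on another Docker host with docker load.
# On Cisco platforms, the .tar is what you hand to DNA Center, IOx Local
# Manager, or the device CLI instead.
# subprocess.run(["docker", "load", "-i", ARCHIVE], check=True)
```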
9. E. The Cisco IOS XE command app-hosting appid <name> is used to configure an
application and enter application hosting configuration mode. It was introduced in
IOS XE 16.12.1 on the Cisco Catalyst 9300 series switches. Answers A and D are
incorrect because these commands are not supported in Cisco IOS XE. Answers B
and C are incorrect because no such command syntax exists.
10. B. The interface AppGigabitEthernet1/0/1 is created for application hosting. It is used
to trunk or pass specific VLAN traffic into the Docker environment. Answer A is
incorrect because it is an incomplete interface reference. Answers C and D are incor-
rect because they are not legitimate interface references.
11. B. False. The IOx Local Manager manages the Docker container lifecycle of a single
hosting device. It is a best practice to create a central software repository for con-
tainer image tracking, archiving, and distribution.
Chapter 16
1. E. Webex SDKs are available for Apple iOS, Google Android, Java, and Python envi-
ronments, along with several other platforms and languages not mentioned here.
2. C. You can publish a Webex bot in the Webex App Hub. The other sites will have
the complete application. Subcomponents like bots and APIs will be posted at the
Webex App Hub.
3. C. X-auth-access-token and X-auth-refresh-token are returned in the response to a
POST request with a valid FMC username and password. None of the other answers
are returned in the API response from an FMC login.
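For reference, a minimal sketch of that token request using the Python requests library. The /api/fmc_platform/v1/auth/generatetoken path and the certificate handling shown here are assumptions based on common FMC deployments, so verify them against your FMC's API Explorer.

```python
import requests
from requests.auth import HTTPBasicAuth

FMC = "https://fmc.example.com"        # placeholder FMC address

# POST with a valid username/password; the tokens come back as response headers.
resp = requests.post(
    f"{FMC}/api/fmc_platform/v1/auth/generatetoken",
    auth=HTTPBasicAuth("apiuser", "password"),
    verify=False,                        # lab/self-signed certificates only
)
resp.raise_for_status()

access_token = resp.headers["X-auth-access-token"]
refresh_token = resp.headers["X-auth-refresh-token"]
print(access_token, refresh_token)

# Subsequent API calls send the access token back as a header:
# headers={"X-auth-access-token": access_token}
```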
4. B. Access to the FMC is supported with only one concurrent login using a username,
regardless of the method used to access the FMC.
5. C. Meraki Dashboard v1 API uses the bearer token authentication with the API key.
Answer A is incorrect because the Meraki Dashboard v1 API does not use HTTP
authentication; it uses bearer token authentication with the API key. Also, the header
key is not Authentication; it is Authorization. Answer B is incorrect because the
header key is not Authentication; it is Authorization. Answer D is incorrect because
the Meraki Dashboard v1 API does not use HTTP authentication; it uses bearer
token authentication with the API key.
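A minimal sketch of that bearer token header with requests; the API key shown is a placeholder.

```python
import requests

API_KEY = "0123456789abcdef"             # placeholder Meraki Dashboard API key
BASE = "https://api.meraki.com/api/v1"

# v1 authentication: bearer token in the Authorization header.
headers = {"Authorization": f"Bearer {API_KEY}"}

resp = requests.get(f"{BASE}/organizations", headers=headers)
resp.raise_for_status()
for org in resp.json():
    print(org["id"], org["name"])
```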
6. A. You can find organization-level device information via the SDK under
<api_session>.organizations.getOrganizationDevices(orgId). Answer B is incorrect
because there is no api.device.getDevices method. Answer C is incorrect because
api.getOrganizationDevices is not a correct method; it is subordinate to organiza-
tions. Answer D is incorrect because it is a contrived method—Meraki does not use
the orgid inside the method path.
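The equivalent call through the Meraki Python SDK looks roughly like this (pip install meraki); the API key and the choice of the first organization are placeholders.

```python
import meraki

# The SDK can also read MERAKI_DASHBOARD_API_KEY from the environment.
dashboard = meraki.DashboardAPI(api_key="0123456789abcdef", suppress_logging=True)

orgs = dashboard.organizations.getOrganizations()
org_id = orgs[0]["id"]                   # pick the first organization for the example

# Organization-level device inventory, as referenced in the answer.
devices = dashboard.organizations.getOrganizationDevices(org_id)
for device in devices:
    print(device.get("name"), device.get("model"), device.get("serial"))
```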
7. D. The API key and private key are required to compute the hash values that are sent
to the REST API with every request.
8. D. You can find the list of supported tools and SDKs on the Downloads page of the
online Intersight API documentation.
9. B. The UCS API only supports XML output natively. SDKs and other tools display
the output in different formats through text manipulation on the output.
10. C. You can filter the output from running a command by invoking | Select-Object
<value> after the command. Answer A is incorrect because the PowerTool is sup-
ported across different platforms. Answer B is incorrect because the PowerTool does
not use an API key to connect to a UCSM instance (it uses a username/password
combination). Answer D is incorrect because the PowerTool can also display output
information in more ways than just the native XML output through formatting.
11. A. The DNA Center REST API requires an authorization header with basic access
authentication. Answer B is incorrect because the authorization header is used
with basic authentication for DNA Center REST API access. Answer C is incorrect
because DNA Center does not use bearer token authentication; it uses HTTP authen-
tication. Answer D is incorrect because DNA Center does not use OAuth token
authentication; it uses HTTP authentication.
12. D. The Cisco DNA Center SDK package is called dnacentersdk on PyPI. Answers A,
B, and C are incorrect because they are contrived examples.
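Tying answers 11 and 12 together, here is a minimal sketch that obtains a token with basic authentication and then reuses it, plus the SDK alternative. The hostname and credentials are placeholders, and the endpoints shown are the commonly documented ones, so confirm them against your DNA Center version.

```python
import requests
from requests.auth import HTTPBasicAuth

DNAC = "https://dnac.example.com"          # placeholder DNA Center address

# Basic authentication on the token endpoint returns a JSON body with "Token".
resp = requests.post(
    f"{DNAC}/dna/system/api/v1/auth/token",
    auth=HTTPBasicAuth("devnetuser", "password"),
    verify=False,                            # lab/self-signed certificates only
)
token = resp.json()["Token"]

# Later calls send the token back in the X-Auth-Token header.
devices = requests.get(
    f"{DNAC}/dna/intent/api/v1/network-device",
    headers={"X-Auth-Token": token},
    verify=False,
)
print(devices.json())

# Or use the SDK from PyPI (pip install dnacentersdk), which wraps all of this:
# from dnacentersdk import DNACenterAPI
# api = DNACenterAPI(base_url=DNAC, username="devnetuser",
#                    password="password", verify=False)
# print(api.devices.get_device_list())
```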
13. B and C. The client secret is used to generate a short-lived (generally five-minute)
access token created via the API. The temporary access token is created in the Web-
UI and generally is used for longer-term access. Answer A is incorrect because App-
Dynamics does not generate the client secret through the API. Answer D is incorrect
because the AppDynamics client secret is generated for the user. Answer E is incor-
rect because the AppDynamics temporary access token is generated for the user.
14. C. The default output of the AppDynamics API is XML. You can specify that it
use JSON by appending '?output=JSON' to the calling URL. Answer A is incorrect
because the AppDynamics API data can be retrieved by appending '?output=JSON'
to the calling URL, but it is not the default encoding. Answer B is incorrect because
SAML is an open standard for exchanging authentication and authorization data
between an identity provider and a service provider. It is not an encoding method for
API data from AppDynamics. Answer D is incorrect because YAML is a data encod-
ing method commonly used for configuration file declarations, but it is not the data
encoding method used for the AppDynamics API.
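A minimal sketch of forcing JSON output on a controller REST call; the controller URL, account, and credentials are placeholders, and the resource path should be verified against your AppDynamics documentation.

```python
import requests

CONTROLLER = "https://example.saas.appdynamics.com"   # placeholder controller
USER = "apiuser@customer1"                            # user@account format
PASSWORD = "password"

# Without the output parameter the controller answers in XML (the default);
# appending output=JSON switches the encoding to JSON.
resp = requests.get(
    f"{CONTROLLER}/controller/rest/applications",
    params={"output": "JSON"},
    auth=(USER, PASSWORD),
)
resp.raise_for_status()
for app in resp.json():
    print(app.get("name"))
```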
Updates
Over time, reader feedback allows Pearson to gauge which topics give our readers the most
problems when taking the exams. To assist readers with those topics, the authors create new
materials clarifying and expanding on those troublesome exam topics. As mentioned in the
Introduction, the additional content about the exam is contained in a PDF on this book’s
companion website, at http://www.ciscopress.com/title/9780137370443.
This appendix is intended to provide you with updated information if Cisco makes minor
modifications to the exam upon which this book is based. When Cisco releases an entirely
new exam, the changes are usually too extensive to provide in a simple update appendix. In
those cases, you might need to consult the new edition of the book for the updated content.
This appendix attempts to fill the void that occurs with any print book. In particular, this
appendix does the following:
■ Mentions technical items that might not have been mentioned elsewhere in the book
■ Covers new topics if Cisco adds new content to the exam over time
■ Provides a way to get up-to-the-minute current information about content for the
exam
■ Website has a later version: Ignore this Appendix B in your book and read only the latest
version that you downloaded from the companion website.
Technical Content
The current Version 1.0 of this appendix does not contain additional technical coverage.
Glossary
A
agent-based Technologies that do require additional software modules or functions to
perform work. Agent-based technologies might not be able to achieve the breadth of func-
tionality needed using existing, embedded functions like SSH. Therefore, the agent extends the
functionality desired through its installed software/module. Early Puppet and Chef implementa-
tions required agents to be installed on the managed nodes.
agentless Technologies that do not require additional software modules or functions to per-
form work. Oftentimes agentless technologies depend on existing functionality, such as SSH or
NETCONF, to act as the endpoint’s processing receiver. Common agentless solutions in network
IT are Ansible, Terraform, and recent Puppet and Chef implementations.
Ansible An agentless configuration management tool that enables IaC, software provisioning,
and application deployment. Ansible was acquired by RedHat in 2015. It was initially released
in 2012 and is written mainly in Python. Ansible uses playbooks written in YAML to define
tasks and actions to perform on managed endpoints.
API inside-out design A type of design that commences with the infrastructure or database
followed by the back-end classes and services. The user interface (UI) is typically the last bit to
get built.
API outside-in/user interface (API first approach) A type of design that begins with UI
creation, and then the APIs are built with the database schema.
application performance monitoring (APM) A discipline or tool set for measuring vari-
ous granular parameters related to performance of application code, runtime environments, and
interactions.
C
caching The capability to store data as close as possible to the users so that subsequent or
future requests are answered faster.
certificate authority (CA) A third-party or neutral organization that certifies that other enti-
ties communicating with each other are in fact who they say they are.
Chef A company and tool name for a configuration management solution written in Ruby. Ini-
tially released in 2009, it supports configuration management for systems, network, and cloud
environments. It also supports CI/CD, DevOps, and IaC initiatives. A recipe defines how a Chef
server manages environment configuration. Progress acquired Chef mid-2020.
clustering A technology for combining multiple servers (or resources), making them appear
as a single server.
cohesion In software engineering, the interaction and relationships within a module and the
ability for a module to complete its tasks within the module.
cold standby A redundancy concept in which a redundant resource is available as a spare and
is ready to take over in case of failure of the active resource.
continuous delivery (CD) The automated process involved in moving the software that has
passed through the continuous integration pipeline to a state in which it is moved to a staging
area for live testing. This often involves packaging the software into a format in which it can be
deployed and moving the resulting package to a remote repository or fileshare.
control plane The conceptual layer of network protocols and traffic that involve path deter-
mination and decision-making.
cross-site scripting (XSS) A type of injection attack in which attackers inject malicious
scripts into a web application to obtain information about the application or its users.
D
data at rest A data state in which data is being stored in a database, hard drive, or tape.
data in motion A data state in which data is in transit between two nodes.
data plane The conceptual layer of network protocols and traffic that involve the actual user
traffic. Forwarding decisions are followed in the data plane.
declarative model A style of programming, network engineering, and more broadly, IT man-
agement that expresses the logic and desired state of a device or network, instead of describing
the control flow.
DevOps A practice that combines development and operations teams and deployment
principles, such as CI/CD and Agile software development, with an idea that
the combination of teams leads to greater empathy between individuals, leading to higher
uptime and support of the end application or service.
dial-in A model-driven telemetry model where the subscribing telemetry receiver initiates a
telemetry session with the telemetry source, which streams the telemetry data back.
dial-out A model-driven telemetry mode where the telemetry source configures a destination,
sensor path, and subscription defining the metrics to be streamed to the receiving telemetry
collector. The telemetry source initiates the session.
digital certificate Also known as a certificate; a file that verifies the identity of a device or
user and enables data exchange over encrypted connections.
E
event-driven telemetry (EDT) A mode of telemetry that initiates the sending of metrics on-
change rather than by periodic cadence.
Extensible Markup Language (XML) A data-encoding method and markup language defin-
ing rules in a form that is humanly readable but also programmatic. There are synergies among
HTML and XML for creating stylesheets that are dynamic to different device capabilities for
representing content. Generally, JSON is preferred in more recent cloud and infrastructure
development.
F
format The way that data is represented, such as JSON and XML.
functional requirements The conditions that specify the business purpose of the function-
ality of software or an application.
G
General Data Protection Regulation (GDPR) A regulation that gives European Union (EU)
citizens control over their own personal data.
H
hypervisor Software that creates and executes virtual machines (VMs). A hypervisor enables
a host system to support multiple guest VMs by virtually sharing and allocating its resources,
such as CPU, memory, and storage.
I
imperative model A style of programming, network engineering, and more broadly, IT
management that uses exact steps or control statements to define a desired state. This model
requires operating system syntax knowledge for operation and provisioning.
Infrastructure as Code (IaC) The act of defining one or more pieces of infrastructure
(network, compute, storage, platform, and so on) through configuration files that are deployed
using programming languages or higher-level configuration management platforms (such as
Terraform).
injection attack A common type of attack discussed by OWASP. When user data is not
properly validated, injection of malicious input or extraction of sensitive records is possible.
J
JavaScript Object Notation (JSON) A data-encoding method that is easy for humans to
read and is conducive to programmatic use.
K
Kubernetes An open-source container management and operations platform, originally cre-
ated by Google. Kubernetes provides the foundational infrastructure and APIs such that appli-
cations and supporting services (such as clustering and distributed file storage) are able to run
across one or more hosts.
L
latency The length of time taken for a system to complete a specified task.
Linux Containers (LXC) A type of virtualization that was realized mid-2008. LXC is oper-
ating-system-based where all container instances share the same kernel of the hosting compute
node. The guest operating systems may execute in a different user space. This can be mani-
fested as different Linux distributions with the same kernel.
load balancing Sometimes generically used as server load balancing; a technique for distrib-
uting load among a number of servers or virtual machines for the purpose of scalability, avail-
ability, and security.
M
management plane The conceptual layer of network protocols and traffic that involve the
administrative functions of a network device, such as Network Time Protocol (NTP), syslog
event messaging, SNMP, and NETCONF.
mean time between failures (MTBF) The average time a system operates between failures;
it reflects how likely it is for a system to fail and what events can contribute to the failure.
mean time to repair or mean time to recovery (MTTR) How much time is needed for the
system to recover from failure or for the issue causing the failure to be repaired.
method The intent of an API call; often referred to as a “verb.” It describes the operations
available, such as GET, POST, PUT, PATCH, and DELETE.
metrics System performance parameters. Latency, response time, sessions per second, and
transactions per second are examples of metrics.
model-driven telemetry (MDT) A function that uses data models, such as YANG, to repre-
sent configuration and operational state. Associated with streaming telemetry, MDT provides a
structure for defining telemetry receivers, sensor paths (metrics), and subscriptions necessary to
encode and transport the data.
multithreading Dividing tasks or requests into threads that can be processed in parallel.
N
NETCONF The Network Configuration Protocol (NETCONF) is a network management
protocol developed and standardized by the IETF as RFC 4741, later revised as RFC 6241. It
enables functionality to provision, change, and delete the configuration of network devices
through remote procedure calls of XML-encoded data.
Network Configuration Protocol (NETCONF) An IETF working group standard and proto-
col. It allows cross-vendor management focused on configuration and state data.
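A minimal sketch of a NETCONF exchange using the widely used ncclient Python library; the device address and credentials are placeholders, and NETCONF must already be enabled on the device.

```python
from ncclient import manager

# Connect to a NETCONF-enabled device over SSH (default agent port is 830).
with manager.connect(
    host="10.0.0.1",            # placeholder device address
    port=830,
    username="admin",
    password="admin",
    hostkey_verify=False,       # lab use only
) as m:
    # Capabilities advertised by the server in the <hello> exchange.
    for cap in list(m.server_capabilities)[:5]:
        print(cap)

    # <get-config> retrieves all or part of the running datastore as XML.
    reply = m.get_config(source="running")
    print(reply.xml[:500])
```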
nonfunctional requirements The conditions that describe how a system should perform the
functions described in the functional requirements.
O
object The resource a user is trying to access. It is often referred to as a “noun” and is typi-
cally a Uniform Resource Identifier (URI).
observability The ability to measure the state of a system based on the output or data it gen-
erates (i.e., logs, metrics, and traces).
Open Authorization (OAuth) An open standard defined by IETF RFC 6749. Two versions
are in use today: OAuth 1.0 and OAuth 2.0. OAuth 2.0 is not backward compatible with OAuth 1.0
(RFC 5849). OAuth is designed with HTTP in mind and allows users to log in to multiple sites or
applications with one account.
OpenAPI Specification (OAS) Formerly known as the Swagger Specification, this is a pow-
erful format for describing RESTful APIs. A standard, programming language–agnostic inter-
face description for HTTP APIs.
P
pagination The process of splitting data sets into discrete pages with a set of paginated
endpoints. Therefore, an API call to a paginated endpoint is called a paginated request. API end-
points paginate their responses to make the result set easier to handle.
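A generic sketch of consuming a paginated endpoint from Python; the page and per_page parameter names and the empty-page stop condition are illustrative, because each API defines its own pagination scheme.

```python
import requests

def fetch_all(url, page_size=100, **kwargs):
    """Collect every item from a paginated endpoint, one page at a time."""
    items, page = [], 1
    while True:
        resp = requests.get(
            url, params={"page": page, "per_page": page_size}, **kwargs
        )
        resp.raise_for_status()
        batch = resp.json()
        if not batch:            # an empty page means everything has been read
            break
        items.extend(batch)
        page += 1
    return items

# Example: devices = fetch_all("https://api.example.com/v1/devices")
```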
personally identifiable information (PII) Any information that can be used to identify a
person—name, password, Social Security number, driver’s license number, credit card number,
address, and so on.
plan Specific to Terraform, a command process that creates an execution plan allowing a
designer to review changes Terraform would make to an environment.
playbook A configuration file used by Ansible, written in YAML. It defines the tasks and
actions to be performed in provisioning or management functions.
public key infrastructure (PKI) A framework built on asymmetric cryptography that requires
the generation of two keys. One key is secure and known only to its owner; it is called the pri-
vate key. The other key, called the public key, is available and known to anyone or anything that
wishes to communicate with the private key owner.
R
rate limiting Limiting requests or controlling the rate at which they are passed to the
processor.
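On the client side, one simple way to stay under a server's limit is to pace your own requests. The sketch below caps calls at a fixed number per second; the values are illustrative, and production code would also honor HTTP 429 responses and Retry-After headers.

```python
import time

class SimpleRateLimiter:
    """Allow at most `rate_per_second` calls per second by sleeping between calls."""

    def __init__(self, rate_per_second):
        self.min_interval = 1.0 / rate_per_second
        self.last_call = 0.0

    def wait(self):
        elapsed = time.monotonic() - self.last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.monotonic()

limiter = SimpleRateLimiter(rate_per_second=5)   # at most 5 requests per second
for i in range(10):
    limiter.wait()
    print(f"request {i}")                        # replace with an API call
```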
REST (Representational state transfer) A software architectural style that conforms to con-
straints for interacting with APIs. RESTful APIs are commonly used to GET or POST/PUT data
with a device for obtaining state or changing configuration.
round-trip time (RTT) The time taken for round-trip travel between two network nodes or
the length of time taken to complete a set of tasks.
S
sensor-path The unique path of a YANG model and the hierarchy/structure required to iden-
tify a configuration item or metric.
serverless The abstraction of the underlying infrastructure from the application or service
being run on the infrastructure. This abstraction enables operations personnel to focus on the
outcome of the application being served, rather than the supporting operating system, patching,
and system and software dependencies, which are placed under the responsibility of the cloud
provider.
site reliability engineering (SRE) A functional concept catalyzed by Google in which net-
work operations personnel maintain software development skills with the intent of developing
network monitoring and management solutions necessary to sustain IT operations.
software architecture The set of structures needed to reason about the system, which com-
prise software elements, relations among them, and properties of both.
software development lifecycle (SDLC) A process or framework for designing and imple-
menting software.
SOLID Object-oriented software design principles, which include the single responsibility
principle (SRP), open-closed principle (OCP), Liskov’s substitution principle (LSP), interface
segregation principle (ISP), and dependency inversion principle (DIP).
source code manager A specific platform implementation of a VCS protocol that enables
tracking, management, and collaboration of source code. Common git-based SCMs include
GitHub, BitBucket, and GitLab.
spec file Specific to Ansible, a YAML-based file that defines the XML/XPath syntax used to
map XML-structured data to JSON for use in playbooks.
subscription A desired session between telemetry source and receiver defining the desired
sensor(s)/metric(s), encoding, destination, and frequency.
SwaggerHub A part of the Swagger toolset; it includes a mix of open-source, free, and com-
mercial tools. SwaggerHub is an integrated API development platform, which enables the core
capabilities of the Swagger framework to design, build, document, and deploy APIs. Swagger-
Hub enables teams to collaborate and coordinate the lifecycle of an API. It can work with ver-
sion control systems such as GitHub, GitLab, and Bitbucket.
T
technical debt A term used to describe short-term decisions that can possibly affect the
quality of the software on the long run.
throughput The amount of load (utilization) a system is capable of handling during a time
period.
tracing The ability to track multiple events or a series of distributed events through a system.
Transport Layer Security (TLS) TLS is a successor to Secure Sockets Layer protocol. It
provides secure communications on the Internet for such things as e-mail, Internet faxing, and
other data transfers. Common cryptographic protocols are used to imbue web communications
with integrity, security, and resilience against unauthorized tampering.
Type-1 hypervisor Architecture that typically involves the hypervisor kernel acting as a
shim-layer between the underlying hardware serving compute, memory, network, and storage,
from the overlying operating systems. Sample solutions are Microsoft Hyper-V, Xen, and
VMware ESXi.
Type-2 hypervisor Architecture that involves running the hypervisor over the top of a con-
ventional, hosted operating system (OS). Other applications, besides the hypervisor, may also
run on the hosted OS as other programs or processes. One or more guest operating systems run
over the hypervisor.
V
version control A process for tracking and managing changes of code or files during the
development process.
version control system (VCS) A specific protocol-based system that allows for source code
to be tracked, checked in, and worked on in a collaborative manner. A VCS specifically refers to
the higher-level protocol, such as git or subversion, rather than a specific platform implementa-
tion of the protocol.
W
web scraping Data scraping used for extracting data from websites. Also known as web har-
vesting or web data extraction.
Y
YANG A data modeling language for the definition of data sent from NETCONF and REST-
CONF. It was released by the NETMOD working group of the IETF as RFC 6020 and later
updated in RFC 7950. YANG can model configuration data or network element state data.
Z
zero-touch provisioning (ZTP) A network function that enables an unconfigured device to
bring itself to a defined level of functionality on a network through configurations provided via
file servers.
device code flow (OAuth 2.0), 281–283
DevOps, 290
  in evolution of network management and software development, 8
  key practices in, 8
  responsibilities of, 194–196
  vs. SRE, 198
dial-in mode (streaming telemetry), 392
  configuring, 402–406
  definition of, 394
  dial-out vs., 395
dial-out mode (streaming telemetry), 392
  configuring, 398–402
  definition of, 394
  dial-in vs., 395
digital certificates. See certificates
DIP (dependency inversion principle), 65–66
disaster recovery, 47
disaster recovery planning (DRP), 50–51
disk space usage, EDT vs. MDT, 440–441
disposability in 12-factor application design, 241
distributed tracing, 77
DNA Center, 628–639
  API documentation, 631–635
  application hosting with, 538–547
  enabling access, 630–631
  purpose of, 628–629
  SDK authorization, 637–639
  SDK documentation, 635–637
Docker
  containers, 530–531. See also application containers
  installing, 414–415
  YANG Suite installation, 415–423
documentation
  for application performance, 78–79
  DNA Center APIs, 631–635
  DNA Center SDKs, 635–637
  Firepower, 583–585
  Intersight APIs, 603–605
  Intersight SDKs, 605
  for maintainability, 59
  Meraki APIs, 593–594
  Meraki SDKs, 594–596
  researching sensor paths, 407
  UCS Manager APIs, 611–617
  UCS Manager PowerShell SDKs, 622–628
  UCS Manager Python SDKs, 617–622
  Webex, 573–575
DOM-based XSS, 266
downloading
  Ansible, 474–481
  Pearson Test Prep software, 649–650
  Puppet, 453–458
  YANG models, 369–371
DRP (disaster recovery planning), 50–51
durability as ACID property, 148
E
eager loading, 70–71
ECS (Elastic Container Service), 227–234
edge computing
  application containers
    Cisco DNA Center for application hosting, 538–547
    Cisco IOx Local Manager for application hosting, 547–552
microservices
  definition of, 14
  modular design and, 40–41
mobile application security, 262–266
model-driven configuration management, atomic vs., 351–354
model-driven telemetry (MDT). See MDT (model-driven telemetry)
model-view-controller (MVC), definition of, 15
modifiability as quality attribute, 30, 33
modularity in application design, 36–41
  benefits of, 36–37
  best practices, 37–40
  definition of, 36
  maintainability and, 59
  microservices and, 40–41
  scalability and, 43–44
monitoring. See also logging; streaming telemetry
  application containers, 564
  for application performance, 73–79
    documentation, 78–79
    logging, 74–76
    metrics, 76–77
    tracing, 77–78
  with Embedded Event Manager, 299–300
  evolution from SNMP to streaming telemetry, 386–391
  for fault detection, 46
MTBF (mean time between failures), 45
MTTR (mean time to repair), 45, 47
multiprocessing, 72
multithreading, 72
MVC (model-view-controller), definition of, 15
N
naming conventions for maintainability, 59
native models, 366
NETCONF, 334. See also RESTCONF
  APIs, 158–159
  definition of, 322
  implementation, 354–364
    on IOS XE, 355–356
    on IOS XR, 356–357
    manual usage, 358–364
    on NX-OS, 357–358
  layers in, 349–351
    content, 349
    messages, 350–351
    operations, 350
    transport, 351
  management solutions with, 382–383
  mapping to RESTCONF operations, 372–373
  in MDT, 396
    extracting capabilities with Python, 410–413
    manually extracting capabilities, 408–410
  origin of, 348–349
  YANG models and, 365–371
network APIs. See APIs (application programming interfaces)
network controllers, 334
network management
  atomic vs. controller-based networking, 303–305
  evolution of, 5–8
  improvements in, 288–290
  intent-based networking, 305–306
Y
YANG models, 334
  data types in, 365–366
  downloading, 369–371
Z
ZTD (zero-touch deployment), 449
ZTP (zero-touch provisioning), 300–303
Memory Tables
Chapter 1
Table 1-2 Simple Comparison Between Functional and Nonfunctional Requirements
Functional                                 Nonfunctional
____________________                       System quality attribute specific
Mandatory                                  ____________________
____________________                       Performance or quality specific
User requirements                          ____________________
____________________                       Test for performance, security, etc.
Describe as can or shall                   ____________________
Chapter 2
Table 2-4 Modular Design Best Practices
Address consistency
Chapter 3
Table 3-2 Performance Parameters from Networking and Software Perspectives
Table 3-4 Syslog Message Severities as Defined by the IETF RFC 5424
Chapter 9
Table 9-2 Logical Plane Models
Chapter 10
Table 10-2 Operational Lifecycle
Lean
Scrum
Waterfall
OpenStack
Puppet
Ansible
Chapter 11
Table 11-2 NETCONF Protocol Operations
Operation Description
Retrieves running configuration and device state information
Retrieves all or part of a specified configuration
Loads all or part of a specified configuration to the specified target
configuration
Creates or replaces an entire configuration datastore with the contents
of another complete configuration datastore
Deletes a configuration datastore
Locks an entire configuration datastore of a device
Releases a configuration lock previously obtained with the <lock>
operation
Requests graceful termination of a NETCONF session
Forces the termination of a NETCONF session
Component Explanation
The default, secure HTTP transport, as specified by
RFC 8040.
The DNS name or IP address for the RESTCONF
agent; also provide the port (such as 8443) if using a
nonstandard port other than 443.
The main branch for RESTCONF requests. The
de facto convention is restconf, but it should be
verified to ensure proper operation.
The RESTCONF API resource type for data. An
operations resource type accesses RPC operations.
The base model container being used; inclusion of
the module name is optional.
An individual element from within the container.
Query parameters that modify or filter returned
results; see Table 11-7.
Chapter 12
Table 12-2 SNMP and Streaming Telemetry Comparison
Chapter 13
Table 13-2 Puppet Platform Support Matrix
Memory Tables Answer Key
Chapter 1
Table 1-2 Simple Comparison Between Functional and Nonfunctional Requirements
Functional                                 Nonfunctional
Use case or business process specific      System quality attribute specific
Mandatory                                  Not mandatory/affected by trade-offs
Functionality specific                     Performance or quality specific
User requirements                          User experience
Test for functionality                     Test for performance, security, etc.
Describe as can or shall                   Describe as must or should
Chapter 2
Table 2-4 Modular Design Best Practices
Chapter 3
Table 3-2 Performance Parameters from Networking and Software Perspectives
Table 3-4 Syslog Message Severities as Defined by the IETF RFC 5424
Severity Keyword Level Description Syslog Definition
emergency 0 System unusable LOG_EMERG
alert 1 Immediate action needed LOG_ALERT
critical 2 Critical conditions LOG_CRIT
error 3 Error conditions LOG_ERR
warning 4 Warning conditions LOG_WARNING
notification 5 Normal but significant condition LOG_NOTICE
informational 6 Informational messages only LOG_INFO
debugging 7 Debugging messages LOG_DEBUG
Chapter 9
Table 9-2 Logical Plane Models
Chapter 10
Table 10-2 Operational Lifecycle
Chapter 11
Table 11-2 NETCONF Protocol Operations
Operation Description
get Retrieves running configuration and device state information
get-config Retrieves all or part of a specified configuration
edit-config Loads all or part of a specified configuration to the specified target
configuration
copy-config Creates or replaces an entire configuration datastore with the contents
of another complete configuration datastore
delete-config Deletes a configuration datastore
lock Locks an entire configuration datastore of a device
unlock Releases a configuration lock previously obtained with the <lock>
operation
close-session Requests graceful termination of a NETCONF session
kill-session Forces the termination of a NETCONF session
Component Explanation
https:// The default, secure HTTP transport, as specified by
RFC 8040.
DeviceNameOrIP The DNS name or IP address for the RESTCONF
agent; also provide the port (such as 8443) if using a
nonstandard port other than 443.
<ROOT> The main branch for RESTCONF requests. The
de facto convention is restconf, but it should be
verified to ensure proper operation.
data The RESTCONF API resource type for data. An
operations resource type accesses RPC operations.
[YANG MODULE:]CONTAINER The base model container being used; inclusion of
the module name is optional.
LEAF An individual element from within the container.
[?<OPTIONS>] Query parameters that modify or filter returned
results; see Table 11-7.
Chapter 12
Table 12-2 SNMP and Streaming Telemetry Comparison
Chapter 13
Table 13-2 Puppet Platform Support Matrix
Dashboard Basics
Dashboards can serve a useful function in an operations center. Often, operations teams use
traditional, commercial management tools and their web-based portals in a kiosk mode as
a quick analog to a dashboard. Sometimes their metrics do not map to the priority metrics
needed to run the business or glean new insights into its operation. By embracing DevOps
principles, sometimes complemented with some open-source projects, you can create dash-
boards that reflect your specific purpose.
You are encouraged to use commercially available solutions where they are appropriate.
There is little reason to reinvent the wheel to obtain common metrics such as CPU, memory,
interface, error rates, and client MAC addresses. However, there may be other parts of your
IT environment and the instrumentation within that could be appealing for your project
intentions. As an example and a partial mea culpa, I use a situation from a CiscoLive event
many years ago.
Many large events at conference venues use IT service providers in concert with the venue’s
own IT staff to design, provision, and monitor the event’s network. This is especially true
when the event host is not in the technology industry. After a few years of using these
providers, we at Cisco had to ask ourselves, “Can’t we design, install, and operate the event
network ourselves? It’s often our gear. We have access to all the top talent in the industry at
the event.” So, we set off to build our own NOC team for CiscoLive events. True to form,
as it is for most of our customers, taking on this task was a learning experience in what was
important to monitor and what was “nice to have.” One year we learned that monitoring
the Internet routing table size was important even though it was not a common metric in
the commercially available tools at the time—even those from Cisco. Additionally, around
2016, CiscoLive became the first technology event of scale to bring dual 100 Gbps links
to the venue for attendee use. We were keenly interested in seeing how optical power levels
contributed to operational performance. So, we decided to collect those metrics and create
dashboards, thresholds, and alerts that would let us know if there were impactful changes
upstream from our event network.
Newer terminology like observability resonates with the network management and operations
team. Getting data into dashboards is a multistep consideration that involves these actions:
■ Metric identification
■ Metric extraction
■ Data normalization
■ Data storage/extraction
■ Data rendering
Metric identification involves determining which data points are important to track and seek-
ing to find how they are instrumented or if they are even available.
Metric extraction uses that insight to develop a plan, method, script, or program to obtain
that information. The metrics might be available as SNMP MIB objects. Or they could be
available through API calls, log extraction, or the execution of CLI/shell commands.
Data normalization involves taking the raw data, possibly with other metrics to create new
data expressions or values. At this stage, before storing or using the information, standard-
izing the information is important. If you see data that can be in degrees Fahrenheit, you
might want to convert, or normalize, to degrees Celsius so that all measurements are con-
sistent across all collection types. Data that is in percentages might need to be consistently
applied as decimal values less than or equal to 100 and rounded to two decimals (depending
on what is warranted and/or desirable). Otherwise, a common convention may be less than
or equal to 1.00.
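As a small illustration of those normalization choices, the following sketch converts temperatures to Celsius and expresses percentages as values between 0 and 1.00 before storage; the field names and sample values are made up.

```python
def fahrenheit_to_celsius(value_f):
    # Normalize all temperature collections to a single unit (Celsius).
    return round((value_f - 32) * 5 / 9, 2)

def normalize_percent(value):
    # Accept either 0-100 or 0-1.00 inputs and store everything as 0-1.00.
    return round(value / 100, 2) if value > 1 else round(value, 2)

raw_metrics = {"inlet_temp_f": 98.6, "cpu_util": 42.0, "mem_util": 0.37}

normalized = {
    "inlet_temp_c": fahrenheit_to_celsius(raw_metrics["inlet_temp_f"]),
    "cpu_util": normalize_percent(raw_metrics["cpu_util"]),
    "mem_util": normalize_percent(raw_metrics["mem_util"]),
}
print(normalized)   # {'inlet_temp_c': 37.0, 'cpu_util': 0.42, 'mem_util': 0.37}
```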
Data storage and extraction depend on the data collected. For many years, time-series data
was purposely jammed into relational databases. However, it became clear over time that
relational databases were suboptimal for date-stamped information. The time-series database
concept was developed, and now more optimal solutions are available, such as InfluxDB,
Prometheus, and the open-source Apache Pinot.
Data rendering involves the actual dashboarding and visualization that most people associate
with this work. Solutions in this space are Telegraf and Grafana. These solutions provide common
visualizations like panels, gauges, graphs, pie charts, and line charts. Other valuable visualiza-
tions can be heat maps, sparkline graphs, and bubble plots.
In any case, using these open-source solutions provides you more opportunity to focus on
the data instrumentation and telemetry that’s desired and worry less about how to store it
and develop visualizations.
Consider, again, the previous description of the CiscoLive NOC scenario using routing table
and optical power-level metrics. Initially, we automated the execution of show ip route
summary and show hw-module subslot <#/#> transceiver status commands to obtain the
information. Later, we tried SNMP MIB objects; then eventually streaming telemetry became
the optimal method. In any of these cases, we took the metric as a snapshot in time and pro-
grammatically injected it into a time-series database like InfluxDB.
InfluxDB offers a line protocol method to take in data as optionally tagged values. For
more detail, see the guide at https://docs.influxdata.com/influxdb/v1.8/write_protocols/
line_protocol_tutorial/.
Our main activity was to take that data and send it from our Python scripts as REST API calls
to the InfluxDB server, similar to the following.
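The original listing is not reproduced here, so the following is a representative sketch under stated assumptions: an InfluxDB 1.8 server reachable over HTTP, a database named ciscolive, and made-up measurement, tag, and field names, written in line protocol with the requests library.

```python
import requests

INFLUX_URL = "http://influxdb.example.com:8086/write"   # assumed 1.8-style endpoint
DATABASE = "ciscolive"                                  # hypothetical database name

# Line protocol: measurement,tag_set field_set [timestamp]
# Omitting the timestamp lets InfluxDB stamp the point with the current time.
points = (
    "routing_table,device=edge-rtr-1 ipv4_routes=842341i\n"
    "optics,device=edge-rtr-1,interface=HundredGigE0/0/0/0 rx_power_dbm=-2.4"
)

resp = requests.post(INFLUX_URL, params={"db": DATABASE}, data=points)
resp.raise_for_status()     # InfluxDB returns 204 No Content on success
print(resp.status_code)
```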
Because we did not specify a timestamp, InfluxDB would assume the current time. If we
were collecting and normalizing the data quickly, this was a safe assumption.
We could then use Grafana with the time-series graphs to depict the routing table size over
snapshots every 10 minutes. We were especially interested in drops of the table count con-
secutively over time because they may indicate an upstream removal of our advertised net-
work. Figure E-1 shows an optical power-level graph (from an event years ago).
So, in the final analysis of dashboarding, you must ask yourself: What metrics are important
to my business? How can I obtain the data? How do I want to depict the data?
We encourage you to think outside the box. Use metrics and automation in ways to differen-
tiate your product or service to the benefit of your customers. Take chances. Be bold. Fail on
occasion. But recover and move on.
To register your book and access the companion website content, follow these steps:
1. Go to www.ciscopress.com/register.
2. Enter the print book ISBN: 9780137370443.
3. Answer the security question to validate your purchase.
4. Go to your account page.
5. Click on the Registered Products tab.
6. Under the book listing, click on the Access Bonus Content link.
If you have any issues accessing the companion website, you can contact
our support team by going to pearsonitp.echelp.org.
Where are the companion content files?
Thank you for purchasing this Premium Edition version of Cisco Certified DevNet Professional
DEVCOR 350-901 Official Cert Guide.
This product comes with companion content. You have access to these files by following the
steps below:
1. Go to ciscopress.com/account and log in.
2. Click on the “Access Bonus Content” link in the Registered Products section of your account
page for this product, to be taken to the page where your downloadable content is available.
Please note that many of our companion content files can be very large, especially image and
video files.
If you are unable to locate the files for this title by following the steps above, please visit
ciscopress.com/support and select the chat, phone, or web ticket options to get help from a
tech support representative.