CSAD Lesson 6 On-Prem Platform Monitoring - 631

In this lesson, you’ll learn how to monitor and track the health, performance, and
reliability of the platform, and how to troubleshoot common issues to keep the
platform running properly.

1. The Tomcat web server handles communication between analysts and the platform through the web interface, as well as communication between platform components.
Location: /securonix/tomcat/logs/catalina.out (a sketch for following this log is shown after the list)

2. The internal application audit reviews activities in SNYPR to track activity for compliance with privacy regulations and to ensure the integrity of the platform.
Access the audit reports via Menu > Reports > Auditing.
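For quick troubleshooting, you can follow the Tomcat log in real time; a minimal sketch, assuming shell access to the application server:

tail -f /securonix/tomcat/logs/catalina.out
grep -i exception /securonix/tomcat/logs/catalina.out | tail -n 20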

The Hadoop component logs are used to monitor the health and activity of each Hadoop component.

1. Location: Cloudera Manager

• These logs are stored on the server where the component is hosted. Each component creates its own subdirectory to host its log files.

o Subdirectory location: /var/log/<component> (a CLI sketch follows this section)

• You can also view the Hadoop datastore connection parameters and configurations in the hibernate.cfg.xml file.

2. Spark job logs: There are two ways to view these log files:

• Cloudera Manager

• CLI

Each Spark job has its own log file located in the same directory as the Spark job executables:

• $SECURONIX_HOME/Spark_Jobs/sparkjobs

• You can fine-tune Spark job properties in snypr_apps_properties
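A minimal sketch of checking these logs from the CLI, assuming shell access to the relevant node; the component directory name (hbase) and log file name (enrichment.log) below are assumptions for illustration:

ls /var/log/hbase
tail -f $SECURONIX_HOME/Spark_Jobs/sparkjobs/enrichment.log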

1. As you learned earlier, Cloudera Manager is the primary console from
which you will monitor and manage your Hadoop components. From
the home screen, you can:

• Review a status summary of your cluster

• See the health issues

• Click a component to manage it

SNYPR-EYE is a centralized application used to monitor
your SNYPR deployment including:
• System Components: CPU, memory, and disk

• SNYPR Services: Tomcat, Apache, MySQL, NTP, and Syslog-ng

• Hadoop Services: HDFS, Kafka, HBase, Spark, and Zookeeper

• SNYPR Applications: Ingestion, enrichment, behavior profiling, and risk scoring

• Data Analytics: Violation trends

SNYPR-EYE includes application servers and a configuration database for all tasks related to deploying, monitoring, and managing SNYPR.
It supports both single-tenant and multi-tenant environments.

The SNYPR-EYE dashboard displays detailed information about
the following:
• SNYPR Application Statistics
• Hadoop Services

• Application Alerts
• Tenant Status and Environment
This diagram provides an overview of the data flow in SNYPR to show how data is
processed by various Spark applications.

1. Understanding the flow of data in SNYPR from end to end helps to identify and
troubleshoot problems along the pipeline.

2. Data flows through the following components:

• Console, and the RIN if you are using remote ingestion

• Kafka

• Spark

• Solr, and

• HDFS

3. We saw how to monitor the individual Hadoop components in Cloudera Manager earlier in the lesson, and how to administer and tune each Spark job earlier in the training. This is a good basis for understanding how to keep those components running at optimal performance as we look at some basic monitoring and troubleshooting.

The Job Monitor screen displays a list of the latest jobs with additional details
including the following:

• Job Details
• Schedule Details
• Today's Run Statistics
• Published Events History

By default, jobs are displayed by data import type.

Start basic monitoring and troubleshooting of the platform from the Job Monitor. The Job Monitor displays the results of your import jobs at each stage and across each of the components:

• Published: This stage identifies the number of messages that have been published to the Kafka Raw Topic by the Remote Ingester (RIN) or the Console.

• Parsed: In this stage, logs waiting in the Kafka Raw Topic are picked up by the Enrichment Spark job to be normalized, enriched, and stored temporarily in the Kafka Enriched Topic.

• Unparsed: In this stage, logs that are not parsed by the Enrichment Spark
job are stored in HDFS under "/user/securonix/snypr/unparsed" for further
investigation.

• Indexed: In this stage, all the logs waiting in the Kafka Enriched Topic are picked up by the Indexer Spark job, which processes and indexes the data to make it available for searching through Spotter.

• Stored: In this stage, the Ingestion Spark job reads logs from the Kafka
Enriched Topic and stores the data in HDFS under the directory
"/user/securonix/snypr/resources-enriched-incoming" in Avro format.

You can use this screen to determine at which stage jobs are failing, or whether they are failing to run at all. This helps you pinpoint which component or Spark job is experiencing the problem; you can also inspect the HDFS directories named above directly, as shown below.
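A minimal sketch of listing those directories, assuming shell access to a node with an HDFS client:

hdfs dfs -ls /user/securonix/snypr/unparsed
hdfs dfs -ls /user/securonix/snypr/resources-enriched-incoming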

Click a data import type to filter the list. Click X to remove a filter.

Click the Filter icon to select whether to view a jobs list or view Jobs by Datasource.

Check the RIN to ensure the data it receives is being forwarded to Kafka.

• Verify the ingestion node is receiving the expected feed to its interface
using TCPDUMP:
tcpdump -nni 'designated interface' port 'syslog-ng port' -penA

• -penA: -p disables promiscuous mode, -e prints the link-level header, -n disables name resolution, and -A prints each packet's payload in ASCII so the data is human-readable

• Capture any packets and match keywords related to your data source (ex.
Palo Alto Firewall):
tcpdump -penA -i 'designated interface' port 514 | grep "TRAFFIC"

• Capture any packets where the source host is the IP Address of the sender
(ex. RSA forwarder):
tcpdump -penA -i 'designated interface' src 'ip address of sender'
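For example, to capture Palo Alto traffic from one specific sender on the default syslog port (the interface name eth0 and the sender address 10.0.0.5 are assumptions; substitute your own values):

tcpdump -penA -i eth0 src 10.0.0.5 and port 514 | grep "TRAFFIC"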

Check the Kafka topic and offsets to determine whether messages are being stored in the topic, and verify that logs captured by syslog-ng are arriving in the Kafka Raw Topic. Run the Kafka tools commands on a node that has the Kafka broker installed:

• List existing Kafka topics: kafka-topics --zookeeper localhost:2181 --list

• Describe a topic: kafka-topics --zookeeper localhost:2181 --describe --topic mytopic

• Purge a topic: kafka-topics --zookeeper localhost:2181 --alter --topic mytopic --config retention.ms=1000

... wait a minute ...

kafka-topics --zookeeper localhost:2181 --alter --topic mytopic --delete-config retention.ms

• Delete a topic: kafka-topics --zookeeper localhost:2181 --delete --topic mytopic
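For example, to confirm the raw topic exists and see its partition and replica layout (the topic name securonix_raw_topic is an assumption; use the raw topic name from your deployment):

kafka-topics --zookeeper localhost:2181 --describe --topic securonix_raw_topic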

• Get the number of messages in a topic: kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic mytopic --time -1 --offsets 1 | awk -F ":" '{sum += $3} END {print sum}'

• Get the earliest offset still in the topic: kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic mytopic --time -2

• Get the latest offset in the topic: kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic mytopic --time -1

• Consume messages with the console consumer: kafka-console-consumer --new-consumer --bootstrap-server localhost:9092 --topic mytopic --from-beginning

• View the details of a consumer group: kafka-run-class kafka.admin.ConsumerGroupCommand --bootstrap-server localhost:9092 --group mygroup --new-consumer --describe

• List the Kafka consumer groups known to Kafka: kafka-consumer-groups --new-consumer --bootstrap-server localhost:9092 --list
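If ingestion is healthy, the latest offset should grow between two runs; a minimal sketch of a before-and-after check (the topic name securonix_raw_topic is an assumption):

kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic securonix_raw_topic --time -1
sleep 60
kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic securonix_raw_topic --time -1

If the reported offsets do not increase, no new messages are reaching the topic from syslog-ng or the RIN.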

1. Check the Spark jobs to identify issues that occur while the jobs are processing
data.

2. Check that the Enrichment job is receiving and processing data from the Kafka Raw Topic using the YARN Resource Manager.
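You can also list the running Spark applications from the CLI before drilling into the UI; a minimal sketch, assuming shell access to a YARN gateway node:

yarn application -list -appStates RUNNING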

Click the Applications tab to view the applications running in the cluster. Click an application ID to view it in the Application Master.
• On this screen you can observe how many events per second are streaming
through the enrichment job.
• The number of active batches reflects the queue of messages waiting in the Kafka messaging bus to be processed by the Spark application, and indicates how busy the Spark job is with messages from the Kafka Raw Topic.
• Fewer active batches are better because they indicate the Spark job is keeping up with processing. If this number is high, you might have an issue.
• When the Enrichment job finishes parsing, normalizing, and enriching the data, it
will push the data to the Kafka Enriched topic.
• You can check the Enriched topic using the Kafka tool command: kafka-console-consumer --zookeeper <your-zookeeper-quorum> --topic <your-securonix-enriched-topic>

Navigate to Cloudera Manager > Yarn > Applications.

• Click Event_Indexer_job, then check the Streaming tab.
• You will see the number of messages that have been received from the Kafka Enriched Topic and are being processed by the Indexer job.

Check Solr to verify data is being indexed and is searchable.

You can verify Solr has received the data by searching Spotter.
Navigate to Spotter in SNYPR. You will see the available data sources on the Summary
screen.
Click the data source to see the results.

Finally, verify data is being stored and is available for querying in HDFS. Use the
Cloudera Manager to check that the Event Ingestion job is running.

• Click the Application Master and go to the Streaming tab.
• From here you can see the status of the job.
• You can use Hue to query HDFS.
• Type a query into the query editor.
• Click Play (the execute button) to see the results.

Query the Security Data Lake using Impala or Hive to ensure data has been
stored.

Data is aggregated by resource group, year, and day of year.
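A minimal sketch of such a check from the CLI, assuming impala-shell is available; the database and table names (securonix.events) are assumptions for illustration, so substitute the names from your deployment:

impala-shell -q "SHOW TABLES IN securonix;"
impala-shell -q "SELECT COUNT(*) FROM securonix.events;"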

This concludes the online Certified SNYPR Administrator course. Take the exam on the
training website to complete the certification.
