
CYT245 Assignment 1.

Learn AlienVault IP Reputation database

Teamwork policy

You are asked to enter student names below:

1. 2.
3. 4.

Team leader is ______________________________________________

Only one submission is expected from each team. The current team leader makes it; however,
this role should rotate from one team assignment to the next. Screen0 (see below)
must be captured on the leader’s computer.

Pre-requisite

You have the Anaconda-Python-Pandas environment ready to go (set up in CYT175).

Review the instructions from Lab5 and Lab6 of CYT175. Make sure that you can start Jupyter
Notebook. If needed, review the demos referred to in the CYT175 Labs 5, 6, and 7 task
descriptions.

Make sure that your local Jupyter host server is up and running.

Step 0. In the Start menu, type Jupyter Notebook and launch it. Take a screenshot of the starting
screen. The screenshot must show who owns the laptop (for example, the user name):

CYT245-Overview of Cyber Threat Intelligence ©2022 Tatiana Outkina


Create a new Jupyter notebook.

Task description

Major sources of information:

• “Data Driven Security” textbook, Chapter 3.
• Python scripts and the data file, provided in the archive book.rar attached to the lab task
description.



Objectives of Assignment 1

• Primary – learn the content of the IP Reputation Database, which is recommended as a
feed for Threat Intelligence practice.
• Secondary – use the code samples to take the next step in learning the Python and Pandas
tools.

Start Workflow

Note: Screenshots are required for each step. Include them in your submission.

Step 1. Extract book.rar and move the folder book into your Anaconda environment.

This makes the code samples and data easily available.

Step 2. Open the Python script file and run the Listing 1 portion in your notebook. Resolve any
error messages that appear. This makes the data sample available for the next
steps.

Step 3. Run Listing 3-3. It sets the relative path to the downloaded data.
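
As a rough illustration of what a relative path means here, the sketch below builds one and shows what it resolves to. The folder name data and the file name reputation.data are assumptions made for this example; use the names that Listing 3-3 actually sets.

```python
from pathlib import Path

# Assumed layout: a "data" folder next to the notebook holding the
# downloaded file. Both names are placeholders for illustration.
data_dir = Path("data")
av_rep = data_dir / "reputation.data"

# A relative path is interpreted from the notebook's working directory.
print("Relative path:", av_rep)
print("Resolves to:  ", av_rep.resolve())
print("File present: ", av_rep.exists())
```

If "File present" prints False, the notebook is probably running from a different folder than the one that contains the data.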

Step 4. Run Listing 3-5. The result shows the first 5 rows of the file.

This code defines the structure of the IP Reputation Database. Run the code and observe the result.
Answer the following questions:

1. What is the Pandas name for the IP Reputation Database CSV file?
2. What are the column names of the Pandas data frame?
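
For orientation only, here is a minimal sketch of loading a delimited text file into a Pandas data frame and looking at its first rows. The rows are made up (documentation IP ranges, not real AlienVault records), the '#' separator is an assumption, and only five illustrative columns are used. The names to report in your answers are the ones Listing 3-5 prints.

```python
import io
import pandas as pd

# Made-up sample in an assumed '#'-separated layout; NOT real reputation data.
raw = io.StringIO(
    "192.0.2.10#4#2#Scanning Host#CN\n"
    "192.0.2.11#6#3#Malicious Host#US\n"
    "198.51.100.7#2#1#Scanning Host#DE\n"
)

# header=None because this sample has no header row of its own.
av = pd.read_csv(raw, sep="#", header=None)

# Assign column names explicitly (illustrative names only).
av.columns = ["IP", "Reliability", "Risk", "Type", "Country"]

print(av.head())         # head() shows the first 5 rows by default
print(list(av.columns))  # the column names of the data frame
```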

Step 5. Run Listing 3-6. You will see HTML-formatted output of the same data frame.

Question:

1. Which lines of Python code make this possible? (Copy and paste them from the code.)
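
The general idea of HTML output can be sketched as follows, using a tiny made-up data frame. This is one possible way to do it and not necessarily the exact code of Listing 3-6, so still copy the answer from the listing itself.

```python
import pandas as pd

# Tiny made-up data frame standing in for the reputation data.
av = pd.DataFrame({
    "IP": ["192.0.2.10", "192.0.2.11"],
    "Risk": [2, 3],
    "Country": ["CN", "US"],
})

# to_html() returns the table as an HTML string.
html = av.head().to_html()
print(html[:60])

# In a Jupyter cell, wrapping the string in IPython.display.HTML(html)
# makes the notebook render it as a formatted table.
```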

Step 6. Run Listing 3-8. You now start exploring the data. This portion of code works with the
quantitative category of data, in other words, data with values that can be
used for calculation. The goal is to generate the basic “descriptive statistics” (see
the definition below) for the variables, which will be used for reporting and visualization.
Run the code and see the results of the calculation.

Answer the following questions:


1. What is the Pandas function to generate descriptive statistics?

From the Pandas documentation:

Descriptive statistics include those that summarize the central tendency, dispersion and
shape of a dataset’s distribution, excluding NaN values.

More details at

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.groupby.DataFrameGroupBy.describe.html?highlight=describe
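
A small sketch of descriptive statistics on made-up numbers is given below. The column names and values are invented for illustration; the second part shows the "excluding NaN values" behaviour quoted above.

```python
import pandas as pd

# Made-up quantitative columns for illustration.
av = pd.DataFrame({"Reliability": [2, 4, 4, 6], "Risk": [1, 2, 2, 3]})

# describe() reports count, mean, std, min, quartiles and max.
stats = av["Reliability"].describe()
print(stats)

# Missing values are left out of the statistics.
with_gap = pd.Series([1.0, None, 3.0]).describe()
print(with_gap["count"])  # counts only the non-missing values
```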

Step 7. Listing 3-10.

You may receive a syntax error when you run this listing. It uses a more complicated data
object definition, because the data belong to the qualitative category. In Pandas such data
should be declared as Categorical, which we are not prepared to do yet. Still, take a
look at the code and the result shown in the book. First you see the
number of malicious nodes counted by Reliability, Risk, Type, and Country separately. The
last outcome shows the number of malicious nodes by Country.
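
If the listing does not run, the idea of counting by a qualitative column can still be tried with the simplified sketch below. It uses made-up values and the standard value_counts() method, which may differ from how the book's listing is written.

```python
import pandas as pd

# Made-up qualitative data for illustration.
av = pd.DataFrame({
    "Type": ["Scanning Host", "Scanning Host", "Malicious Host",
             "Scanning Host", "Spamming"],
    "Country": ["CN", "CN", "US", "CN", "DE"],
})

# Declare the column as categorical, then count rows per category.
av["Type"] = av["Type"].astype("category")
type_counts = av["Type"].value_counts()
print(type_counts)

# Counting works the same way on a plain text column.
country_counts = av["Country"].value_counts()
print(country_counts)
```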

Step 8. Run Listing 3-14. Record counts from the data frame are shown as a graph
named Summary by Country.

Question:

• If a country does not have a valid country code, will its records be included in the
calculation?
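
To investigate the question experimentally, you can try a sketch like the one below on made-up data. It shows how value_counts() treats missing entries by default; whether Listing 3-14 behaves the same way depends on how the listing does its counting, so verify against the actual code.

```python
import pandas as pd

# Made-up country codes; None stands for a missing code.
country = pd.Series(["CN", "US", None, "CN", None, "DE"])

counted = country.value_counts()                   # default: dropna=True
with_missing = country.value_counts(dropna=False)  # keeps the missing group

print("Rows in data:        ", len(country))
print("Rows counted:        ", counted.sum())
print("Rows counted (all):  ", with_missing.sum())
```

In a notebook with matplotlib installed, counted.plot(kind="bar") draws the counts as a bar chart.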

Step 9. Listing 3-15. An error is expected. The result shows the Reliability chart for the top 10
countries (see Figure 3-6).

Step 10. Listing 3-16. An error is expected. The result shows the Risk chart for the top 10
countries (see Figure 3-7).
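
Since these two listings are expected to fail, the underlying tabulation can be explored with a simplified sketch instead. The example below uses made-up data and pd.crosstab, which is one way to count a rating per country; it is not claimed to be the approach used in Listings 3-15 and 3-16, and it keeps the top 2 countries only to stay small.

```python
import pandas as pd

# Made-up records for illustration.
av = pd.DataFrame({
    "Country": ["CN", "CN", "CN", "US", "US", "DE"],
    "Risk":    [2,    2,    3,    1,    2,    1],
})

# Countries with the most records (top 2 here; the lab uses the top 10).
top = av["Country"].value_counts().head(2).index

# Rows are countries, columns are Risk values, cells are record counts.
table = pd.crosstab(av["Country"], av["Risk"]).loc[top]
print(table)
```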

Step 11. Run Listing 3-18. The result shows the data by country as percentages. In this top-ten
list you will notice that, according to this data sample, China and the US account for almost
46% of the malicious nodes in the list.

Question:

• Which line of Python code does this calculation (copy and paste)?
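
The arithmetic behind a per-country percentage can be sketched on made-up data as follows. The numbers are invented and the code is only one way to express the calculation, so the line to copy for your answer must still come from Listing 3-18.

```python
import pandas as pd

# Made-up country codes: 5 x CN, 3 x US, 2 x DE (10 records in total).
country = pd.Series(["CN"] * 5 + ["US"] * 3 + ["DE"] * 2)

counts = country.value_counts()   # records per country
share = counts / len(country)     # fraction of all records
pct = (share * 100).round(1)      # the same value as a percentage

print(pct)
```

Note that value_counts(normalize=True) gives the same fractions only when no country codes are missing, because it divides by the number of non-missing values.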

END of the Lab Workflow



Submission and Rubrics

• This Lab can be completed individually or as team work (up to 4 people). Max
score – 5%

• The submission is an MS Word document uploaded to BB. The name of the
document must follow the Submission Upload Requirements (see below).

Submission includes:

• Steps 1 to 6, then 8 and 11, run in Jupyter, with screenshots present.
• Answers to the questions included in the corresponding steps.
• Full collection of screenshots (8 screenshots in total; some error messages are still
allowed, but the majority should be OK) and correct answers – 5%
• Partially complete screenshots or incorrect answers will result in a deduction
accordingly (at least 6 screenshots and right answers) – 4%
• Fewer than 6 screenshots – 3%

Submission Upload Requirements

Make the online submission to BB; only one submission per team.

If you have more than one document, package them into a ZIP, 7ZIP, or RAR archive.

Name the file you upload as indicated below. The name must include:

• Course ID (CYT245)
• Section (Monday or Friday)
• What it is (e.g. lab1, assignment 1, etc.)
• Authors by name(s)

Sample: CYT175MLab1_PeterJohnMohammadSue

Note: submissions that do not follow the requirements will not be accepted
