Flow Log Parser

Overview

FlowLogParser is a Java application designed to process and analyze network flow logs using concurrent processing. It parses flow log files, categorizes traffic based on ports and protocols, and generates statistical reports.

Assumptions

Protocol Handling

Default Protocol Support:
- Limited to 8 predefined protocols in the default PROTOCOL_MAP
- Any port and protocol combination not in the map is labeled "Untagged"
- Custom protocol mappings can be supplied through the optional protocol_map_file argument
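
The fallback to "Untagged" can be sketched as below. This is a minimal illustration under stated assumptions, not the project's code: the class name TagResolver, the "port,protocol" key format, and the idea that the lookup table is keyed by destination port are all hypothetical. Only the "Untagged" label comes from the notes above.

```java
import java.util.Map;

// Hypothetical helper, not taken from the FlowLogParser source.
final class TagResolver {
    static final String UNTAGGED = "Untagged";

    // Protocol number to name, e.g. 6 -> "tcp" (IANA protocol numbers).
    private final Map<Integer, String> protocolNames;
    // Assumed key format "port,protocol", e.g. "443,tcp".
    private final Map<String, String> tagsByPortAndProtocol;

    TagResolver(Map<Integer, String> protocolNames,
                Map<String, String> tagsByPortAndProtocol) {
        this.protocolNames = protocolNames;
        this.tagsByPortAndProtocol = tagsByPortAndProtocol;
    }

    // Returns the configured tag, or "Untagged" when either the protocol
    // number or the port/protocol pair is unknown.
    String tagFor(int dstPort, int protocolNumber) {
        String protocol = protocolNames.get(protocolNumber);
        if (protocol == null) {
            return UNTAGGED;
        }
        return tagsByPortAndProtocol.getOrDefault(dstPort + "," + protocol, UNTAGGED);
    }
}
```

With a protocol map of {6: "tcp"} and a single lookup entry "443,tcp" mapped to a tag, tagFor(443, 6) returns that tag, while any other port or protocol number returns "Untagged".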

Flow Log Format

Flow Log Version:
- Assumes the VPC Flow Logs version 2 format
- The flow log file has no header row
- Fields are space-separated
- Each line must contain at least 8 fields; lines with fewer are skipped

Field Requirements:
- protocol (field 8): must be a numeric protocol identifier
- A line with an invalid or malformed required field is skipped
- Other fields are not validated because the analysis does not use them
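
The line-level rules above can be sketched as follows. This is an illustration under stated assumptions rather than the actual parser: the class FlowLogLine is hypothetical, and it assumes the tool reads the destination port from field 7, which is where dstport sits in the default version 2 layout.

```java
import java.util.Optional;

// Hypothetical value class, not taken from the FlowLogParser source.
final class FlowLogLine {
    final int dstPort;
    final int protocol;

    private FlowLogLine(int dstPort, int protocol) {
        this.dstPort = dstPort;
        this.protocol = protocol;
    }

    // Returns an empty Optional for any line that should be skipped.
    static Optional<FlowLogLine> parse(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length < 8) {
            return Optional.empty(); // too few fields
        }
        try {
            // Fields are 1-based in the notes above, so field 7 is index 6.
            int dstPort = Integer.parseInt(fields[6]);
            int protocol = Integer.parseInt(fields[7]);
            return Optional.of(new FlowLogLine(dstPort, protocol));
        } catch (NumberFormatException e) {
            return Optional.empty(); // non-numeric port or protocol
        }
    }
}
```

Returning an Optional keeps the skip decision in one place, so the caller can drop empty results without catching exceptions per line.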

Processing Behavior

Counting Logic:

- Each valid line increments both the tag counter and the port/protocol counter
- Duplicate lines are counted separately
- "Untagged" is used when no matching tag is found in the lookup table
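
One way to keep both counters safe under concurrent processing is sketched below. The class FlowCounters and its key format are hypothetical, and the real implementation may use a different synchronization strategy; this version relies on ConcurrentHashMap and LongAdder from the standard library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical counter holder, not taken from the FlowLogParser source.
final class FlowCounters {
    private final Map<String, LongAdder> tagCounts = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> portProtocolCounts = new ConcurrentHashMap<>();

    // Called once per valid line; duplicates are simply recorded again.
    void record(String tag, int port, String protocol) {
        tagCounts.computeIfAbsent(tag, k -> new LongAdder()).increment();
        portProtocolCounts
                .computeIfAbsent(port + "," + protocol, k -> new LongAdder())
                .increment();
    }

    long tagCount(String tag) {
        LongAdder adder = tagCounts.get(tag);
        return adder == null ? 0L : adder.sum();
    }

    long portProtocolCount(int port, String protocol) {
        LongAdder adder = portProtocolCounts.get(port + "," + protocol);
        return adder == null ? 0L : adder.sum();
    }
}
```

Because record is safe to call from several threads at once, worker threads can share a single FlowCounters instance and the totals are read after all workers finish.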

Compile

# Clone the repository
git clone https://github.com/nachivrn/FlowLogParser.git

cd FlowLogParser

javac src/FlowLogParser.java -d .

javac -cp .:lib/junit-platform-console-standalone-1.8.2.jar src/FlowLogParserTest.java -d .

Run Program

# Run the code
# Usage: java FlowLogParser <flow_log_file> <lookup_table_file> <output_file> [protocol_map_file]
java FlowLogParser ./data/flowlogfile.txt ./data/lookuptable.csv ./data/output.txt

# Check output
cat ./data/output.txt

# Run the unit and functional tests
java -jar lib/junit-platform-console-standalone-1.8.2.jar --class-path . --select-class FlowLogParserTest

Testing Details

Unit Tests Implemented

Basic Functionality Tests
- Validates lookup table loading
- Tests single-line processing
- Validates concurrent processing

Edge Cases
- Tests duplicate entries
- Tests malformed input
- Tests empty files

Error Handling Tests
- Invalid protocol numbers
- Missing fields
- Malformed input lines

Performance Testing

- Tested with more than 100,000 flow log records (approximately 10 MB) and a lookup table with 10,000 entries
- Measured processing time for large datasets
- Confirmed proper thread utilization
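
A dataset of similar size can be produced with a small generator such as the sketch below. It is not the generator used for the measurements above: the class SyntheticFlowLog, the fixed addresses, and the port and protocol ranges are assumptions chosen only to yield well-formed version 2 lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical generator, not taken from the FlowLogParser source.
final class SyntheticFlowLog {
    // Builds `records` space-separated lines with 14 fields each.
    static List<String> lines(int records, long seed) {
        Random rnd = new Random(seed); // fixed seed gives repeatable data
        int[] protocols = {1, 6, 17};  // icmp, tcp, udp
        List<String> out = new ArrayList<>(records);
        for (int i = 0; i < records; i++) {
            int srcPort = 1024 + rnd.nextInt(64512);
            int dstPort = 1 + rnd.nextInt(1024);
            int protocol = protocols[rnd.nextInt(protocols.length)];
            out.add("2 123456789012 eni-0a1b2c3d 10.0.1.201 198.51.100.2 "
                    + srcPort + " " + dstPort + " " + protocol
                    + " 25 20000 1620140761 1620140821 ACCEPT OK");
        }
        return out;
    }
}
```

The result can be written out with java.nio.file.Files.write and passed to the parser as the flow log file; wrapping the run in two System.nanoTime() calls gives a rough elapsed time.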
