Time-Travelling File System Assignment

Uploaded by

Asmit Karmakar

COL106: Data Structures and Algorithms Assignment: Time-Travelling File System

Long Assignment 1:
Time-Travelling File System

1 Introduction
In this assignment, you will implement a simplified, in-memory version control system inspired by Git.
Your system will manage versioned files with support for branching and historical inspection. The
primary goal is to apply your understanding of Trees, HashMaps, and Heaps to a complex, practical
application.

2 System Architecture
Your file system will manage a collection of files, each with its own version history represented as a tree.
The system must utilize the following core data structures:
• Tree: To maintain the version history of each file. A node in the tree represents a specific version.
• HashMap: To provide fast, O(1) average-time lookups of versions by their unique ID.
• Heaps: To efficiently track system-wide file metrics, such as the most recently or frequently edited
files.
Note: You must implement the above data structures (along with the operations needed for this
project) yourself from scratch, i.e., you are not allowed to use C++ Libraries which already implement
these data structures.

File and Version Data Models

Each file object in your system will contain the following members:

// File Structure
TreeNode* root; // Your implementation of the tree
TreeNode* active_version;
map<int, TreeNode*> version_map; // Your implementation of the HashMap
int total_versions;

Each version (a node in the tree) must store the following information:

// Version (TreeNode) Structure

int version_id;
string content;
string message; // Empty if not a snapshot
time_t created_timestamp;
time_t snapshot_timestamp; // Null if not a snapshot
TreeNode* parent;
vector<TreeNode*> children;
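
The version structure above can be written as a compilable C++ type roughly as follows. This is only a sketch under stated assumptions: `time_t` cannot hold a null value, so the value 0 is used here to mean "not a snapshot", and `is_snapshot` and `make_root` are hypothetical helper names that do not appear in the assignment. `make_root` builds the root version that the CREATE command in Section 3.1 describes.

```cpp
#include <cassert>
#include <ctime>
#include <string>
#include <vector>

// One version of a file. Field names follow the assignment's listing.
struct TreeNode {
    int version_id;
    std::string content;
    std::string message;             // empty if not a snapshot
    std::time_t created_timestamp;
    std::time_t snapshot_timestamp;  // assumption: 0 stands in for "null"
    TreeNode* parent;
    std::vector<TreeNode*> children;

    TreeNode(int id, TreeNode* parent_node)
        : version_id(id),
          created_timestamp(std::time(nullptr)),
          snapshot_timestamp(0),
          parent(parent_node) {}

    // Hypothetical helper: a version is immutable once it has a snapshot time.
    bool is_snapshot() const { return snapshot_timestamp != 0; }
};

// Hypothetical helper: builds the root version (ID 0, empty content,
// already snapshotted with an initial message).
TreeNode* make_root(const std::string& initial_message) {
    assert(!initial_message.empty());  // the root is expected to carry a message
    TreeNode* root = new TreeNode(0, nullptr);
    root->message = initial_message;
    root->snapshot_timestamp = std::time(nullptr);
    return root;  // the caller owns the node and must delete it
}
```

Storing an explicit boolean flag instead of the 0 sentinel would work equally well; the sentinel simply keeps the member list identical to the one given above.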


3 Command Reference
Your program must read and execute a series of commands from stdin.

3.1 Core File Operations

CREATE <filename> Creates a file with a root version (ID 0), empty content, and an initial snapshot
message.

READ <filename> Displays the content of the file’s currently active version.

INSERT <filename> <content> Appends content to the file. This creates a new version if the active
version is already a snapshot; otherwise, it modifies the active version in place.

UPDATE <filename> <content> Replaces the file’s content. Follows the same versioning logic as
INSERT.
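
The shared versioning rule for INSERT and UPDATE could look something like the sketch below. Several things here are assumptions rather than part of the assignment: the `snapshotted` flag replaces the message and timestamp pair for brevity, a `std::vector` indexed by version ID stands in for your own HashMap, a new version is assumed to start from a copy of its parent's content, and the names `branch_from_active`, `write_content`, and `create_file` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct TreeNode {
    int version_id = 0;
    std::string content;
    bool snapshotted = false;  // simplification of message + snapshot_timestamp
    TreeNode* parent = nullptr;
    std::vector<TreeNode*> children;
};

struct File {
    TreeNode* root = nullptr;
    TreeNode* active_version = nullptr;
    std::vector<TreeNode*> versions;  // stand-in for your own HashMap (index == version ID)
    int total_versions = 0;
};

// Hypothetical helper: a new file whose root (ID 0) is already a snapshot.
// Nodes are never freed in this sketch.
File create_file() {
    File file;
    file.root = new TreeNode();
    file.root->snapshotted = true;
    file.active_version = file.root;
    file.versions.push_back(file.root);
    file.total_versions = 1;
    return file;
}

// Hypothetical helper: adds a child of the active version and makes it active.
TreeNode* branch_from_active(File& file) {
    assert(file.active_version != nullptr);
    TreeNode* node = new TreeNode();
    node->version_id = file.total_versions++;      // IDs are sequential per file
    node->content = file.active_version->content;  // assumption: start from the parent's content
    node->parent = file.active_version;
    file.active_version->children.push_back(node);
    file.versions.push_back(node);
    file.active_version = node;
    return node;
}

// INSERT (append == true) and UPDATE (append == false) share one rule:
// a snapshotted active version is never touched; a child is created instead.
void write_content(File& file, const std::string& text, bool append) {
    assert(file.active_version != nullptr);
    TreeNode* target = file.active_version->snapshotted
                           ? branch_from_active(file)
                           : file.active_version;
    if (append) {
        target->content += text;
    } else {
        target->content = text;
    }
}
```

With this rule, the first INSERT after CREATE produces version 1, because the root is already a snapshot, and later edits modify version 1 in place until it is snapshotted.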

SNAPSHOT <filename> <message> Marks the active version as a snapshot, making its content
immutable. It stores the provided message and the current time.

ROLLBACK <filename> [versionID] Sets the active version pointer to the specified versionID. If
no ID is provided, it rolls back to the parent of the current active version.
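
One possible shape for SNAPSHOT and ROLLBACK is sketched below. The assignment leaves the error cases open, so the choices here are assumptions: snapshotting an already snapshotted version is rejected, and rolling back to a missing parent or an unknown ID leaves the active version unchanged. The function names are hypothetical, 0 is assumed to mean "no snapshot time", and a `std::vector` indexed by version ID again stands in for your own HashMap.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <string>
#include <vector>

struct TreeNode {
    int version_id = 0;
    std::string message;
    std::time_t snapshot_timestamp = 0;  // assumption: 0 means "not a snapshot"
    TreeNode* parent = nullptr;
};

struct File {
    TreeNode* active_version = nullptr;
    std::vector<TreeNode*> versions;  // stand-in for your own HashMap
};

// SNAPSHOT: stores the message and the current time on the active version.
// Returns false (an assumed policy) if the version is already a snapshot.
bool snapshot(File& file, const std::string& message) {
    assert(file.active_version != nullptr);
    TreeNode* node = file.active_version;
    if (node->snapshot_timestamp != 0) return false;
    node->message = message;
    node->snapshot_timestamp = std::time(nullptr);
    return true;
}

// ROLLBACK without an ID: move to the parent, if there is one.
bool rollback_to_parent(File& file) {
    assert(file.active_version != nullptr);
    if (file.active_version->parent == nullptr) return false;  // already at the root
    file.active_version = file.active_version->parent;
    return true;
}

// ROLLBACK with an ID: move to that version, if it exists.
bool rollback_to(File& file, int version_id) {
    if (version_id < 0 ||
        static_cast<std::size_t>(version_id) >= file.versions.size()) {
        return false;  // unknown version ID
    }
    file.active_version = file.versions[version_id];
    return true;
}
```

Returning a boolean lets the command loop decide how to report the failure, which keeps the policy described in the README in one place.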

HISTORY <filename> Lists all snapshotted versions of the file chronologically, showing their ID,
timestamp, and message.
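
A sketch of HISTORY under one reading of the description: it lists the snapshotted versions on the path from the root to the active version, oldest first. That reading is an assumption; if every snapshot in the tree must be listed, a full traversal is needed instead. The function name `history` and the use of 0 for "no snapshot time" are also assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <ctime>
#include <string>
#include <vector>

struct TreeNode {
    int version_id = 0;
    std::string message;
    std::time_t snapshot_timestamp = 0;  // assumption: 0 means "not a snapshot"
    TreeNode* parent = nullptr;
};

// Walks from the active version up to the root, keeps only snapshots,
// then reverses so the oldest snapshot comes first.
std::vector<const TreeNode*> history(const TreeNode* active) {
    assert(active != nullptr);
    std::vector<const TreeNode*> result;
    for (const TreeNode* node = active; node != nullptr; node = node->parent) {
        if (node->snapshot_timestamp != 0) result.push_back(node);
    }
    std::reverse(result.begin(), result.end());
    return result;
}
```

Along a single root-to-active path this order is also chronological, because a child version is only ever created after its parent has been snapshotted.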

3.2 System-Wide Analytics

RECENT FILES Lists files in descending order of their last modification time.

BIGGEST TREES Lists files in descending order of their total version count.
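
Both analytics commands can be served by the same hand-written max-heap, keyed either by last modification time or by version count. The sketch below is one minimal way to do that; `FileStat`, `MaxHeap`, and `top_files` are hypothetical names, and `std::vector` is assumed to be acceptable as raw array storage since the assignment's own listings use it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct FileStat {
    std::string filename;
    long long key;  // last modification time or total version count
};

// Array-backed binary max-heap ordered by FileStat::key.
class MaxHeap {
public:
    bool empty() const { return items_.empty(); }

    void push(const FileStat& stat) {
        items_.push_back(stat);
        sift_up(items_.size() - 1);
    }

    FileStat pop() {
        assert(!items_.empty());
        FileStat top = items_.front();
        items_.front() = items_.back();
        items_.pop_back();
        if (!items_.empty()) sift_down(0);
        return top;
    }

private:
    std::vector<FileStat> items_;

    void sift_up(std::size_t i) {
        while (i > 0) {
            std::size_t parent = (i - 1) / 2;
            if (items_[parent].key >= items_[i].key) break;
            std::swap(items_[parent], items_[i]);
            i = parent;
        }
    }

    void sift_down(std::size_t i) {
        const std::size_t n = items_.size();
        while (true) {
            std::size_t largest = i;
            const std::size_t left = 2 * i + 1;
            const std::size_t right = 2 * i + 2;
            if (left < n && items_[left].key > items_[largest].key) largest = left;
            if (right < n && items_[right].key > items_[largest].key) largest = right;
            if (largest == i) break;
            std::swap(items_[i], items_[largest]);
            i = largest;
        }
    }
};

// Returns up to `count` filenames in descending key order.
std::vector<std::string> top_files(const std::vector<FileStat>& stats, std::size_t count) {
    MaxHeap heap;
    for (const FileStat& s : stats) heap.push(s);
    std::vector<std::string> names;
    while (!heap.empty() && names.size() < count) {
        names.push_back(heap.pop().filename);
    }
    return names;
}
```

Rebuilding the heap on each query keeps the sketch short. A longer-lived heap that is updated as files change is another option, but it then needs a way to refresh or discard stale entries.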

4 Key Semantics
• Immutability: Only snapshotted versions are immutable. Non-snapshotted versions can be edited in
place.
• Versioning: Version IDs are unique per file and assigned sequentially, starting from 0.
You must handle incorrect or inconsistent input as you deem appropriate, and you must describe
how you handle it in the README.
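
Since input handling is left to the implementer, the following is only one possible approach for the per-file commands: split each line into a command name, a filename, and the remaining text, and reject anything that does not match. The names `Command`, `is_file_command`, `parse_command`, and `parse_version_id` are hypothetical, and the system-wide analytics commands would need a separate branch that is not shown here.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct Command {
    std::string name;      // e.g. "INSERT"
    std::string filename;  // required for every per-file command
    std::string argument;  // rest of the line: content, message, or version ID
    bool valid;
};

bool is_file_command(const std::string& name) {
    static const char* const kNames[] = {"CREATE", "READ", "INSERT", "UPDATE",
                                         "SNAPSHOT", "ROLLBACK", "HISTORY"};
    for (const char* known : kNames) {
        if (name == known) return true;
    }
    return false;
}

// Splits "NAME filename rest of line"; valid is false for blank lines,
// unknown command names, and missing filenames.
Command parse_command(const std::string& line) {
    Command cmd{"", "", "", false};
    std::istringstream in(line);
    if (!(in >> cmd.name)) return cmd;  // blank line
    if (in >> cmd.filename) {
        std::getline(in >> std::ws, cmd.argument);  // keeps spaces inside the argument
    }
    cmd.valid = is_file_command(cmd.name) && !cmd.filename.empty();
    assert(!cmd.valid || !cmd.filename.empty());  // a valid command always names a file
    return cmd;
}

// Returns true and sets `id` only if `text` is a non-negative decimal integer.
bool parse_version_id(const std::string& text, int& id) {
    if (text.empty() || text.size() > 9) return false;  // length cap avoids int overflow
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return false;
        value = value * 10 + (c - '0');
    }
    id = value;
    return true;
}
```

For example, `parse_command("INSERT notes.txt hello world")` yields the name `INSERT`, the filename `notes.txt`, and the argument `hello world`, while a line such as `READ` with no filename is flagged as invalid.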

5 Submission
This is an individual assignment. You must submit a compressed file (.zip/.rar) containing your project
code (.cpp, .hpp files, if any) along with a working shell script to compile your code. You must also add
a README containing the instructions on how to run your code and use different commands. Note that
the user must be able to input commands at runtime, i.e., from stdin and the commands must follow
the given syntax.
The deadline for the submission is September 11, 23:59 IST (Thursday). The submission will be on
moodlenew.


6 Evaluation
The project will be evaluated through a viva (dates to be announced later), which will include, but is
not limited to, questions about your code, checks of your program's output on specific command
sequences, and an assessment of code quality.

Common questions

How do the system-wide analytics commands, RECENT FILES and BIGGEST TREES, enhance the file system's usability?

RECENT FILES and BIGGEST TREES give users a system-wide view of file activity. RECENT FILES lists files in descending order of last modification time, so the most recently edited files appear first. BIGGEST TREES lists files in descending order of total version count, which shows how much version history each file has accumulated. Both commands rely on heaps, so the top entries can be retrieved without fully sorting every file.

Explain the role of the tree data structure in maintaining a file's version history.

The tree organizes a file's versions hierarchically. Each node represents one version and holds a pointer to its parent and a list of its children, so the parent-child links record how the file's content evolved. Branching arises when a version gains more than one child, for example after rolling back to an earlier snapshot and editing from there. Rolling back to the previous version only requires following the parent pointer.

How does the command INSERT differ from UPDATE in handling versioning logic?

INSERT and UPDATE follow the same versioning logic and differ only in what they do to the content: INSERT appends to the existing content, while UPDATE replaces it entirely. In both cases, if the active version is a snapshot, the command creates a new version; otherwise it modifies the active version in place. Both commands therefore respect the immutability of snapshotted versions.

In what sequence are version IDs assigned to file versions, and why is this method significant?

Version IDs are unique per file and assigned sequentially, starting from 0. Because IDs are handed out in creation order, a higher ID always denotes a version that was created later, which makes the order of changes within a file explicit and gives each version a simple, stable key for lookups and rollbacks.

What is the significance of the SNAPSHOT command in the file system's version control?

SNAPSHOT marks the file's active version as immutable, recording a stable state of the content that can no longer be altered. The snapshot stores the provided message and the current time, which HISTORY later reports and which give each snapshot context when choosing a rollback target. Snapshots thus act as milestones within the file's version history.

Why is it necessary to implement the core data structures from scratch instead of using existing C++ libraries?

The assignment requires the tree, HashMap, and heap to be implemented from scratch. Doing so exposes how each structure works internally, what its operations cost, and how the structures interact within one system. It also allows each structure to be tailored to this application instead of relying on a general-purpose library version.

Discuss the challenges and implications of handling incorrect or inconsistent input in the time-travelling file system.

Handling incorrect or inconsistent input is essential for robustness. The main challenge is ensuring that an invalid command, such as one naming a file or version ID that does not exist, leaves the file system's state unchanged. The system should validate each command before acting on it and report a clear error message when it rejects one. The chosen behaviour must be documented in the README.

How does the system ensure immutability for certain versions of files?

A version becomes immutable once it is snapshotted, which is recorded by storing the snapshot message and setting 'snapshot_timestamp' to the current time. Any later INSERT or UPDATE on a snapshotted version creates a new version instead of changing its content. Non-snapshotted versions remain mutable and can be edited in place.

What are the primary data structures used in the time-travelling file system, and how do they contribute to its functionality?

The system uses Trees, HashMaps, and Heaps. A tree maintains the version history of each file, with each node representing a specific version. A HashMap provides fast, O(1) average-time lookups of versions by their unique ID, which is what ROLLBACK to a given version ID needs. Heaps track system-wide file metrics, such as the most recently or most frequently edited files.

What strategies can be implemented in the HashMap to ensure O(1) average-time lookups for versions?

O(1) average-time lookups depend on keeping collisions rare and resolving them cheaply. Collisions can be resolved by open addressing, which stores colliding entries in other slots of the same array and needs no extra per-entry nodes, or by chaining, which keeps a linked list of entries per bucket. In either scheme the hash function should spread keys evenly across buckets, and the table should be resized once its load factor exceeds a chosen threshold, so that lookups stay fast as the number of versions grows.
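
As a concrete illustration of the chaining strategy, here is a minimal sketch of an integer-keyed hash map that doubles its bucket array when the load factor would exceed 0.75. The class name `IntHashMap`, the threshold, and the modulo hash are all assumptions chosen for brevity, and `std::vector` is used only as raw bucket storage. For this assignment the value type would typically be `TreeNode*`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename V>
class IntHashMap {
public:
    explicit IntHashMap(std::size_t bucket_count = 8)
        : buckets_(bucket_count, nullptr), size_(0) {
        assert(bucket_count > 0);
    }

    ~IntHashMap() {
        for (Entry* head : buckets_) {
            while (head != nullptr) {
                Entry* next = head->next;
                delete head;
                head = next;
            }
        }
    }

    IntHashMap(const IntHashMap&) = delete;
    IntHashMap& operator=(const IntHashMap&) = delete;

    // Inserts the key or overwrites the value of an existing key.
    void put(int key, const V& value) {
        if (V* existing = find(key)) {
            *existing = value;
            return;
        }
        if (size_ + 1 > buckets_.size() * 3 / 4) grow();  // keep load factor <= 0.75
        const std::size_t i = index_of(key, buckets_.size());
        buckets_[i] = new Entry{key, value, buckets_[i]};  // push onto the bucket's chain
        ++size_;
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    V* find(int key) {
        for (Entry* e = buckets_[index_of(key, buckets_.size())]; e != nullptr; e = e->next) {
            if (e->key == key) return &e->value;
        }
        return nullptr;
    }

    std::size_t size() const { return size_; }

private:
    struct Entry {
        int key;
        V value;
        Entry* next;
    };

    std::vector<Entry*> buckets_;
    std::size_t size_;

    static std::size_t index_of(int key, std::size_t bucket_count) {
        return static_cast<std::size_t>(static_cast<unsigned int>(key)) % bucket_count;
    }

    // Doubles the bucket array and relinks every entry into its new bucket.
    void grow() {
        std::vector<Entry*> bigger(buckets_.size() * 2, nullptr);
        for (Entry* head : buckets_) {
            while (head != nullptr) {
                Entry* next = head->next;
                const std::size_t i = index_of(head->key, bigger.size());
                head->next = bigger[i];
                bigger[i] = head;
                head = next;
            }
        }
        buckets_.swap(bigger);
    }
};
```

Because version IDs are sequential integers, the simple modulo hash spreads them evenly across buckets; keys with a less regular distribution would call for a stronger hash function.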

