DocumentStore

Built a search engine using many different data structures, in 5 stages over the course of the semester.

Stage 1: In-Memory Document Store

Built a storage mechanism which supports “getting” and “putting” data. Documents are stored in memory in a hash table and can only be retrieved using the key (URI) with which they are stored. Documents can be plain text, a.k.a. a String, or binary data, e.g. images.

Stage 2: Add Undo Support to the Document Store Using a Stack

Added support for two different types of undo:

undo the last action, no matter what document it was done to undo the last action on a specific Document. Stage 3: Keyword Search Using a Trie.

Added key word search capability to the document store. That means a user can call DocumentStore.search(keyword) to get a list of documents in the document store that contain the given keyword. The data structure used for searching is a Trie.

Stage 4: Memory Management, Part 1: Tracking Document Usage via a Heap.

Used a min Heap to track the usage of documents in the document store. Only a fixed number of documents are allowed in memory at once, and when that limit is reached, adding an additional document results in the least recently used document being deleted from memory.

Stage 5: Memory Management, Part 2: Two Tier Storage (RAM and Disk) Using a BTree.

Previously, a document that had to be removed from memory due to memory limits was simply erased from existence. In this stage, it writes it to disk and bring it back into memory if it is needed. It continues to use a MinHeap to track the usage of documents in the document store, and continues to use the Trie for keyword search. The HashTable is replaced with a BTree for storing your documents. While the BTree itself will stay in memory, the documents it stores can move back and forth between disk and memory, as dictated by memory usage limits.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
project		project
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DocumentStore

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

DocumentStore

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages