Package edu.caltech.nanodb.storage.heapfile

This package provides a naive implementation of the heap file organization for NanoDB. All heap files in NanoDB are accessed by pages, as is the case with virtually all NanoDB files. This implementation is naive because:

  • The implementation doesn't include any free-space management, which would improve the performance of inserting new tuples.
  • The implementation doesn't include other basic functionality like write-ahead logging or computing statistics on the data.

Following is a description of the storage format for heap files:

Overview

Page 0 is the header page, containing the table's schema and statistics information. All other pages are data pages, storing tuples using a slotted-page structure in each page. Relevant classes are:

  • The edu.caltech.nanodb.storage.heapfile.HeapFileTableManager class implements higher-level operations such as storing a table's schema into a table file. It also manages the process of scanning through a table file, and inserting/deleting/modifying tuples.
  • The HeaderPage class provides lower-level access to values stored in the header page, as well as constants for accessing various parts of the header page.
  • Similarly, the DataPage class provides lower-level access to values stored in data pages, as well as constants for accessing various parts of a data page.
  • The HeapFilePageTuple class implements the Tuple interface for access and manipulation of tuple data stored in the slotted page format.
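
The page arrangement described in this overview (page 0 is the header page, every other page is a data page) can be illustrated with a short standalone sketch. The class PageLayoutSketch and its methods are hypothetical and are not part of NanoDB. The sketch also assumes that pages are stored back to back, so that page n starts at byte n * pageSize; the text above does not state this explicitly.

```java
/** Hypothetical illustration only; not a NanoDB class. */
final class PageLayoutSketch {
    /** Page 0 holds the schema and statistics, per the overview. */
    static final int HEADER_PAGE_NO = 0;

    static boolean isHeaderPage(int pageNo) {
        return pageNo == HEADER_PAGE_NO;
    }

    /**
     * Byte offset of a page within the file.
     * Assumption: pages are laid out contiguously with no gaps.
     */
    static long pageStart(int pageNo, int pageSize) {
        if (pageNo < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("invalid page number or page size");
        }
        // Widen to long before multiplying so large files do not overflow int.
        return (long) pageNo * pageSize;
    }

    public static void main(String[] args) {
        int pageSize = 8192;  // example value only
        for (int pageNo = 0; pageNo < 3; pageNo++) {
            String kind = isHeaderPage(pageNo) ? "header" : "data";
            System.out.println("page " + pageNo + " (" + kind + ") starts at byte "
                + pageStart(pageNo, pageSize));
        }
    }
}
```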

The Header Page

The header page has the following structural layout:

Offset in Page, Type: Description

  • 0, unsigned byte: File type, set to edu.caltech.nanodb.storage.DBFileType#HEAP_DATA_FILE. (See FileManagerImpl for code that accesses and manipulates this value.)
  • 1, unsigned byte: Encoded page size p, where the actual page size is 2^p. (See FileManagerImpl for code that accesses and manipulates this value.)
  • 2 (HeaderPage.OFFSET_SCHEMA_SIZE), unsigned short: The number of bytes in the header page occupied by the table schema. This value should not be 0, although the API allows it to be. (See HeaderPage.getSchemaSize(edu.caltech.nanodb.storage.DBPage) and HeaderPage.setSchemaSize(edu.caltech.nanodb.storage.DBPage, int) for accessing and manipulating this value.)
  • 4 (HeaderPage.OFFSET_STATS_SIZE), unsigned short: The number of bytes in the header page occupied by table statistics. This value may be 0 if the table currently has no statistics. (See HeaderPage.getStatsSize(edu.caltech.nanodb.storage.DBPage) and HeaderPage.setStatsSize(edu.caltech.nanodb.storage.DBPage, int) for accessing and manipulating this value.)
  • 6 (HeaderPage.OFFSET_SCHEMA_START), [table schema]: The schema of the table, as written by the SchemaWriter helper class.
  • [after table schema], [table statistics]: The table's statistics, as written by the StatsWriter helper class.
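
The fixed-offset fields of the header page can be decoded as in the following minimal sketch. It is a standalone illustration built on java.nio.ByteBuffer and does not use NanoDB's DBPage or HeaderPage classes. The class HeaderLayoutSketch and its method names are hypothetical, big-endian byte order is an assumption, and the file-type value written in main is a placeholder rather than the real DBFileType code. Only the offsets come from the layout above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical illustration only; not a NanoDB class. */
final class HeaderLayoutSketch {
    // Offsets taken from the header-page layout described above.
    static final int OFFSET_FILE_TYPE = 0;
    static final int OFFSET_PAGE_SIZE = 1;
    static final int OFFSET_SCHEMA_SIZE = 2;
    static final int OFFSET_STATS_SIZE = 4;
    static final int OFFSET_SCHEMA_START = 6;

    /** Unsigned byte at offset 0. */
    static int fileType(ByteBuffer page) {
        return page.get(OFFSET_FILE_TYPE) & 0xFF;
    }

    /** Offset 1 stores p; the actual page size is 2^p (p assumed below 31). */
    static int pageSize(ByteBuffer page) {
        int p = page.get(OFFSET_PAGE_SIZE) & 0xFF;
        return 1 << p;
    }

    /** Unsigned short at offset 2: bytes occupied by the schema. */
    static int schemaSize(ByteBuffer page) {
        return page.getShort(OFFSET_SCHEMA_SIZE) & 0xFFFF;
    }

    /** Unsigned short at offset 4: bytes occupied by statistics (may be 0). */
    static int statsSize(ByteBuffer page) {
        return page.getShort(OFFSET_STATS_SIZE) & 0xFFFF;
    }

    /** Statistics follow immediately after the schema. */
    static int statsStart(ByteBuffer page) {
        return OFFSET_SCHEMA_START + schemaSize(page);
    }

    public static void main(String[] args) {
        // Byte order is an assumption of this sketch.
        ByteBuffer page = ByteBuffer.allocate(8192).order(ByteOrder.BIG_ENDIAN);
        page.put(OFFSET_FILE_TYPE, (byte) 1);           // placeholder type code
        page.put(OFFSET_PAGE_SIZE, (byte) 13);          // 2^13 = 8192
        page.putShort(OFFSET_SCHEMA_SIZE, (short) 40);  // example schema length
        page.putShort(OFFSET_STATS_SIZE, (short) 0);    // no statistics yet

        System.out.println("page size:   " + pageSize(page));    // 8192
        System.out.println("stats start: " + statsStart(page));  // 46
    }
}
```

Masking with 0xFF and 0xFFFF is needed because Java's byte and short are signed, while the layout specifies unsigned values.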