In this assignment you will complete the implementation of a write-ahead logging system for NanoDB that will allow the database to provide transaction atomicity and durability. The specific components to implement are:
Toward the end of the assignment, you will again have to work with some low-level file structures. You are encouraged to use logging statements (e.g. logger.debug(...)
) generously as you write your code; it will make debugging much easier for you in the long run.
This section will give you an overview of how transaction processing works in NanoDB. It is long, but your team should read it carefully since it touches on all key details.
NanoDB includes a very basic implementation of transaction processing and write-ahead logging that captures the basic concepts of the ARIES write-ahead logging mechanism. Transaction management is provided by the code in the edu.caltech.nanodb.transactions
package, and write-ahead logging is provided by the code in the edu.caltech.nanodb.storage.writeahead
package.
The transaction manager leverages the same event-handling framework that index management uses, but it uses the before/after command events rather than the before/after row events. Transaction management is straightforward:
The user can issue a "BEGIN [ WORK ]
" or "START TRANSACTION
" command to manually start a transaction. Subsequent DML statements will be part of the same transaction until either a "COMMIT [ WORK ]
" or "ROLLBACK [ WORK ]
" command is issued, ending the transaction.
The user can also issue DDL/DML commands without using the above operations, and the transaction manager will automatically enclose each command within its own transaction. (Note that DDL is not transacted and cannot be rolled back.)
The entry points for starting, committing, and rolling back transactions are the following methods on the TransactionManager
class:
startTransaction()
- starts a transactioncommitTransaction()
- commits the current transactionrollbackTransaction()
- rolls back the current transactionThe TransactionStateUpdater
class implements the logic described above for starting and ending transactions, and the [ Begin
| Commit
| Rollback
] TransactionCommand
classes in the commands
package also call the above methods on the transaction manager.
The write-ahead logger is managed as a component of the transaction manager; no code outside of the transaction manager really should need to interact with it. The main class to be aware of is the WALManager
class. It exposes writeXXXX()
operations to write various log records to the WAL, a rollbackTransaction()
method that uses the WAL to rollback the current transaction, and a doRecovery()
entry-point that performs recovery processing when NanoDB starts.
The write-ahead logger implements an extremely simplified version of the ARIES logging and recovery mechanism. The logger does not support checkpointing, although it does support all record types discussed in class: begin/commit/rollback, update, and redo-only update. Other relevant details are as follows:
The write-ahead log is stored in one or more files in the datafiles
directory, the files being named wal-nnnnn.log
. (In this assignment, and in the source code, the term "write-ahead log" refers to the entire aggregate log, not an individual file.) The file number nnnnn
starts at 0 and increments to 65535, at which point it wraps around back to 0. Each file is constrained to be no more than approximately 10MB in size, but the end of a file will never be a partial record; the last record in the file will always be complete.
Each log record is identified by a LogSequenceNumber
(LSN) value, consisting of the WAL file number, and the offset from the start of the file that the record is stored at. Note that the LSN does not include a page number; the page number can be computed from the file-offset and the WAL file's page size, when it is needed.
The write-ahead log is loaded and saved through the Storage Manager, just as with all data files. This means that WAL pages will often be cached within the buffer manager to minimize IO costs. This also means that care must be taken to write out the WAL at appropriate points during the transaction lifecycle. The operation of ensuring that the WAL is written out to disk at least as far a particular LSN is called forcing the WAL to that LSN. This operation is provided on the Transaction Manager. We will discuss this operation in more detail momentarily.
As with ARIES logging, every DBPage
object in memory has a pageLSN
value associated with it, which is the most recent LSN that applies to the page. Unlike ARIES, this value is not recorded to disk! Rather, the database uses this value to enforce the write-ahead logging rule: when dirty pages must be evicted from the buffer manager, it forces the WAL out to the largest pageLSN
of the dirty pages being evicted, and then it writes the dirty pages back to disk. This ensures that all changes written by the database are properly reflected in the WAL first.
The WALManager
class updates the pageLSN
value on DBPage
objects when the writeUpdatePageRecord()
and writeRedoOnlyUpdatePageRecord()
methods are called. You shouldn't have to manage these values yourself.
Note that some dirty DBPage
objects will not have a pageLSN
value! Specifically, pages in files that do not support transactions will not have this value set; WAL pages in particular will not specify this value. In general, data pages relating to the write-ahead log are not transacted. (It wouldn't make sense to log changes to write-ahead log pages in the write-ahead log.)
Managing data pages in the context of write-ahead logging is somewhat involved. For example, when a page is modified, the original version of the page must be kept so that the write-ahead logger can generate the "old value" and "new value" to store in the log. These kinds of capabilities are provided on the DBPage
class - you will notice that when a clean page is made dirty (by DBPage.setDirty(true)
), it also makes a copy of the clean version of the page.
Similarly, when a change to a dirty data page has been logged to the WAL, it is no longer necessary to record the original version of the page because the WAL now reflects the changes. Thus, the WAL Manager calls DBPage.syncOldPageData()
when it writes an "update" record, so that the "old value" of the page matches the "new value" of the page. To see why this is important, imagine the following sequence of events against a specific page:
Tuple 1 in the page is updated. An "update" record is written to the WAL, recording the portion of the page that contains the old page-data for tuple 1, and the new page-data for tuple 1.
Next, tuple 2 in the same page is updated, in the same transaction. Again, an "update" record is written to the WAL, but it should only contain the old and new page-data for tuple 2, not the data for tuple 1.
This is why the logger calls DBPage.syncOldPageData()
after writing an "update" record, so that each time an update is recorded, it only reflects the changes since the previous "update" record.
Logging the changes to data pages is very dependent on the kind of data file being managed. For example, in heap files, adding or removing a tuple involves changing the page's slot array, as well as changing the tuple-data region of the page. Thus, any code modifying a data page page must inform the write-ahead logger when it is time to log changes to the WAL. This is done via the DBPage.logDBPageWrite()
method. (If the page is not dirty, this is a no-op.)
Currently, the heap tuple file implementation doesn't include any of these calls, so changes will not appear in the write-ahead log; therefore, they will not be durable or atomic. This will be one of the first tasks to complete, calling logDBPageWrite()
at appropriate times so that operations against tables will be properly logged in the WAL and governed by transactions.
Unfortunately, debugging the write-ahead logger tends to be difficult. You will need to look at logging output as the database performs transaction rollback, redo processing, undo processing, and so forth. Be prepared.
Sometimes you will need to see the contents of data pages that have been modified. There are a few helper functions on DBPage
to help you with this:
The toFormattedString()
method will return a formatted string containing all data in the DBPage
. Be warned; it is verbose. However, sometimes it is the easiest way to identify issues.
The getChangesAsString()
method is similar in that it will print out the contents of a DBPage
, but it will only show the differences between the old and new versions of the page. This can be very helpful to see exactly what changes were made on a particular data page.
CRASH
CommandIn order to simplify basic testing, NanoDB has a CRASH
command that will cause the database to terminate immediately without saving any buffered data. It is very easy to use:
CMD> CRASH;
Goodbye, cruel world! I'm taking your data with me!!!
$ [your shell prompt]
This command is useful for basic testing, but it will not allow exhaustive testing of your transaction processing system. The reason is simple: databases can also crash at far more inopportune times than just at the start of a command. To get us a little closer to that, you can also give an argument to the CRASH
command, the number of seconds to wait before crashing:
CMD> CRASH 5;
CMD> INSERT INTO t VALUES (1, 'a', 1.1);
CMD> INSERT INTO t VALU
Goodbye, cruel world! I'm taking your data with me!!!
$ [your shell prompt]
This version of the command starts a background thread that will crash the database after the specified number of seconds has passed. This allows a test script to go on to other things, so that the database crash can interrupt more serious operations, like writing data out to disk, etc. There is still an element of chance in it, so it'll be like playing Russian Roulette, at least until you have transaction atomicity and durability working properly. After that, crashes shouldn't matter.
FLUSH
CommandNanoDB also has a FLUSH
command that will cause the database to flush any buffered data in the Buffer Manager to disk.
CMD> FLUSH;
Flushing all unwritten data to disk.
CMD>
You can use this command to make sure that the buffer manager and write-ahead logger work together properly. This can be particularly useful when testing crashes, since only commits force the WAL out to disk. For example, you can rollback a transaction (or just leave it incomplete), flush the database to disk, and then crash the database, like this:
CMD> ROLLBACK;
CMD> FLUSH;
Flushing all unwritten data to disk.
CMD> CRASH;
Goodbye, cruel world! I'm taking your data with me!!!
Next, you can try to restart NanoDB and see if it does the proper thing during recovery processing.
Currently, NanoDB does not support transacted DDL, for a variety of reasons. Don't even try it. The same goes for indexes; indexes are not transacted.
The write-ahead logger supports logs that span multiple files, but this functionality has not been heavily tested. You are expected to support multiple logs, but if you run into strange bugs that don't seem to be in your code, please talk to Donnie and/or the TAs right away.
Before you can work on the write-ahead logger implementation, you need to enable transaction processing in NanoDB. Just like with indexes, this subsystem is controlled via a system property that must be enabled at startup; it cannot be turned on while the database is in normal operation.
Go to the PropertyRegistry.initProperties()
(server.properties
package) method, and configure these features as follow:
"nanodb.enableTransactions
" should be set to true
to turn on transaction processing
"nanodb.flushAfterCmd
" should be changed to false
so that the buffer pool is no longer flushed after every command
These changes will become active after you have recompiled NanoDB.
NOTE: After doing this, you need to delete the contents of your datafiles
directory, since this change does affect what files are created and managed in this directory.
You may also want to disable the index settings from Assignment 6, if you ran into issues with your B+ tree implementation. None of this lab depends on the index implementation working properly.
As stated earlier, heap tuple files currently don't log any of their changes. Fortunately, this is not terribly difficult to add, especially since the number of pages that change for any given operation will tend to be small.
You need to update the heap tuple file code to properly log all changes to the write-ahead log. Go through both the HeapTupleFile
and HeapTupleFileManager
classes, looking for any place that a DBPage
is written to. For example, adding/updating/removing a tuple, storing statistics, updating non-full-page lists, would all be places where changes are made to the tuple file. Wherever you finish modifying the contents of a DBPage
, add in a call to the Storage Manager's logDBPageWrite(DBPage)
method as the last step of that operation. This will ensure that the changes to the page are written to the write-ahead log.
Try to avoid calling logDBPageWrite()
on a page multiple times for a single operation. For example, it would be wasteful to call logDBPageWrite()
once after updating a page's slots, and then again after updating the page's tuple data. Rather, it should be called once, after all changes have been made to the page.
The TransactionManager
class provides a forceWAL(LogSequenceNumber)
method that is critical for atomic and durable transaction processing. This method must ensure that the write-ahead log is written out to the specified LSN, and sync'd to disk, before it returns. This method is called every time a transaction is committed to ensure that the committed transaction is reflected in the WAL, and it is also called anytime dirty pages must be written to disk during a transaction.
The only problem is, our write-ahead log can span multiple files. And, there is a second issue that the buffer manager could output WAL pages in any order it wants. We have no guarantees that the database won't crash during this process, which could leave our WAL completely corrupted. (You could say it was "WALloped," ha ha haaaaaa!) Therefore, careful thought must be given to how the WAL is written to disk, to ensure it is updated atomically.
The transaction manager also manages a separate file called txnstate.dat
, which it uses to record various critical values related to transaction management:
firstLSN
(inclusive) and nextLSN
(exclusive) are considered to be valid.Note that the WALManager
also maintains a nextLSN
value in memory, but the WALManager
's value will be larger than this value if there are unsaved WAL records/pages in the buffer manager. The txnstate.dat
file's "next LSN" value should always reflect the end of the write-ahead log that is stored on disk.
This file is intentionally small; it should always be atomically writable. This means that it should fit within one disk sector (as small as 512 bytes, but possibly up to 4KiB in size with modern disks), so that write-tearing will not be likely to occur (i.e. when a buffer that spans an atomic-write boundary is written or sync'd to the disk, and the system crashes as that boundary is being crossed).
Implement the transaction manager's forceWAL(LogSequenceNumber)
method to atomically and durably write the WAL, as far as the specified LSN. Specifically, this method must ensure that buffered WAL pages containing records up to the specified value are written out of the buffer manager, and that the involved WAL files are also sync'd to disk. This method will need to look at the current transaction state to tell what range of WAL records must be written out, and will use the buffer manager to ensure that pages containing those records are indeed written and sync'd to disk. It will also need to modify the current transaction state.
Of course, you will also need to make sure that you update the WAL in an atomic and durable way, particularly in the context of the issues outlined earlier, namely having multiple WAL files and buffer pages being written out in no particular order.
Here are additional requirements and suggestions:
Make sure to explain in comments exactly why your implementation is atomic and durable! Failure to do so will lose points. (You will also include this explanation in your design document.)
You should be aware that LSNs point to the start of a WAL record, but this method must guarantee that the entire record is written out. The LogSequenceNumber
class has a getRecordSize()
method you will find useful for this.
There will definitely be times when this operation will be a no-op. Make sure to detect when the force operation does not need to be performed, and return immediately in those cases. (Hint: LSNs are comparable; you can use lsn1.compareTo(lsn2)
to determine if one LSN comes before or after another LSN.)
Similarly, only write the pages of the WAL that must be written; if you write additional pages beyond what is specified by the argument to forceWAL()
, you will be penalized. (Remember, the idea is to avoid unnecessary IOs, because they are slow.)
The TransactionManager
provides a storeTxnStateToFile()
method, which will write and sync the txnstate.dat
file to disk. It records the LSN value stored in txnStateNextLSN
.
The BufferManager
class also provides several very helpful operations for you to use:
You can call BufferManager.getFile()
to see if a file is currently open. NanoDB always syncs files when it closes them, so syncing should only be required if a file is currently open.
The BufferManager
also provides several flavors of writeDBFile()
to allow some or all dirty pages to be written to disk, followed by an optional sync. Note that the sync is always performed if requested, because dirty pages may have been evicted from the buffer manager but still cached within the OS.
You can use the WALManager.getWALFileName(int)
method to get the filename of a WAL file based on its file number.
Once this operation is available, you can use it to make sure that the database will follow the write-ahead logging rule. This is the focus of our next task...
When the Buffer Manager needs more memory, it evicts some of the pages loaded into memory, taking care to write dirty pages back to disk. This is all fine and good, except that the Buffer Manager doesn't know anything about transactions or write-ahead logging. If the Buffer Manager's page-writes are not coordinated with the Transaction Manager's updates to the write-ahead log, then there is no way we can enforce durability and atomicity in the database.
The solution to this problem is that the Buffer Manager can notify other components when dirty pages are about to be written to disk. The Transaction Manager registers a handler on the Buffer Manager to receive notifications when dirty pages are about to be evicted; this is where we can ensure that the write-ahead logging rule is enforced. The method that the Buffer Manager will call is TransactionManager.beforeWriteDirtyPages()
.
Implement this method on the Transaction Manager so that the database will follow the WAL rule when any dirty pages are written back to disk. This is a pretty straightforward operation to complete, especially after the forceWAL()
method from the last step has been implemented. In fact, this method will call forceWAL()
; the only complexity is figuring out what LSN to pass to the method. There are more details in the comments for beforeWriteDirtyPages()
.
Once you complete this operation, the database will follow the write-ahead logging rule. Additionally, you should see the database updating the write-ahead log with details of committed transactions, which would make them durable.
Except... we don't have any recovery processing yet. Not only that, but if we end up needing to roll back a transaction, NanoDB can't do that either. So really, we don't have much durability at the moment, and our atomicity story is also weak to nonexistent.
For the final two tasks, you will need to become familiar with the format of the write-ahead log. The Javadocs for the edu.caltech.nanodb.storage.writeahead
package describe the log format in detail; you should refer to this documentation as you complete the last two steps.
The WALManager
class provides a rollbackTransaction()
method used during normal operation to rollback the current transaction a client is participating in. You will need to complete the implementation of this method. Note that this method is not used during recovery processing; transactions are rolled back in a different way during recovery.
The rollbackTransaction()
method operates as described in class. NanoDB manages session state for each client, and part of that session state is the LSN of the last WAL record issued for the current transaction. This is where rollback begins. Additionally, all WAL records except for the "start transaction" record include a "previous LSN" value, allowing the sequence of operations performed in the transaction to be rolled back in reverse order.
Of course, this is a relatively straightforward task, because rollback only cares about two records: the "start" record which means rollback is done, and the "update page" record which specifies a change to roll back. Any other record encountered during rollback would indicate a serious error, worthy of throwing a WALFileException
to report a corrupt write-ahead log.
The WALManager
provides a helper method getWALFileReader(LogSequenceNumber)
, that opens (or retrieves from the buffer manager) the WAL file corresponding to the specified LSN, and seeks to the file-offset specified by the LSN. The returned object is a DBFileReader
, which allows our page-based database files to be read and written in a more stream-like format. You should always use this getWALFileReader()
method to get a reader for a given LSN, to avoid unnecessary complexity in your implementation.
As the transaction is rolled back, new records must be written to the write-ahead log. Use the writeTxnRecord(WALRecordType)
and writeRedoOnlyUpdatePageRecord(DBPage,int, byte[])
methods to write these records as you rollback. Note that there are two versions of these methods; use the versions that do not require transaction info to be explicitly specified! Otherwise, the client's transaction state will not be updated properly during rollback.
The last important detail is how to undo the changes to a particular data page. The WAL record will specify the file and page number that was modified; load this page using the storage manager and then use the applyUndoAndGenRedoOnlyData()
helper to undo the changes on the page and generate the redo-only data at the same time. Note that this method has very specific requirements on the DBFileReader
position when the method is invoked; read the method's Javadoc for details!
Once you have completed the rollbackTransaction()
method, you should be able to successfully start and rollback transactions in your database. Give it a try:
Create a simple table and add a few rows to it.
Then, start a new transaction with BEGIN
, and try either changing some rows and/or adding new rows to the table. Query the table to verify that your changes are all visible!
Finally, rollback the transaction by typing ROLLBACK
. If your code is working properly, you should be able to query the table after rollback and see the original contents. Pretty cool!
If you have any issues during your testing, you will probably want to delete all files in your datafiles
directory. Corrupt or invalid write-ahead logs will leave NanoDB in a corrupted state, and then you will be on a wild-goose chase.
Once this step is finished, you will have atomic transactions in NanoDB. However, they are still not entirely durable until recovery processing is also completed. Onward!
This is probably the most complicated part of the write-ahead logging implementation, because it involves moving forward and backward through a log containing variable-size records. Just like with the index implementation, this low-level stuff can become a little grungy. Such is life.
The WALManager
provides an entry-point for recovery processing called doRecovery()
. The TransactionManager
invokes this method at database startup using the values loaded from the txnstate.dat
file. We already discussed the nextLSN
value stored in this file; it is just past the last known-good record in the WAL. The firstLSN
value is important for recovery: it is the lower bound on the set of records necessary for recovery processing. The firstLSN
value provides the following guarantees:
The firstLSN
value specifies a point in time where all data files reflect all changes that are recorded in the write-ahead log before that LSN. Thus, redo processing can start from this point. (Without checkpointing, this is the best we can do.)
Additionally, there are no incomplete transactions at the firstLSN
point in the log. This is important because when undo processing is traversing the log backwards, we know it won't have to look earlier than firstLSN
in the log.
Unfortunately, without checkpoints, we really cannot easily move firstLSN
forward until the completion of recovery processing. Thus, this is the only time that NanoDB moves this value forward.
You will notice this problem if you work with a large data-set, such as the TPC-H data-set. If you create the TPC-H tables, then load the data, and then shut down NanoDB, you will likely observe a long pause the next time you start the database. This is because NanoDB is replaying the entire write-ahead log for the entire data-load. It would be much nicer to have a way to move forward
firstLSN
during normal operation. You will have an opportunity to explore this topic when you complete the design-document.
Redo processing is implemented in performRedo()
. This method takes a RecoveryInfo
object, which keeps track of what transactions are incomplete, along with the last LSN seen for each transaction. This second detail is important: when incomplete transactions are rolled back, we must properly chain new WAL records into the earlier records for each transaction.
You must complete the implementation of this method. The implementation should be mostly straightforward; update the RecoveryInfo
object as appropriate for the WAL records you encounter, and be aware that for this method you must make sure the walReader
is always moved past each WAL record.
In situations where a change must be redone, you can use the applyRedo()
method to perform the task. As with the applyUndoAndGenRedoOnlyData()
method, the walReader
must be positioned just after the segment-count when the method is invoked. (In this case, applyRedo()
could read the number of segments itself, but it seems beneficial to make both methods behave similarly.)
No WAL records should be emitted during redo processing!
Undo processing is implemented in performUndo()
. This method takes the same RecoveryInfo
object populated by performRedo()
, since now the set of incomplete transactions should be known. This method behaves similarly to the rollbackTransaction()
method, except that it is rolling back all incomplete transactions at the same time. As before, it only cares about "start" records and "update" records; in particular, "redo-only update" records are ignored during this phase. (Of course, a "commit" or "rollback" record should never be seen for an incomplete transaction.)
You must complete the implementation of this method as well. Note that the part of this method that you have to implement is relatively small compared to the overall size of the method; the bulk of the code is devoted to navigating backward through the write-ahead logs, a rather tedious task that doesn't lend itself to factoring into another method.
As stated before, you will find this implementation very similar to the rollbackTransaction()
implementation, but there will be some important changes. You will need to emit additional WAL records as you rollback incomplete transactions, but this time you must use the information in the RecoveryInfo
object to manage each transaction's "previous LSN" value. You cannot use the writeXXXX()
methods that pull these details from the session state; your logs will be corrupt!
Important Note: Using the debugger to verify your redo/undo processing code is an excellent idea, but this recovery code runs right when the database starts up, which will be before you get a chance to connect your debugger to the database. Therefore you will need to change your nanodb
script to cause the JVM to suspend itself when it is started up, giving you a chance to connect your debugger to the JVM. In your nanodb
script you will have a line like this:
JAVA_OPTS="$JAVA_OPTS -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5009"
Change the "suspend=n
" to "suspend=y
", and then the JVM will pause at startup until a debugger connects.
Obviously, with a system as complex as write-ahead logging, there can be a lot of subtle errors in your implementation. Therefore, you should test your code as extensively as you can to make sure it works. You should test from simple to complicated, so that you can isolate issues as quickly as possible. Also, putting generous logging statements in your code will give you a lot of help in identifying issues.
Finally, don't forget that you had to add write-ahead logging to your heap-file implementation, so it would be a very good idea to test inserts, updates and deletes, involving one page and multiple pages, and so forth. That way you can ensure that your database logs all changes properly.
Here are some ideas for testing. In between tests, you may want to delete the contents of your datafiles
directory so that corrupt files don't send you on a wild-goose chase.
Start the database, then shut it down, then start it again. See if this works without error.
Start the database, create a table, insert a few rows (relying on autocommit to commit each operation separately). Shut it down.
Start it again, and make sure the table is the same, and that recovery processing ran successfully.
Start it again, and see if recovery processing is now a no-op, since the firstLSN
value in txnstate.dat
will have been moved forward by the previous startup.
Start the database, create a table. Explicitly start a transaction, and make multiple changes (e.g. insert multiple rows) within that single transaction. Then commit it.
Try the same steps as in the previous test, to see if the data stays the same after restarting the database. Make sure recovery processing runs without errors.
Start the database, create a table, insert a few rows (relying on autocommit again).
Explicitly start a transaction, perform a few changes (e.g. change values, insert rows), then rollback the transaction. Make sure the original table data is restored.
Shut down the database, then restart it. Make sure recovery processing ran successfully. Verify that the data is unchanged.
Shut down and restart the database again, and check the data again. (Remember, for the second restart, recovery processing should be a no-op.)
Repeat the same steps as for 4, but use CRASH
instead of ROLLBACK
to see if things will work properly.
The previous test tends to work with no problems, because the WAL isn't forced out until a commit or a database shutdown. Therefore, you should also try issuing a FLUSH
before the CRASH
; this will ensure that your recovery processing can handle an incomplete transaction in the WAL.
Start the database, create a table. Perform some modifications and commit them. Then perform other modifications and roll them back. Finally, perform more modifications and commit them. Make sure the data is correct.
Shut down and restart the database, and verify that recovery works correctly. Verify the data is valid.
Shut down and restart the database again, and check the data again.
Start the database, create a table. Add some data to the table.
Start a transaction, perform some operations within the transaction, then abort the transaction.
Start another transaction, perform some more operations, then abort that transaction. Make sure the original table data is still there!
Shut down (or flush and crash) the database, then restart it and make sure the contents of the table are still correct.
If you are reasonably rigorous about your testing, you should have pretty good confidence that your write-ahead logging is working correctly.
You will also find a file in schemas/writeahead/basic-tests.sql
. You can't run this file directly against NanoDB; it requires manual steps that aren't yet scriptable. But, you can use it to do some of the testing specified above.
One teammate should enable the transaction-processing properties (step 1) and commit this work right away, so that all teammates are able to work on the transaction code.
Once transactions are enabled, steps 2 through 6 can be completed in parallel. However, steps 5 and 6 probably shouldn't be started until your code is reliably generating write-ahead logs, because you will have nothing to test your code with.
Step 2 (add write-ahead logging to heap tuple files) is mostly straightforward, but requires careful review of all of the heap-file code. Leaving out logging from any operation may lead to confusing bugs. Additionally, the implementer must consider the interaction between logging and pinning when choosing where to log changes.
This is an important step to complete relatively early.
Step 3 (implement atomic force-WAL) is conceptually involved, but the actual code is pretty short. Attention must be given to both correctness and efficiency.
Step 4 (enforce the write-ahead logging rule with the buffer manager) is pretty straightforward and short to implement. It can call forceWAL()
before step 3 is completed (the API is already present even if its implementation is not), but it won't work until step 3 is done.
Step 5 (implement transaction rollback) will not work until step 2 is completed. Implementation can still be started before this point, of course.
Step 6 (implement redo/undo recovery processing) will not work until step 2 is completed. Implementation can still be started before this point, of course.
You may encounter very strange bugs during testing of this functionality, before steps 3 and 4 are working properly.
Of course, the entire team should work on step 7 (testing), as much as possible and as early as possible.
Once you are finished with the above tasks, your team needs to complete the design document for this assignment, available in your repository at "doc/lab7design.txt
".
When you are ready to submit your code, and your design document includes your team's answers to the design questions, you need to tag the version of your code that you want us to grade. Only one member of your team should perform these steps. If multiple people on your team do this, it will mess up your team's repository.
The instructions are the same as for the previous assignment, although you will use a hw7
tag this week. Again, note that if you decide to resubmit your work (e.g. because you found a bug that you needed to fix), you cannot reuse the same tag name again! Therefore, use an updated tag name "hw7-2
", "hw7-3
", etc. so that we can still find the submission. Also, don't forget to push the new tag to your submission repository.
Finally, one team member should submit this short document and submit it on the course website. Only one teammate should submit this document so that we don't get confused! You can list the Git repository hashes of your tags with this command:
git show-ref --tags