The Sparta Modeling Framework
|
Implements quick checkpointing through delta-checkpoint trees which store state-deltas in a compact format. State is retrieved through a sparta tree from ArchDatas associated with any TreeNodes.
#include <FastCheckpointer.hpp>
Public Types | |
typedef DeltaCheckpoint< storage::VectorStorage > | checkpoint_type |
Public Types inherited from sparta::serialization::checkpoint::Checkpointer | |
typedef Checkpoint::tick_t | tick_t |
tick_t Tick type to which checkpoints will refer | |
typedef Checkpoint::chkpt_id_t | chkpt_id_t |
chkpt_id_t Checkpoint ID type by which checkpoints are identified | |
Public Member Functions | |
Construction & Initialization | |
FastCheckpointer (TreeNode &root, Scheduler *sched=nullptr) | |
FastCheckpointer Constructor. | |
~FastCheckpointer () | |
Destructor. | |
Attributes | |
uint32_t | getSnapshotThreshold () const noexcept |
Returns the next-snapshot threshold. | |
void | setSnapshotThreshold (uint32_t thresh) noexcept |
Sets the snapshot threshold. | |
Checkpointing Actions & Queries | |
void | deleteCheckpoint (chkpt_id_t id) override |
Deletes a checkpoint by ID. | |
void | loadCheckpoint (chkpt_id_t id) override |
Loads state from a specific checkpoint by ID. | |
bool | checkpointExists (chkpt_id_t id) |
Queries a specific checkpoint by ID. | |
std::vector< chkpt_id_t > | getCheckpointsAt (tick_t t) const override |
Gets all checkpoints taken at tick t on any timeline. | |
std::vector< chkpt_id_t > | getCheckpoints () const override |
Gets all checkpoint IDs available on any timeline sorted by tick (or equivalently checkpoint ID). | |
uint32_t | getNumCheckpoints () const noexcept override |
Gets the current number of checkpoints having valid IDs. | |
uint32_t | getNumSnapshots () const noexcept |
Gets the current number of snapshots with valid IDs. | |
uint32_t | getNumDeltas () const noexcept |
Gets the current number of delta checkpoints with valid IDs. | |
uint32_t | getNumDeadCheckpoints () const noexcept |
Gets the current number of checkpoints (delta or snapshot) withOUT valid IDs. | |
std::deque< chkpt_id_t > | getCheckpointChain (chkpt_id_t id) const override |
Debugging utility which gets a deque of checkpoints representing a chain starting at the checkpoint head and ending at the checkpoint specified by id. The results can contain checkpoint_type::UNIDENTIFIED_CHECKPOINT to represent temporarily deleted checkpoints in the chain. | |
checkpoint_type * | findLatestCheckpointAtOrBefore (tick_t tick, chkpt_id_t from) override |
Finds the latest checkpoint at or before the given tick starting at the from checkpoint and working backward. If no checkpoints before or at tick are found, returns nullptr. | |
checkpoint_type * | findInternalCheckpoint (chkpt_id_t id) |
Gets a checkpoint through the findCheckpoint interface, cast to the Checkpoint subclass used by this class. | |
Printing Methods | |
std::string | stringize () const override |
Returns a string describing this object. | |
void | traceValue (std::ostream &o, chkpt_id_t id, const ArchData *container, uint32_t offset, uint32_t size) override |
Forwards debug/trace info onto checkpoint by ID. | |
Public Member Functions inherited from sparta::serialization::checkpoint::Checkpointer | |
Checkpointer (TreeNode &root, sparta::Scheduler *sched=nullptr) | |
Checkpointer Constructor. | |
virtual | ~Checkpointer () |
Destructor. | |
const TreeNode & | getRoot () const noexcept |
Returns the root associated with this checkpointer. | |
TreeNode & | getRoot () noexcept |
Non-const variant of getRoot. | |
const Scheduler * | getScheduler () const noexcept |
Returns the scheduler associated with this checkpointer. | |
uint64_t | getTotalMemoryUse () const noexcept |
Computes and returns the memory usage by this checkpointer at this moment including any framework overhead. | |
uint64_t | getContentMemoryUse () const noexcept |
Computes and returns the memory usage by this checkpointer at this moment purely for the checkpoint state being held. | |
uint64_t | getTotalCheckpointsCreated () const noexcept |
Returns the total number of checkpoints which have been created by this checkpointer. This is unrelated to the current number of checkpoints in existence. Includes the head checkpoint if created. | |
void | createHead () |
Creates a head without taking an identified checkpoint. Cannot already have a head. | |
chkpt_id_t | createCheckpoint (bool force_snapshot=false) |
Creates a checkpoint at the given scheduler's current tick with a new checkpoint ID some point after the current checkpoint (see getCurrentID). If the current checkpoint already has other next checkpoints, the new checkpoint will be an alternate branch of the current checkpoint. This checkpoint may be stored as a full snapshot if the checkpointer requires it, if the snapshot threshold is exceeded, or if the force_snapshot argument is true. The current tick will be read from the scheduler (if not null) and must be >= the head checkpoint's tick. The current tick must also be >= the current checkpoint's tick (see getCurrentTick). | |
void | forgetCurrent () |
Forgets the current checkpoint (resetting to the head checkpoint) so that checkpoints can be taken at a different time without assuming simulation state continuity with this checkpointer. This is ONLY to be used by a simulator IFF another checkpointer restores state at another cycle or the simulator resets but this checkpointer's tree is still expected to exist. | |
Checkpoint * | findCheckpoint (chkpt_id_t id) noexcept |
Finds a checkpoint by its ID. | |
virtual bool | hasCheckpoint (chkpt_id_t id) const noexcept |
Tests whether this checkpoint manager has a checkpoint with the given id. | |
const Checkpoint * | getHead () const noexcept |
Returns the head checkpoint which is equivalent to the earliest checkpoint taken. | |
chkpt_id_t | getHeadID () const noexcept |
Returns the checkpoint ID of the head checkpoint (if it exists) which is equivalent to the earliest checkpoint taken. | |
chkpt_id_t | getCurrentID () const |
Returns the current checkpoint ID. This is mainly a debugging utility as the current ID changes when adding, deleting, and loading checkpoints based on whether the checkpoints taken were deltas or snapshots. A correct integration of the checkpointer by a simulator should not depend on this method for behavior decisions. | |
tick_t | getCurrentTick () const |
Gets the tick number of the current checkpoint (see getCurrentID). This is the tick number of the latest checkpoint either saved or written through this checkpointer. The next checkpoint taken will be on the same chain as a checkpoint taken at this tick. | |
void | dumpList (std::ostream &o) const |
Dumps this checkpointer's flat list of checkpoints to an ostream with a newline following each checkpoint. | |
void | dumpData (std::ostream &o) const |
Dumps this checkpointer's data to an ostream with a newline following each checkpoint. | |
void | dumpAnnotatedData (std::ostream &o) const |
Dumps this checkpointer's data to an ostream with annotations between each ArchData and a newline following each checkpoint description and each checkpoint data dump. | |
void | dumpTree (std::ostream &o) const |
Dumps this checkpointer's tree to an ostream with a line for each branch. Printout timescale is not relevant. Multi-line printouts for deep branches will be difficult to read. | |
void | dumpBranch (std::ostream &o, const Checkpoint *chkpt, uint32_t indent, uint32_t pos, std::deque< uint32_t > &continues) const |
Recursively dumps one branch (and sub-branches) to an ostream with a line for each branch. | |
Protected Member Functions | |
void | cleanupChain_ (checkpoint_type *d) |
Delete given checkpoint and all contiguous previous checkpoints which can be deleted (See checkpoint_type::canDelete). This is the only place where checkpoint objects are actually freed (aside from destruction) and it ensures that they will not disrupt the checkpoint delta chains. All other deletion is simply flagging and re-identifying checkpoints. | |
bool | recursForwardFindAlive_ (checkpoint_type *d) const |
Look forward to see if any future checkpoints depend on d. | |
checkpoint_type * | findCheckpoint_ (chkpt_id_t id) noexcept override |
Attempts to find a checkpoint within this checkpointer by ID. | |
const checkpoint_type * | findCheckpoint_ (chkpt_id_t id) const noexcept override |
const variant of findCheckpoint_ | |
void | dumpCheckpointNode_ (const Checkpoint *chkpt, std::ostream &o) const override |
Implements Checkpointer::dumpCheckpointNode_. | |
Protected Member Functions inherited from sparta::serialization::checkpoint::Checkpointer | |
const std::vector< ArchData * > & | getArchDatas () const |
Returns ArchDatas enumerated by this Checkpointer for iteration when saving or loading checkpoint data. | |
Checkpoint * | getHead_ () noexcept |
Non-const variant of getHead_. | |
const Checkpoint * | getHead_ () const noexcept |
Gets the head checkpoint. Returns nullptr if none created yet. | |
void | setHead_ (Checkpoint *head) |
Sets the head checkpoint pointer to head for the first time. | |
Checkpoint * | getCurrent_ () const noexcept |
Gets the current checkpoint pointer. Returns nullptr if there is no current checkpoint object. | |
void | setCurrent_ (Checkpoint *current) |
Sets the current checkpoint pointer. | |
Additional Inherited Members | |
Protected Attributes inherited from sparta::serialization::checkpoint::Checkpointer | |
std::map< chkpt_id_t, std::unique_ptr< Checkpoint > > | chkpts_ |
All checkpoints sorted by ascending tick number (or equivalently ascending checkpoint ID since both are monotonically increasing) | |
Scheduler *const | sched_ |
Scheduler whose tick count will be set and read. Cannot be updated after the first checkpoint without bad side effects. Keeping this const for simplicity. | |
Implements quick checkpointing through delta-checkpoint trees which store state-deltas in a compact format. State is retrieved through a sparta tree from ArchDatas associated with any TreeNodes.
With the goal of checkpoint saving and loading speed, this class does not allow persistent checkpoint files (saved between sessions) because the data format is subject to change and is very sensitive to the exact device tree configuration.
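The difference between a snapshot and a delta can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `State`, `ToyCheckpoint`, `makeSnapshot`, `makeDelta`, and `restore` are hypothetical names invented for this example, the state is a flat byte vector rather than ArchData lines, and nothing here reflects how DeltaCheckpoint actually stores its data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical toy types; not Sparta's DeltaCheckpoint or ArchData.
using State = std::vector<uint8_t>;

struct ToyCheckpoint {
    const ToyCheckpoint* prev = nullptr;                 // previous checkpoint in the chain
    bool is_snapshot = false;                            // true: 'full' holds the whole state
    State full;                                          // used only by snapshots
    std::vector<std::pair<std::size_t, uint8_t>> delta;  // (offset, new value) pairs
};

// Snapshot: keep a full copy of the state.
ToyCheckpoint makeSnapshot(const State& s) {
    ToyCheckpoint c;
    c.is_snapshot = true;
    c.full = s;
    return c;
}

// Delta: keep only the bytes that changed since the previous checkpoint.
// The caller must keep 'prev' alive for as long as the result is used.
ToyCheckpoint makeDelta(const ToyCheckpoint& prev, const State& before, const State& after) {
    assert(before.size() == after.size());
    ToyCheckpoint c;
    c.prev = &prev;
    for (std::size_t i = 0; i < after.size(); ++i) {
        if (before[i] != after[i]) {
            c.delta.emplace_back(i, after[i]);
        }
    }
    return c;
}

// Restore: walk back to the nearest snapshot, then replay deltas oldest first.
State restore(const ToyCheckpoint& c) {
    std::vector<const ToyCheckpoint*> pending;
    const ToyCheckpoint* p = &c;
    while (!p->is_snapshot) {
        pending.push_back(p);
        p = p->prev;
        assert(p != nullptr && "every chain must begin at a snapshot");
    }
    State s = p->full;
    for (auto it = pending.rbegin(); it != pending.rend(); ++it) {
        for (const auto& change : (*it)->delta) {
            s[change.first] = change.second;
        }
    }
    return s;
}
```

In this model a delta costs memory proportional to the number of changed bytes, while restoring costs time proportional to the distance back to the nearest snapshot, which is the tradeoff the snapshot threshold controls.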
A checkpoint tree may look something like the following, where each checkpoint is shown by its simulation tick number (not ID):
t=0 (head/snapshot) --> t=100 +-> t=300
                              |
                              `-> t=320 --> t=400 +-> t=500
                              |                   `-> t=430
                              `-> t=300
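A tree like this arises when a checkpoint is loaded and new checkpoints are then taken from it, forming alternate branches. The following standalone sketch replays one sequence that yields the same shape. `ToyTree`, `Node`, and `buildDiagramTree` are hypothetical names for this example only; IDs are simply vector indices, and "loading" only moves the current pointer without restoring any state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical toy model of a checkpoint tree; not the FastCheckpointer itself.
struct Node {
    uint64_t tick;
    Node* prev;
    std::vector<Node*> nexts;
};

class ToyTree {
public:
    // The head checkpoint is created up front and gets ID 0.
    explicit ToyTree(uint64_t head_tick) { current_ = add(head_tick, nullptr); }

    // Adds a checkpoint after the current one and makes it current.
    // Returns its index, used here as a stand-in for a checkpoint ID.
    std::size_t create(uint64_t tick) {
        assert(tick >= current_->tick);  // ticks never go backward along a chain
        current_ = add(tick, current_);
        return nodes_.size() - 1;
    }

    // Moves the current pointer; the next create() branches from this node.
    void load(std::size_t id) { current_ = nodes_.at(id).get(); }

    std::size_t numBranches(std::size_t id) const { return nodes_.at(id)->nexts.size(); }

private:
    Node* add(uint64_t tick, Node* prev) {
        nodes_.push_back(std::unique_ptr<Node>(new Node{tick, prev, {}}));
        Node* n = nodes_.back().get();
        if (prev != nullptr) {
            prev->nexts.push_back(n);
        }
        return n;
    }

    std::vector<std::unique_ptr<Node>> nodes_;
    Node* current_ = nullptr;
};

// Replays a sequence of create/load calls that produces the tree drawn above.
ToyTree buildDiagramTree(std::size_t& id_t100, std::size_t& id_t400) {
    ToyTree tree(0);                // t=0 (head)
    id_t100 = tree.create(100);     // t=0 --> t=100
    tree.create(300);               // first branch from t=100
    tree.load(id_t100);
    tree.create(320);               // second branch from t=100
    id_t400 = tree.create(400);
    tree.create(500);               // first branch from t=400
    tree.load(id_t400);
    tree.create(430);               // second branch from t=400
    tree.load(id_t100);
    tree.create(300);               // third branch from t=100
    return tree;
}
```

After the replay, the node at t=100 has three next checkpoints and the node at t=400 has two, matching the branch points in the diagram.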
The procedure for using the FastCheckpointer is generally:
Then:
Todo:
Implement reverse delta storage for backward checkpoint loading
Tune ArchData line size based on checkpointer performance
More profiling
Compression
Saving to disk using a templated checkpoint object storage class (allowing for non-binary)
Definition at line 65 of file FastCheckpointer.hpp.
typedef DeltaCheckpoint<storage::VectorStorage> sparta::serialization::checkpoint::FastCheckpointer::checkpoint_type |
Definition at line 69 of file FastCheckpointer.hpp.
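sparta::serialization::checkpoint::FastCheckpointer::FastCheckpointer (TreeNode &root, Scheduler *sched=nullptr) |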
inline |
FastCheckpointer Constructor.
root | TreeNode at which checkpoints will be taken. This cannot be changed later. This does not necessarily need to be a RootTreeNode. Before the first checkpoint is taken, this node must be finalized (see sparta::TreeNode::isFinalized). At construction, the node does not yet need to be finalized. |
sched | Scheduler to read and restart on checkpoint restore (if not nullptr) |
Definition at line 90 of file FastCheckpointer.hpp.
sparta::serialization::checkpoint::FastCheckpointer::~FastCheckpointer () |
inline |
Destructor.
Frees all checkpoint data
Definition at line 104 of file FastCheckpointer.hpp.
bool sparta::serialization::checkpoint::FastCheckpointer::checkpointExists (chkpt_id_t id) |
inline |
Queries a specific checkpoint by ID.
Definition at line 238 of file FastCheckpointer.hpp.
void sparta::serialization::checkpoint::FastCheckpointer::cleanupChain_ (checkpoint_type *d) |
inline protected |
Delete given checkpoint and all contiguous previous checkpoints which can be deleted (See checkpoint_type::canDelete). This is the only place where checkpoint objects are actually freed (aside from destruction) and it ensures that they will not disrupt the checkpoint delta chains. All other deletion is simply flagging and re-identifying checkpoints.
d | Checkpoint to attempt to delete first. Function will then move through each previous checkpoint until reaching head. |
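The backward walk described here can be sketched with a standalone toy model. The source does not spell out what checkpoint_type::canDelete tests, so the rule below (flagged as deleted and with no remaining next checkpoints) is purely an assumption for illustration, and `Node`, `append`, `canDelete`, and `cleanupChain` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy checkpoint node; not Sparta's checkpoint_type.
struct Node {
    bool flagged_deleted = false;  // "deleted" by ID, but data still held
    Node* prev = nullptr;
    std::vector<Node*> nexts;
};

// Allocates a node after 'prev' and links it in. The caller owns the result.
Node* append(Node* prev) {
    Node* n = new Node;
    n->prev = prev;
    prev->nexts.push_back(n);
    return n;
}

// ASSUMED rule for this sketch: freeable once flagged and nothing follows it.
bool canDelete(const Node& n) {
    return n.flagged_deleted && n.nexts.empty();
}

// Frees 'd' and each contiguous previous node that becomes freeable,
// stopping at the head or at the first node that must stay.
// Returns how many nodes were actually freed.
std::size_t cleanupChain(Node* d, Node* head) {
    std::size_t freed = 0;
    while (d != nullptr && d != head && canDelete(*d)) {
        Node* prev = d->prev;
        assert(prev != nullptr);  // every non-head node has a previous node
        std::vector<Node*>& siblings = prev->nexts;
        siblings.erase(std::remove(siblings.begin(), siblings.end(), d), siblings.end());
        delete d;
        ++freed;
        d = prev;
    }
    return freed;
}
```

With a chain head, a, b, c, flagging only b frees nothing because c still follows it. Flagging c and cleaning up from c then frees both c and b in one pass and stops at a.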
Definition at line 439 of file FastCheckpointer.hpp.
void sparta::serialization::checkpoint::FastCheckpointer::deleteCheckpoint (chkpt_id_t id) |
inline override virtual |
Deletes a checkpoint by ID.
id | ID of checkpoint to delete. Must not be checkpoint_type::UNIDENTIFIED_CHECKPOINT and must not be equal to the ID of the head checkpoint. |
CheckpointError | if this manager has no checkpoint with the given id (test with hasCheckpoint first). Always throws if id == checkpoint_type::UNIDENTIFIED_CHECKPOINT. Also throws if id == getHeadID(), because the head cannot be deleted. |
Internally, this deletion may be effective-only and actual data may still exist in an inaccessible form as part of the checkpoint delta-tree implementation.
If the current checkpoint is deleted, current will be updated back along the current checkpoint's previous-delta chain until a non-deleted checkpoint is found. This will become the new current checkpoint.
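The effective-only deletion and the update of the current checkpoint can be pictured with a small standalone sketch. `ToyChkpt`, `UNIDENTIFIED`, and `deleteAndUpdateCurrent` are hypothetical names; the constant is only a stand-in for checkpoint_type::UNIDENTIFIED_CHECKPOINT (whose real value is not given here), and asserts stand in for the CheckpointError exceptions of the real method.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Stand-in for checkpoint_type::UNIDENTIFIED_CHECKPOINT (value assumed).
constexpr uint64_t UNIDENTIFIED = std::numeric_limits<uint64_t>::max();

// Hypothetical toy checkpoint; not Sparta's checkpoint_type.
struct ToyChkpt {
    uint64_t id;
    ToyChkpt* prev;
};

// Flags 'victim' as deleted by dropping its ID (its data is not freed here),
// then returns the checkpoint that should be current afterwards.
ToyChkpt* deleteAndUpdateCurrent(ToyChkpt* head, ToyChkpt* current, ToyChkpt* victim) {
    assert(victim != head && "the head cannot be deleted");
    assert(victim->id != UNIDENTIFIED && "already deleted");
    victim->id = UNIDENTIFIED;
    // Walk back along the previous chain until a non-deleted checkpoint is found.
    // The head is never deleted, so the walk always terminates.
    while (current->id == UNIDENTIFIED) {
        current = current->prev;
    }
    return current;
}
```

Deleting a checkpoint in the middle of the chain leaves current untouched, while deleting the current checkpoint moves current back past any already-deleted predecessors.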
Implements sparta::serialization::checkpoint::Checkpointer.
Definition at line 169 of file FastCheckpointer.hpp.
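void sparta::serialization::checkpoint::FastCheckpointer::dumpCheckpointNode_ (const Checkpoint *chkpt, std::ostream &o) const |
inline override protected virtual |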
Implements Checkpointer::dumpCheckpointNode_.
Reimplemented from sparta::serialization::checkpoint::Checkpointer.
Definition at line 578 of file FastCheckpointer.hpp.
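const checkpoint_type * sparta::serialization::checkpoint::FastCheckpointer::findCheckpoint_ (chkpt_id_t id) const |
inline override protected virtual noexcept |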
const variant of findCheckpoint_
Implements sparta::serialization::checkpoint::Checkpointer.
Definition at line 567 of file FastCheckpointer.hpp.
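checkpoint_type * sparta::serialization::checkpoint::FastCheckpointer::findCheckpoint_ (chkpt_id_t id) |
inline override protected virtual noexcept |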
Attempts to find a checkpoint within this checkpointer by ID.
id | Checkpoint ID to search for |
Implements sparta::serialization::checkpoint::Checkpointer.
Definition at line 556 of file FastCheckpointer.hpp.
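checkpoint_type * sparta::serialization::checkpoint::FastCheckpointer::findInternalCheckpoint (chkpt_id_t id) |
inline |
Gets a checkpoint through the findCheckpoint interface, cast to the Checkpoint subclass used by this class.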
Definition at line 387 of file FastCheckpointer.hpp.
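checkpoint_type * sparta::serialization::checkpoint::FastCheckpointer::findLatestCheckpointAtOrBefore (tick_t tick, chkpt_id_t from) |
inline override virtual |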
Finds the latest checkpoint at or before the given tick starting at the from checkpoint and working backward. If no checkpoints before or at tick are found, returns nullptr.
tick | Tick to search for |
from | Checkpoint at which to begin searching for a tick. Must be a valid checkpoint known by this checkpointer. See hasCheckpoint. |
CheckpointError | if from does not refer to a valid checkpoint. |
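The backward search can be sketched in a few lines on a standalone toy chain. `ToyChkpt` and `findLatestAtOrBefore` are hypothetical names for this example; the sketch takes a pointer instead of an ID, uses an assert where the real method throws CheckpointError, and ignores deleted checkpoints entirely.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy checkpoint; not Sparta's checkpoint_type.
struct ToyChkpt {
    uint64_t tick;
    const ToyChkpt* prev;
};

// Walks backward from 'from' and returns the first checkpoint whose tick is
// at or before 'tick'. Because ticks never decrease along a chain, the first
// match found while walking backward is also the latest one.
const ToyChkpt* findLatestAtOrBefore(uint64_t tick, const ToyChkpt* from) {
    assert(from != nullptr);
    for (const ToyChkpt* c = from; c != nullptr; c = c->prev) {
        if (c->tick <= tick) {
            return c;
        }
    }
    return nullptr;  // nothing at or before 'tick' on this chain
}
```

On a chain with ticks 10, 100, 320, 400, 500, searching from the last checkpoint for tick 450 yields the checkpoint at 400, and searching for tick 5 yields nullptr.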
Implements sparta::serialization::checkpoint::Checkpointer.
Definition at line 364 of file FastCheckpointer.hpp.
std::deque< chkpt_id_t > sparta::serialization::checkpoint::FastCheckpointer::getCheckpointChain (chkpt_id_t id) const |
inline override virtual |
Debugging utility which gets a deque of checkpoints representing a chain starting at the checkpoint head and ending at the checkpoint specified by id. The results can contain checkpoint_type::UNIDENTIFIED_CHECKPOINT to represent temporarily deleted checkpoints in the chain.
id | ID of checkpoint that terminates the chain |
CheckpointError | if id does not refer to a valid checkpoint. |
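A standalone sketch of how such a chain could be assembled, including a placeholder entry for a deleted checkpoint, is given below. `ToyChkpt`, `UNIDENTIFIED`, and `checkpointChain` are hypothetical names; the constant is a stand-in for checkpoint_type::UNIDENTIFIED_CHECKPOINT with an assumed value, and the head-first ordering follows the description above rather than an inspection of the implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <limits>

// Stand-in for checkpoint_type::UNIDENTIFIED_CHECKPOINT (value assumed).
constexpr uint64_t UNIDENTIFIED = std::numeric_limits<uint64_t>::max();

// Hypothetical toy checkpoint; not Sparta's checkpoint_type.
struct ToyChkpt {
    uint64_t id;
    const ToyChkpt* prev;
};

// Collects IDs from the head (front) to 'last' (back) by walking the
// previous pointers and pushing each ID onto the front of the deque.
// Deleted checkpoints still occupy a slot, reported as UNIDENTIFIED.
std::deque<uint64_t> checkpointChain(const ToyChkpt* last) {
    assert(last != nullptr);
    std::deque<uint64_t> chain;
    for (const ToyChkpt* c = last; c != nullptr; c = c->prev) {
        chain.push_front(c->id);
    }
    return chain;
}
```

For a chain whose third checkpoint has been deleted, the result has four entries with the placeholder in the third position, so the chain length still reflects the data that must be replayed on load.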
Implements sparta::serialization::checkpoint::Checkpointer.
Definition at line 331 of file FastCheckpointer.hpp.
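std::vector< chkpt_id_t > sparta::serialization::checkpoint::FastCheckpointer::getCheckpoints () const |
inline override virtual |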
Gets all checkpoint IDs available on any timeline sorted by tick (or equivalently checkpoint ID).
Implements sparta::serialization::checkpoint::Checkpointer.
Definition at line 275 of file FastCheckpointer.hpp.
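std::vector< chkpt_id_t > sparta::serialization::checkpoint::FastCheckpointer::getCheckpointsAt (tick_t t) const |
inline override virtual |
Gets all checkpoints taken at tick t on any timeline.
t | Tick number at which checkpoints should be found. |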
Implements sparta::serialization::checkpoint::Checkpointer.
Definition at line 255 of file FastCheckpointer.hpp.
uint32_t sparta::serialization::checkpoint::FastCheckpointer::getNumCheckpoints () const |
inline override virtual noexcept |
Gets the current number of checkpoints having valid IDs.
Implements sparta::serialization::checkpoint::Checkpointer.
Definition at line 290 of file FastCheckpointer.hpp.
uint32_t sparta::serialization::checkpoint::FastCheckpointer::getNumDeadCheckpoints () const |
inline noexcept |
Gets the current number of checkpoints (delta or snapshot) withOUT valid IDs.
Definition at line 312 of file FastCheckpointer.hpp.
uint32_t sparta::serialization::checkpoint::FastCheckpointer::getNumDeltas () const |
inline noexcept |
Gets the current number of delta checkpoints with valid IDs.
Definition at line 304 of file FastCheckpointer.hpp.
uint32_t sparta::serialization::checkpoint::FastCheckpointer::getNumSnapshots () const |
inline noexcept |
Gets the current number of snapshots with valid IDs.
Definition at line 297 of file FastCheckpointer.hpp.
uint32_t sparta::serialization::checkpoint::FastCheckpointer::getSnapshotThreshold () const |
inline noexcept |
Returns the next-snapshot threshold.
This represents the distance between two checkpoints required for the checkpointer to automatically place a snapshot checkpoint instead of a delta. A threshold of 0 or 1 results in all checkpoints being snapshots. A value of 10 results in every 10th checkpoint being a snapshot. Explicit snapshot creation using createCheckpoint can interrupt and restart this pattern.
This value is a performance/space tradeoff knob.
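One possible reading of this rule is sketched below as a standalone simulation. `placeSnapshot` and `simulatePattern` are hypothetical helpers written for this example, and the exact comparison used inside createCheckpoint may differ; the sketch only reproduces the behavior stated above (0 or 1 gives all snapshots, N gives every Nth checkpoint as a snapshot).

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// ASSUMED decision rule for this sketch. 'prev_distance' is the number of
// deltas between the previous checkpoint and its nearest snapshot
// (0 if the previous checkpoint is itself a snapshot).
bool placeSnapshot(uint32_t prev_distance, uint32_t threshold, bool force_snapshot) {
    if (force_snapshot || threshold <= 1) {
        return true;
    }
    return prev_distance >= threshold - 1;
}

// Simulates 'n' checkpoints taken after a head snapshot on a single chain.
// Returns one character per checkpoint: 'S' for snapshot, 'D' for delta.
std::string simulatePattern(uint32_t n, uint32_t threshold) {
    std::string pattern = "S";  // the head is a snapshot
    uint32_t distance = 0;
    for (uint32_t i = 0; i < n; ++i) {
        if (placeSnapshot(distance, threshold, false)) {
            pattern += 'S';
            distance = 0;
        } else {
            pattern += 'D';
            ++distance;
        }
    }
    assert(pattern.size() == static_cast<std::string::size_type>(n) + 1);
    return pattern;
}
```

Under this reading, a threshold of 3 gives the pattern SDDSDDS for six checkpoints after the head. A larger threshold saves memory by storing more deltas, at the cost of replaying longer delta chains on load.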
Definition at line 133 of file FastCheckpointer.hpp.
void sparta::serialization::checkpoint::FastCheckpointer::loadCheckpoint (chkpt_id_t id) |
inline override virtual |
Loads state from a specific checkpoint by ID.
CheckpointError | if id does not refer to a checkpoint that exists or if the checkpoint could not be loaded. |
Implements sparta::serialization::checkpoint::Checkpointer.
Definition at line 208 of file FastCheckpointer.hpp.
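bool sparta::serialization::checkpoint::FastCheckpointer::recursForwardFindAlive_ (checkpoint_type *d) const |
inline protected |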
Look forward to see if any future checkpoints depend on d.
d | checkpoint to inspect and recursively search |
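A recursive forward search of this kind might look like the standalone sketch below. The notion of "depends on" used here is an assumption made for illustration: a later delta depends on d unless a snapshot sits between them, and only checkpoints not flagged as deleted count as alive. `Node` and `forwardFindAlive` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy checkpoint; not Sparta's checkpoint_type.
struct Node {
    bool is_snapshot;
    bool flagged_deleted;
    std::vector<const Node*> nexts;
};

// Returns true if any later checkpoint still needs d's data (ASSUMED rule):
//  - a snapshot is self-contained, so nothing behind it depends on d;
//  - a live delta depends on d directly;
//  - a deleted delta is searched recursively for live descendants.
bool forwardFindAlive(const Node& d) {
    for (const Node* next : d.nexts) {
        assert(next != nullptr);
        if (next->is_snapshot) {
            continue;
        }
        if (!next->flagged_deleted) {
            return true;
        }
        if (forwardFindAlive(*next)) {
            return true;
        }
    }
    return false;
}
```

A deleted checkpoint followed by a deleted delta and then a live delta is still needed, while one followed only by a snapshot or by dead-end deleted deltas is not.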
Definition at line 517 of file FastCheckpointer.hpp.
void sparta::serialization::checkpoint::FastCheckpointer::setSnapshotThreshold (uint32_t thresh) |
inline noexcept |
Sets the snapshot threshold.
Definition at line 139 of file FastCheckpointer.hpp.
std::string sparta::serialization::checkpoint::FastCheckpointer::stringize () const |
inline override virtual |
Returns a string describing this object.
Reimplemented from sparta::serialization::checkpoint::Checkpointer.
Definition at line 401 of file FastCheckpointer.hpp.
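void sparta::serialization::checkpoint::FastCheckpointer::traceValue (std::ostream &o, chkpt_id_t id, const ArchData *container, uint32_t offset, uint32_t size) |
inline override virtual |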
Forwards debug/trace info onto checkpoint by ID.
Implements sparta::serialization::checkpoint::Checkpointer.
Definition at line 410 of file FastCheckpointer.hpp.