The Sparta Modeling Framework
sparta::serialization::checkpoint::FastCheckpointer Class Reference

Implements quick checkpointing through delta-checkpoint trees which store state-deltas in a compact format. State is retrieved through a sparta tree from ArchDatas associated with any TreeNodes.

#include <FastCheckpointer.hpp>


Public Types

typedef DeltaCheckpoint< storage::VectorStorage > checkpoint_type
 
- Public Types inherited from sparta::serialization::checkpoint::Checkpointer
typedef Checkpoint::tick_t tick_t
 tick_t Tick type to which checkpoints will refer
 
typedef Checkpoint::chkpt_id_t chkpt_id_t
 chkpt_id_t ID type by which checkpoints are identified
 

Public Member Functions

Construction & Initialization
 FastCheckpointer (TreeNode &root, Scheduler *sched=nullptr)
 FastCheckpointer Constructor.
 
 ~FastCheckpointer ()
 Destructor.
 
Attributes
uint32_t getSnapshotThreshold () const noexcept
 Returns the next-snapshot threshold.
 
void setSnapshotThreshold (uint32_t thresh) noexcept
 Sets the snapshot threshold.
 
Checkpointing Actions & Queries
void deleteCheckpoint (chkpt_id_t id) override
 Deletes a checkpoint by ID.
 
void loadCheckpoint (chkpt_id_t id) override
 Loads state from a specific checkpoint by ID.
 
bool checkpointExists (chkpt_id_t id)
 Queries a specific checkpoint by ID.
 
std::vector< chkpt_id_t > getCheckpointsAt (tick_t t) const override
 Gets all checkpoints taken at tick t on any timeline.
 
std::vector< chkpt_id_t > getCheckpoints () const override
 Gets all checkpoint IDs available on any timeline sorted by tick (or equivalently checkpoint ID).
 
uint32_t getNumCheckpoints () const noexcept override
 Gets the current number of checkpoints having valid IDs.
 
uint32_t getNumSnapshots () const noexcept
 Gets the current number of snapshots with valid IDs.
 
uint32_t getNumDeltas () const noexcept
 Gets the current number of delta checkpoints with valid IDs.
 
uint32_t getNumDeadCheckpoints () const noexcept
 Gets the current number of checkpoints (delta or snapshot) withOUT valid IDs.
 
std::deque< chkpt_id_t > getCheckpointChain (chkpt_id_t id) const override
 Debugging utility which gets a deque of checkpoints representing a chain starting at the checkpoint head and ending at the checkpoint specified by id. The results can contain checkpoint_type::UNIDENTIFIED_CHECKPOINT to represent temporarily deleted checkpoints in the chain.
 
checkpoint_type * findLatestCheckpointAtOrBefore (tick_t tick, chkpt_id_t from) override
 Finds the latest checkpoint at or before the given tick starting at the from checkpoint and working backward. If no checkpoints before or at tick are found, returns nullptr.
 
checkpoint_type * findInternalCheckpoint (chkpt_id_t id)
 Gets a checkpoint through findCheckpoint interface casted to the type of Checkpoint subclass used by this class.
 
Printing Methods
std::string stringize () const override
 Returns a string describing this object.
 
void traceValue (std::ostream &o, chkpt_id_t id, const ArchData *container, uint32_t offset, uint32_t size) override
 Forwards debug/trace info onto checkpoint by ID.
 
- Public Member Functions inherited from sparta::serialization::checkpoint::Checkpointer
 Checkpointer (TreeNode &root, sparta::Scheduler *sched=nullptr)
 Checkpointer Constructor.
 
virtual ~Checkpointer ()
 Destructor.
 
const TreeNode & getRoot () const noexcept
 Returns the root associated with this checkpointer.
 
TreeNode & getRoot () noexcept
 Non-const variant of getRoot.
 
const Scheduler * getScheduler () const noexcept
 Returns the scheduler associated with this checkpointer.
 
uint64_t getTotalMemoryUse () const noexcept
 Computes and returns the memory usage by this checkpointer at this moment including any framework overhead.
 
uint64_t getContentMemoryUse () const noexcept
 Computes and returns the memory usage by this checkpointer at this moment purely for the checkpoint state being held.
 
uint64_t getTotalCheckpointsCreated () const noexcept
 Returns the total number of checkpoints which have been created by this checkpointer. This is unrelated to the current number of checkpoints in existence. Includes the head checkpoint if created.
 
void createHead ()
 Creates a head without taking an identified checkpoint. Cannot already have a head.
 
chkpt_id_t createCheckpoint (bool force_snapshot=false)
 Creates a checkpoint at the given scheduler's current tick with a new checkpoint ID at some point after the current checkpoint (see getCurrentID). If the current checkpoint already has other next checkpoints, the new checkpoint will be an alternate branch of the current checkpoint. This checkpoint may be stored as a full snapshot if the checkpointer requires it, if the snapshot threshold is exceeded, or if the force_snapshot argument is true. The current tick will be read from the scheduler (if not null) and must be >= the head checkpoint's tick. The current tick must also be >= the current checkpoint's tick (see getCurrentTick).
 
void forgetCurrent ()
 Forgets the current checkpoint (resetting to the head checkpoint) so that checkpoints can be taken at a different time without assuming simulation state continuity with this checkpointer. This is ONLY to be used by a simulator if and only if another checkpointer restores state at another cycle or the simulator resets, but this checkpointer's tree is still expected to exist.
 
Checkpoint * findCheckpoint (chkpt_id_t id) noexcept
 Finds a checkpoint by its ID.
 
virtual bool hasCheckpoint (chkpt_id_t id) const noexcept
 Tests whether this checkpoint manager has a checkpoint with the given id.
 
const Checkpoint * getHead () const noexcept
 Returns the head checkpoint which is equivalent to the earliest checkpoint taken.
 
chkpt_id_t getHeadID () const noexcept
 Returns the checkpoint ID of the head checkpoint (if it exists) which is equivalent to the earliest checkpoint taken.
 
chkpt_id_t getCurrentID () const
 Returns the current checkpoint ID. This is mainly a debugging utility as the current ID changes when adding, deleting, and loading checkpoints based on whether the checkpoints taken were deltas or snapshots. A correct integration of the checkpointer by a simulator should not depend on this method for behavior decisions.
 
tick_t getCurrentTick () const
 Gets the tick number of the current checkpoint (see getCurrentID). This is the tick number of the latest checkpoint either saved or written through this checkpointer. The next checkpoint taken will be on the same chain as a checkpoint taken at this tick.
 
void dumpList (std::ostream &o) const
 Dumps this checkpointer's flat list of checkpoints to an ostream with a newline following each checkpoint.
 
void dumpData (std::ostream &o) const
 Dumps this checkpointer's data to an ostream with a newline following each checkpoint.
 
void dumpAnnotatedData (std::ostream &o) const
 Dumps this checkpointer's data to an ostream with annotations between each ArchData and a newline following each checkpoint description and each checkpoint data dump.
 
void dumpTree (std::ostream &o) const
 Dumps this checkpointer's tree to an ostream with a line for each branch. Printout timescale is not relevant. Multi-line printouts for deep branches will be difficult to read.
 
void dumpBranch (std::ostream &o, const Checkpoint *chkpt, uint32_t indent, uint32_t pos, std::deque< uint32_t > &continues) const
 Recursively dumps one branch (and sub-branches) to an ostream with a line for each branch.
 

Protected Member Functions

void cleanupChain_ (checkpoint_type *d)
 Delete given checkpoint and all contiguous previous checkpoints which can be deleted (See checkpoint_type::canDelete). This is the only place where checkpoint objects are actually freed (aside from destruction) and it ensures that they will not disrupt the checkpoint delta chains. All other deletion is simply flagging and re-identifying checkpoints.
 
bool recursForwardFindAlive_ (checkpoint_type *d) const
 Look forward to see if any future checkpoints depend on d.
 
checkpoint_type * findCheckpoint_ (chkpt_id_t id) noexcept override
 Attempts to find a checkpoint within this checkpointer by ID.
 
const checkpoint_type * findCheckpoint_ (chkpt_id_t id) const noexcept override
 const variant of findCheckpoint_
 
void dumpCheckpointNode_ (const Checkpoint *chkpt, std::ostream &o) const override
 Implements Checkpointer::dumpCheckpointNode_.
 
- Protected Member Functions inherited from sparta::serialization::checkpoint::Checkpointer
const std::vector< ArchData * > & getArchDatas () const
 Returns ArchDatas enumerated by this Checkpointer for iteration when saving or loading checkpoint data.
 
Checkpoint * getHead_ () noexcept
 Non-const variant of getHead_.
 
const Checkpoint * getHead_ () const noexcept
 Gets the head checkpoint. Returns nullptr if none created yet.
 
void setHead_ (Checkpoint *head)
 Sets the head checkpoint pointer to head for the first time.
 
Checkpoint * getCurrent_ () const noexcept
 Gets the current checkpoint pointer. Returns nullptr if there is no current checkpoint object.
 
void setCurrent_ (Checkpoint *current)
 Sets the current checkpoint pointer.
 

Additional Inherited Members

- Protected Attributes inherited from sparta::serialization::checkpoint::Checkpointer
std::map< chkpt_id_t, std::unique_ptr< Checkpoint > > chkpts_
 All checkpoints sorted by ascending tick number (or equivalently ascending checkpoint ID since both are monotonically increasing)
 
Scheduler *const sched_
 Scheduler whose tick count will be set and read. Cannot be updated after first checkpoint without bad side effects. Keeping this const for simplicity.
 

Detailed Description

Implements quick checkpointing through delta-checkpoint trees which store state-deltas in a compact format. State is retrieved through a sparta tree from ArchDatas associated with any TreeNodes.
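
As a rough illustration of the delta idea described above, the following sketch keeps a full copy of the state in a snapshot and only the changed bytes in each delta. Restoring walks back to the nearest snapshot and replays the deltas forward. This is a toy model and not Sparta's DeltaCheckpoint or its storage format; `ToyCheckpoint`, `makeSnapshot`, `makeDelta`, and `restore` are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Toy model only: not Sparta's DeltaCheckpoint or its storage format.
using State = std::vector<uint8_t>;

struct ToyCheckpoint {
    const ToyCheckpoint* prev = nullptr;  // previous checkpoint in the chain
    bool is_snapshot = false;             // snapshots hold the full state
    State full;                           // used only when is_snapshot is true
    std::map<std::size_t, uint8_t> diff;  // offset -> new byte, used by deltas
};

// Stores a full copy of the state.
ToyCheckpoint makeSnapshot(const State& now, const ToyCheckpoint* prev = nullptr) {
    ToyCheckpoint c;
    c.prev = prev;
    c.is_snapshot = true;
    c.full = now;
    return c;
}

// Stores only the bytes that differ from the state at the previous checkpoint.
ToyCheckpoint makeDelta(const State& before, const State& now, const ToyCheckpoint* prev) {
    assert(before.size() == now.size());
    ToyCheckpoint c;
    c.prev = prev;
    for (std::size_t i = 0; i < now.size(); ++i) {
        if (before[i] != now[i]) {
            c.diff[i] = now[i];
        }
    }
    return c;
}

// Rebuilds state: walk back to the nearest snapshot, then replay deltas forward.
State restore(const ToyCheckpoint* target) {
    std::vector<const ToyCheckpoint*> chain;
    const ToyCheckpoint* c = target;
    while (c != nullptr && !c->is_snapshot) {
        chain.push_back(c);
        c = c->prev;
    }
    assert(c != nullptr && "chain must start at a snapshot");
    State out = c->full;
    for (auto it = chain.rbegin(); it != chain.rend(); ++it) {
        for (const auto& kv : (*it)->diff) {
            out[kv.first] = kv.second;
        }
    }
    return out;
}
```

In this model, the cost of restoring grows with the number of deltas between the target and its nearest snapshot, which is the tradeoff a snapshot threshold is meant to bound.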

With the goal of checkpoint saving and loading speed, this class does not allow persistent checkpoint files (saved between sessions) because the data format is subject to change and very sensitive to the exact device tree configuration.

A checkpoint tree may look something like the following, where each checkpoint is shown by its simulation tick number (not ID)

t=0 (head/snapshot) --> t=100 +-> t=300
                    |
                    `-> t=320 --> t=400 +-> t=500
                    |                   `-> t=430
                    `-> t=300

The procedure for using the FastCheckpointer is generally:

Then:

Todo:

Implement reverse delta storage for backward checkpoint loading

Tune ArchData line size based on checkpointer performance

More profiling

Compression

Saving to disk using a templated checkpoint object storage class (allowing for non-binary)

Definition at line 65 of file FastCheckpointer.hpp.

Member Typedef Documentation

◆ checkpoint_type

Constructor & Destructor Documentation

◆ FastCheckpointer()

sparta::serialization::checkpoint::FastCheckpointer::FastCheckpointer ( TreeNode & root,
Scheduler * sched = nullptr 
)
inline

FastCheckpointer Constructor.

Parameters
root: TreeNode at which checkpoints will be taken. This cannot be changed later. This does not necessarily need to be a RootTreeNode. Before the first checkpoint is taken, this node must be finalized (see sparta::TreeNode::isFinalized). At construction time, the node does not need to be finalized.
sched: Scheduler to read and restart on checkpoint restore (if not nullptr).

Definition at line 90 of file FastCheckpointer.hpp.

◆ ~FastCheckpointer()

sparta::serialization::checkpoint::FastCheckpointer::~FastCheckpointer ( )
inline

Destructor.

Frees all checkpoint data

Definition at line 104 of file FastCheckpointer.hpp.


Member Function Documentation

◆ checkpointExists()

bool sparta::serialization::checkpoint::FastCheckpointer::checkpointExists ( chkpt_id_t  id)
inline

Queries a specific checkpoint by ID.

Definition at line 238 of file FastCheckpointer.hpp.


◆ cleanupChain_()

void sparta::serialization::checkpoint::FastCheckpointer::cleanupChain_ ( checkpoint_type * d)
inline protected

Delete given checkpoint and all contiguous previous checkpoints which can be deleted (See checkpoint_type::canDelete). This is the only place where checkpoint objects are actually freed (aside from destruction) and it ensures that they will not disrupt the checkpoint delta chains. All other deletion is simply flagging and re-identifying checkpoints.

Parameters
d: Checkpoint to attempt to delete first. Function will then move through each previous checkpoint until reaching head.
Postcondition
Head checkpoint will never be deleted by this function
Note
Never flags any new checkpoints as deleted
Todo:
Support compression
Todo:
canDelete is recursive at worst and might benefit from optimization

Definition at line 439 of file FastCheckpointer.hpp.


◆ deleteCheckpoint()

void sparta::serialization::checkpoint::FastCheckpointer::deleteCheckpoint ( chkpt_id_t  id)
inline override virtual

Deletes a checkpoint by ID.

Parameters
id: ID of checkpoint to delete. Must not be checkpoint_type::UNIDENTIFIED_CHECKPOINT and must not be equal to the ID of the head checkpoint.
Exceptions
CheckpointError: if this manager has no checkpoint with given id. Test with hasCheckpoint first. If id == checkpoint_type::UNIDENTIFIED_CHECKPOINT, always throws. Throws if id == getHeadID(). Head cannot be deleted.

Internally, this deletion may be effective-only and actual data may still exist in an inaccessible form as part of the checkpoint delta-tree implementation.

If the current checkpoint is deleted, current will be updated back along the current checkpoint's previous-delta chain until a non-deleted checkpoint is found. This will become the new current checkpoint.
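
The fallback of the current checkpoint can be pictured with a small toy model. The sketch below only flags a node as deleted and then moves a "current" pointer back along the previous chain until it reaches a node that is not flagged. `ToyNode` and `flagDeleted` are hypothetical names; this is an assumption-laden illustration of the documented behavior, not the code in FastCheckpointer.hpp.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: a hypothetical stand-in for a checkpoint node.
struct ToyNode {
    uint64_t id;
    ToyNode* prev;  // previous checkpoint in the delta chain (nullptr for the head)
    bool deleted;   // flagged as deleted; the data may still be held internally
};

// Flags `victim` as deleted and returns the (possibly updated) current node.
// If current itself is flagged, walk back to the first non-deleted ancestor.
ToyNode* flagDeleted(ToyNode* victim, ToyNode* current) {
    assert(victim != nullptr && victim->prev != nullptr && "head cannot be deleted");
    victim->deleted = true;
    while (current != nullptr && current->deleted) {
        current = current->prev;
    }
    return current;
}
```

Because the head is never flagged in this model, the walk always ends at a valid node.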

Implements sparta::serialization::checkpoint::Checkpointer.

Definition at line 169 of file FastCheckpointer.hpp.


◆ dumpCheckpointNode_()

void sparta::serialization::checkpoint::FastCheckpointer::dumpCheckpointNode_ ( const Checkpoint * chkpt,
std::ostream &  o 
) const
inline override protected virtual

Implements Checkpointer::dumpCheckpointNode_.

Reimplemented from sparta::serialization::checkpoint::Checkpointer.

Definition at line 578 of file FastCheckpointer.hpp.


◆ findCheckpoint_() [1/2]

const checkpoint_type * sparta::serialization::checkpoint::FastCheckpointer::findCheckpoint_ ( chkpt_id_t  id) const
inline override protected virtual noexcept

const variant of findCheckpoint_

Implements sparta::serialization::checkpoint::Checkpointer.

Definition at line 567 of file FastCheckpointer.hpp.

◆ findCheckpoint_() [2/2]

checkpoint_type * sparta::serialization::checkpoint::FastCheckpointer::findCheckpoint_ ( chkpt_id_t  id)
inline override protected virtual noexcept

Attempts to find a checkpoint within this checkpointer by ID.

Parameters
id: Checkpoint ID to search for
Returns
Pointer to found checkpoint with matching ID. If not found, returns nullptr.
Todo:
Faster lookup?

Implements sparta::serialization::checkpoint::Checkpointer.

Definition at line 556 of file FastCheckpointer.hpp.

◆ findInternalCheckpoint()

checkpoint_type * sparta::serialization::checkpoint::FastCheckpointer::findInternalCheckpoint ( chkpt_id_t  id)
inline

Gets a checkpoint through findCheckpoint interface casted to the type of Checkpoint subclass used by this class.

Definition at line 387 of file FastCheckpointer.hpp.


◆ findLatestCheckpointAtOrBefore()

checkpoint_type * sparta::serialization::checkpoint::FastCheckpointer::findLatestCheckpointAtOrBefore ( tick_t  tick,
chkpt_id_t  from 
)
inline override virtual

Finds the latest checkpoint at or before the given tick starting at the from checkpoint and working backward. If no checkpoints before or at tick are found, returns nullptr.

Parameters
tick: Tick to search for
from: Checkpoint at which to begin searching for a tick. Must be a valid checkpoint known by this checkpointer. See hasCheckpoint.
Returns
The latest checkpoint with a tick number less than or equal to the tick argument. Returns nullptr if no checkpoints at or before tick were found. It is possible for the checkpoint identified by from to be returned.
Warning
This is not a high-performance method. Generally, a client of this interface knows a particular ID.
Exceptions
CheckpointError: if from does not refer to a valid checkpoint.
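
The backward search can be sketched as a simple walk over previous-checkpoint links. The example below is a minimal toy model under the assumption that each checkpoint stores its tick and a pointer to its predecessor; `TickNode` and `latestAtOrBefore` are hypothetical names, and the real method additionally validates `from` and throws CheckpointError.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only; field names are hypothetical.
struct TickNode {
    uint64_t id;
    uint64_t tick;
    const TickNode* prev;  // nullptr for the head
};

// Walks backward from `from` until a checkpoint with tick <= `tick` is found.
// Returns nullptr when every checkpoint on the chain is later than `tick`.
const TickNode* latestAtOrBefore(uint64_t tick, const TickNode* from) {
    const TickNode* c = from;
    while (c != nullptr && c->tick > tick) {
        c = c->prev;
    }
    return c;
}
```

The walk is linear in the length of the chain, which is consistent with the warning that this is not a high-performance method.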

Implements sparta::serialization::checkpoint::Checkpointer.

Definition at line 364 of file FastCheckpointer.hpp.


◆ getCheckpointChain()

std::deque< chkpt_id_t > sparta::serialization::checkpoint::FastCheckpointer::getCheckpointChain ( chkpt_id_t  id) const
inline override virtual

Debugging utility which gets a deque of checkpoints representing a chain starting at the checkpoint head and ending at the checkpoint specified by id. The results can contain checkpoint_type::UNIDENTIFIED_CHECKPOINT to represent temporarily deleted checkpoints in the chain.

Parameters
id: ID of checkpoint that terminates the chain
Returns
deque of checkpoint IDs where the front is always the head and the back is always the checkpoint described by id. If there is no checkpoint head, returns an empty result
Exceptions
CheckpointError: if id does not refer to a valid checkpoint.
Note
Makes a new deque of results. This should not be called in the critical path.
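
One way to picture the returned chain is to walk from the target checkpoint back to the head and push each ID onto the front of a deque. The sketch below is a toy model: `ChainNode`, `chainTo`, and the sentinel `kUnidentified` are hypothetical, and the sentinel's value is an assumption chosen for the example rather than the value of checkpoint_type::UNIDENTIFIED_CHECKPOINT.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical sentinel standing in for an unidentified (deleted) checkpoint.
constexpr uint64_t kUnidentified = ~uint64_t(0);

// Toy model only; field names are hypothetical.
struct ChainNode {
    uint64_t id;
    bool deleted;           // flagged deleted but still part of the chain
    const ChainNode* prev;  // nullptr for the head
};

// Builds the ID chain from the head (front) to `target` (back).
// Deleted nodes stay in the chain but are reported with the sentinel.
std::deque<uint64_t> chainTo(const ChainNode* target) {
    std::deque<uint64_t> ids;
    for (const ChainNode* c = target; c != nullptr; c = c->prev) {
        ids.push_front(c->deleted ? kUnidentified : c->id);
    }
    return ids;
}
```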

Implements sparta::serialization::checkpoint::Checkpointer.

Definition at line 331 of file FastCheckpointer.hpp.


◆ getCheckpoints()

std::vector< chkpt_id_t > sparta::serialization::checkpoint::FastCheckpointer::getCheckpoints ( ) const
inline override virtual

Gets all checkpoint IDs available on any timeline sorted by tick (or equivalently checkpoint ID).

Returns
vector of valid checkpoint IDs (never checkpoint_type::UNIDENTIFIED_CHECKPOINT)
Note
Makes a new vector of results. This should not be called in the critical path.

Implements sparta::serialization::checkpoint::Checkpointer.

Definition at line 275 of file FastCheckpointer.hpp.


◆ getCheckpointsAt()

std::vector< chkpt_id_t > sparta::serialization::checkpoint::FastCheckpointer::getCheckpointsAt ( tick_t  t) const
inline override virtual

Gets all checkpoints taken at tick t on any timeline.

Parameters
t: Tick number at which checkpoints should be found.
Returns
vector of valid checkpoint IDs (never checkpoint_type::UNIDENTIFIED_CHECKPOINT)
Note
Makes a new vector of results. This should not be called in the critical path.

Implements sparta::serialization::checkpoint::Checkpointer.

Definition at line 255 of file FastCheckpointer.hpp.


◆ getNumCheckpoints()

uint32_t sparta::serialization::checkpoint::FastCheckpointer::getNumCheckpoints ( ) const
inline override virtual noexcept

Gets the current number of checkpoints having valid IDs.

Implements sparta::serialization::checkpoint::Checkpointer.

Definition at line 290 of file FastCheckpointer.hpp.

◆ getNumDeadCheckpoints()

uint32_t sparta::serialization::checkpoint::FastCheckpointer::getNumDeadCheckpoints ( ) const
inline noexcept

Gets the current number of checkpoints (delta or snapshot) withOUT valid IDs.

Definition at line 312 of file FastCheckpointer.hpp.

◆ getNumDeltas()

uint32_t sparta::serialization::checkpoint::FastCheckpointer::getNumDeltas ( ) const
inline noexcept

Gets the current number of delta checkpoints with valid IDs.

Definition at line 304 of file FastCheckpointer.hpp.


◆ getNumSnapshots()

uint32_t sparta::serialization::checkpoint::FastCheckpointer::getNumSnapshots ( ) const
inline noexcept

Gets the current number of snapshots with valid IDs.

Definition at line 297 of file FastCheckpointer.hpp.

◆ getSnapshotThreshold()

uint32_t sparta::serialization::checkpoint::FastCheckpointer::getSnapshotThreshold ( ) const
inline noexcept

Returns the next-snapshot threshold.

This represents the distance between two checkpoints required for the checkpointer to automatically place a snapshot checkpoint instead of a delta. A threshold of 0 or 1 results in all checkpoints being snapshots. A value of 10 results in every 10th checkpoint being a snapshot. Explicit snapshot creation using createCheckpoint can interrupt and restart this pattern.

This value is a performance/space tradeoff knob.
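
The threshold rule can be sketched as a counter of checkpoints taken since the last snapshot. The toy policy below reproduces the documented behavior (0 or 1 means every checkpoint is a snapshot, 10 means every 10th is, and a forced snapshot restarts the pattern). `SnapshotPolicy` and `nextIsSnapshot` are hypothetical names, and how the real class counts distance is an assumption here.

```cpp
#include <cassert>
#include <cstdint>

// Toy policy only: mirrors the documented behavior, not Sparta's code.
class SnapshotPolicy {
public:
    explicit SnapshotPolicy(uint32_t threshold) : threshold_(threshold) {}

    // Returns true if the next checkpoint should be stored as a full snapshot.
    bool nextIsSnapshot(bool force_snapshot = false) {
        ++distance_;
        if (force_snapshot || distance_ >= threshold_) {
            distance_ = 0;  // restart the pattern after every snapshot
            return true;
        }
        return false;
    }

private:
    uint32_t threshold_;
    uint32_t distance_ = 0;  // checkpoints taken since the last snapshot
};
```

A smaller threshold shortens the delta chain that must be replayed on load at the cost of storing more full snapshots.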

Definition at line 133 of file FastCheckpointer.hpp.

◆ loadCheckpoint()

void sparta::serialization::checkpoint::FastCheckpointer::loadCheckpoint ( chkpt_id_t  id)
inline override virtual

Loads state from a specific checkpoint by ID.

Note
Does not delete checkpoints. Checkpoints must be explicitly deleted by deleteCheckpoint
Exceptions
CheckpointError: if id does not refer to a checkpoint that exists or if the checkpoint could not be loaded.
Warning
If checkpoint fails during loading for reasons other than an invalid ID, the simulation state could be corrupt
Postcondition
current checkpoint is now the checkpoint specified by id
If this checkpointer was constructed with a pointer to a scheduler, sets that scheduler's current tick to the checkpoint's tick using Scheduler::restartAt

Implements sparta::serialization::checkpoint::Checkpointer.

Definition at line 208 of file FastCheckpointer.hpp.


◆ recursForwardFindAlive_()

bool sparta::serialization::checkpoint::FastCheckpointer::recursForwardFindAlive_ ( checkpoint_type * d) const
inline protected

Look forward to see if any future checkpoints depend on d.

Parameters
d: checkpoint to inspect and recursively search
Returns
true if the current checkpoint or any live checkpoints are hit in the search. Search terminates on each branch when a snapshot or the end of the branch is reached. The branch to inspect (d) will not be checked itself since the point is to determine which branches down-chain depend on it.
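
The search described above can be sketched as a recursion over each node's next-checkpoints that stops descending at snapshots. This is a toy model under the assumption that a snapshot is self-contained, so nothing beyond it depends on d; `BranchNode` and `forwardFindAlive` are hypothetical names, and the current checkpoint is passed in explicitly rather than read from the checkpointer.

```cpp
#include <cassert>
#include <vector>

// Toy model only; field names are hypothetical.
struct BranchNode {
    bool is_snapshot = false;
    bool deleted = false;            // flagged deleted (no valid ID)
    std::vector<BranchNode*> nexts;  // checkpoints that follow this one
};

// Returns true if any checkpoint reachable forward from d is alive or is
// `current`. d itself is not checked. A branch is abandoned at a snapshot
// because a snapshot does not depend on the deltas before it.
bool forwardFindAlive(const BranchNode* d, const BranchNode* current) {
    for (const BranchNode* next : d->nexts) {
        if (next->is_snapshot) {
            continue;
        }
        if (next == current || !next->deleted) {
            return true;
        }
        if (forwardFindAlive(next, current)) {
            return true;
        }
    }
    return false;
}
```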

Definition at line 517 of file FastCheckpointer.hpp.


◆ setSnapshotThreshold()

void sparta::serialization::checkpoint::FastCheckpointer::setSnapshotThreshold ( uint32_t  thresh)
inline noexcept

Sets the snapshot threshold.

See also
getSnapshotThreshold

Definition at line 139 of file FastCheckpointer.hpp.

◆ stringize()

std::string sparta::serialization::checkpoint::FastCheckpointer::stringize ( ) const
inline override virtual

Returns a string describing this object.

Reimplemented from sparta::serialization::checkpoint::Checkpointer.

Definition at line 401 of file FastCheckpointer.hpp.


◆ traceValue()

void sparta::serialization::checkpoint::FastCheckpointer::traceValue ( std::ostream &  o,
chkpt_id_t  id,
const ArchData * container,
uint32_t  offset,
uint32_t  size 
)
inline override virtual

Forwards debug/trace info onto checkpoint by ID.

Implements sparta::serialization::checkpoint::Checkpointer.

Definition at line 410 of file FastCheckpointer.hpp.


The documentation for this class was generated from the following file: