Checkpointing¶
Configuring checkpointing¶
Checkpointing is configured using command line options and/or by specifying certain parameters in the configuration file. These options are specified in the chapters above. For more detailed information on how to configure checkpointing, read the Simulator chapter (Run the simulator part).
Checkpointing makes it possible to save the state of the simulator multiple times during a simulation. The simulator state is saved in a binary file based on the HDF5 storage format. The layout of this file is specified below. Checkpointing is configured using four parameters: the checkpointing frequency, the checkpointing file, the replay timestep and the simulator run mode.
Checkpointing frequency¶
The checkpointing frequency parameter determines how often the simulator state is saved. It can be set with a command line argument or in the XML configuration file. These are the possible values for the parameter:
- 0 - Only save the last timestep of the simulation
- x - Save the simulator state every x timesteps
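The two settings above can be sketched as a small decision function. This is an illustration only, not the simulator's code: the name `should_save`, the numbering of timesteps from 1, and the choice not to force an extra save at the final timestep when x is positive are all assumptions that the text does not settle.

```python
def should_save(timestep: int, last_timestep: int, frequency: int) -> bool:
    """Return True if the state should be saved after `timestep`.

    frequency == 0 -> only the last timestep of the simulation is saved.
    frequency == x -> the state is saved every x timesteps.
    (Hypothetical helper; timesteps are assumed to be numbered from 1.)
    """
    if frequency < 0:
        raise ValueError("checkpointing frequency must be non-negative")
    if frequency == 0:
        return timestep == last_timestep
    return timestep % frequency == 0


# Which of the timesteps 1..10 are saved for two settings of the parameter.
saved_with_0 = [t for t in range(1, 11) if should_save(t, 10, 0)]
saved_with_3 = [t for t in range(1, 11) if should_save(t, 10, 3)]
print(saved_with_0)  # [10]
print(saved_with_3)  # [3, 6, 9]
```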
Checkpointing file¶
This parameter specifies the name of the checkpointing file. How the file is used depends on the simulator run mode parameter.
Replay timestep¶
This parameter is used when the run mode is Replay. It specifies the timestep from which you want to start the simulation.
Simulator run mode¶
The simulator can be run in different modes. In Initial mode the simulation does not start from a checkpointed file. Currently, the following run modes are supported:
- Initial - The simulator is built from scratch using the configuration file, and is saved every x timesteps according to the checkpointing frequency.
- Extend - The simulation is extended from the last saved checkpoint in the checkpointing file.
- Replay - The simulation is replayed from a specified timestep.
- Extract - The configuration files are extracted from the checkpointing file. This mode will not actually run the simulator itself.
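As a rough sketch of how the four modes differ, the function below returns the timestep a run would start from in each mode. The names `RunMode` and `start_timestep` are hypothetical, as is the use of 0 for a run built from scratch. The text does not say whether the replay timestep has to coincide with a saved checkpoint, so that is not checked here.

```python
from enum import Enum
from typing import Optional


class RunMode(Enum):
    INITIAL = "Initial"
    EXTEND = "Extend"
    REPLAY = "Replay"
    EXTRACT = "Extract"


def start_timestep(mode: RunMode,
                   last_saved: Optional[int] = None,
                   replay_timestep: Optional[int] = None) -> Optional[int]:
    """Return the timestep a run starts from, or None if nothing is simulated."""
    if mode is RunMode.INITIAL:
        # Built from scratch from the configuration file; no checkpoint is read.
        return 0
    if mode is RunMode.EXTEND:
        if last_saved is None:
            raise ValueError("Extend needs the last saved timestep of the file")
        return last_saved
    if mode is RunMode.REPLAY:
        if replay_timestep is None:
            raise ValueError("Replay needs the replay timestep parameter")
        return replay_timestep
    # Extract only writes out the stored configuration files.
    return None
```

For example, `start_timestep(RunMode.EXTEND, last_saved=40)` yields 40, while `start_timestep(RunMode.EXTRACT)` yields `None` because the simulator itself is not run.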
HDF5 file format¶
Table structure¶
Dataset | rank | dims | dtype
--- | --- | --- | ---
/Configuration/configuration | 1 | 1 | ConfigDatatype
/amt_timesteps | 1 | 1 | H5T_NATIVE_UINT
/last_timestep | 1 | 1 | H5T_NATIVE_UINT
/person_time_independent | 1 | amt_persons | PersonTIDatatype
/Timestep_n/randomgen | 1 | 1 | StrType
/Timestep_n/person_time_dependent | 1 | amt_persons | PersonTDDatatype
/Timestep_n/calendar | 1 | 1 | CalendarDatatype
/Timestep_n/travellers | 1 | amt_travellers | TravellerDataType
/Timestep_n/household_clusters | 1 | amt_persons | H5T_NATIVE_UINT
/Timestep_n/primary_community_clusters | 1 | amt_persons | H5T_NATIVE_UINT
/Timestep_n/secondary_community_clusters | 1 | amt_persons | H5T_NATIVE_UINT
/Timestep_n/work_clusters | 1 | amt_workers | H5T_NATIVE_UINT
/Timestep_n/school_clusters | 1 | amt_students | H5T_NATIVE_UINT
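To make the layout concrete, the sketch below lists the dataset paths from the table and expands the per-timestep ones for a given n. The helper name `timestep_paths` is hypothetical, and it assumes that n is written as a plain decimal number without padding, which the table does not specify.

```python
from typing import List

# Datasets that appear once per checkpoint file.
GLOBAL_DATASETS = [
    "/Configuration/configuration",
    "/amt_timesteps",
    "/last_timestep",
    "/person_time_independent",
]

# Datasets that appear once per saved timestep, under /Timestep_n.
PER_TIMESTEP_DATASETS = [
    "randomgen",
    "person_time_dependent",
    "calendar",
    "travellers",
    "household_clusters",
    "primary_community_clusters",
    "secondary_community_clusters",
    "work_clusters",
    "school_clusters",
]


def timestep_paths(n: int) -> List[str]:
    """Return the dataset paths stored for saved timestep n.

    Assumption: the group name is "Timestep_" followed by n in plain decimal.
    """
    return [f"/Timestep_{n}/{name}" for name in PER_TIMESTEP_DATASETS]


for path in GLOBAL_DATASETS + timestep_paths(5):
    print(path)
```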
Custom datatypes¶
ConfigDatatype¶
- StrType - config_content
- StrType - disease_content
- StrType - holidays_content
- StrType - age_contact_content
PersonTIDatatype (time independent)¶
- H5T_NATIVE_UINT - ID
- H5T_NATIVE_DOUBLE - age
- H5T_NATIVE_CHAR - gender
- H5T_NATIVE_UINT - household_ID
- H5T_NATIVE_UINT - school_ID
- H5T_NATIVE_UINT - work_ID
- H5T_NATIVE_UINT - prim_comm_ID
- H5T_NATIVE_UINT - sec_comm_ID
- H5T_NATIVE_UINT - start_infectiousness
- H5T_NATIVE_UINT - time_infectiousness
- H5T_NATIVE_UINT - start_symptomatic
- H5T_NATIVE_UINT - time_symptomatic
PersonTDDatatype (time dependent)¶
- H5T_NATIVE_HBOOL - participant
- H5T_NATIVE_UINT - health_status
- H5T_NATIVE_UINT - disease_counter
- H5T_NATIVE_UINT - on_vacation
CalendarDatatype¶
- H5T_NATIVE_HSIZE - day
- StrType - date
TravellerDataType¶
This type consists of person data from the original simulator, as well as data from the new simulator. Person data that is the same in both simulators (such as gender) is saved only once.
In addition, the data type contains the following metadata:
- H5T_NATIVE_VARIABLE - home_sim_name
- H5T_NATIVE_VARIABLE - dest_sim_name
- H5T_NATIVE_UINT - home_sim_index
- H5T_NATIVE_UINT - dest_sim_index
- H5T_NATIVE_UINT - days_left
Elaboration¶
First of all, the configuration files are saved. This allows later simulations to be run independently, using the stored configurations.
In terms of person data, the time-independent data is saved once. The time-dependent data is stored at each save.
The order of person IDs in the different cluster types is saved as well. Together with the saved RNG state, this guarantees that a run can be resumed in exactly the state in which it was saved. It also makes it possible to obtain exactly the same end results when running the simulation without multithreading.
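The toy example below illustrates why both pieces are needed. It is not the simulator's code: Python's `random.Random` stands in for the simulator's random generator (whose state the file stores as a string in /Timestep_n/randomgen), and the `run` function and the in-memory `checkpoint` dictionary are hypothetical.

```python
import random
from typing import List, Tuple


def run(rng: random.Random, order: List[int], steps: int) -> List[List[Tuple[int, float]]]:
    """Toy simulation: every step draws one random number per person, in order."""
    return [[(pid, rng.random()) for pid in order] for _ in range(steps)]


order = [3, 1, 2]

# Reference: one uninterrupted run of 6 steps.
reference = run(random.Random(42), order, 6)

# The same run, interrupted after 3 steps and checkpointed.
rng = random.Random(42)
first_half = run(rng, order, 3)
checkpoint = {"rng_state": rng.getstate(), "order": list(order)}

# Resume with a fresh generator restored from the checkpoint.
resumed_rng = random.Random()
resumed_rng.setstate(checkpoint["rng_state"])
second_half = run(resumed_rng, checkpoint["order"], 3)
identical = first_half + second_half == reference

# Restoring the generator state but not the person order gives another result.
other_rng = random.Random()
other_rng.setstate(checkpoint["rng_state"])
reordered_differs = run(other_rng, sorted(order), 3) != second_half

print(identical)          # True
print(reordered_differs)  # True
```

Restoring only the generator state is not enough here: with the person order changed, the same random numbers are assigned to different persons, so the resumed run diverges from the reference.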
As part of the multi-region extension, travellers are saved too. This makes it possible to reconstruct a simulation in which multi-region travellers are present.