Checkpointing

Configuring checkpointing

Checkpointing is configured through command line options, parameters in the configuration file, or both. These options are described in the chapters above. For more detailed information on how to configure checkpointing, read the Simulator chapter (Run the simulator part).

Checkpointing saves the state of the simulator multiple times during the simulation itself. The simulator state is saved in a binary format based on the HDF5 storage format. The format of this file is specified below. Checkpointing is configured using the following parameters: the checkpointing frequency, the checkpointing file, the replay timestep and the simulator run mode.

Checkpointing frequency

The checkpointing frequency parameter sets how often the simulator state is saved. It can be set with a command line argument or by specifying the parameter in the XML configuration file. These are the possible values for the parameter:

  • 0 - Only save the last timestep of the simulation
  • x - Save the simulator state every x timesteps
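
The two values above can be sketched in Python as follows. This is only an illustration, not the simulator's own code: the function name should_checkpoint and its arguments are hypothetical, and the sketch assumes that timesteps are numbered from 1 and that a frequency of x saves exactly at multiples of x.

```python
def should_checkpoint(timestep: int, last_timestep: int, frequency: int) -> bool:
    """Return True if the simulator state should be saved after `timestep`.

    Hypothetical helper: frequency 0 saves only the last timestep,
    frequency x > 0 saves every x timesteps (assumed: multiples of x).
    """
    if frequency < 0:
        raise ValueError("frequency must be 0 or a positive integer")
    if frequency == 0:
        return timestep == last_timestep
    return timestep % frequency == 0


# Timesteps 1..10 (numbering from 1 is an assumption of this sketch).
saved_every_3 = [t for t in range(1, 11) if should_checkpoint(t, 10, 3)]
saved_last_only = [t for t in range(1, 11) if should_checkpoint(t, 10, 0)]
```

Whether the last timestep is also saved when the frequency is non-zero is not stated here, so the sketch leaves it out.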

Checkpointing file

This parameter specifies the name of the checkpointing file. How the file is used depends on the simulator run mode parameter.

Replay timestep

This parameter is used when the run mode is Replay. It specifies the timestep from which you want to start the simulation.

Simulator run mode

The simulator can be run in different modes. Currently, the following run modes are supported:

Note that Initial mode does not start from a checkpointed file.

  • Initial - The simulator is built from scratch using the configuration file, and is saved every x timesteps according to the checkpointing frequency.
  • Extend - The simulation is extended from the last saved checkpoint in the checkpointing file.
  • Replay - The simulation is replayed from a specified timestep.
  • Extract - The configuration files are extracted from the checkpointing file. This mode will not actually run the simulator itself.
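
As a rough sketch of how the four run modes differ, the Python below maps each mode to the timestep at which the simulation would start. The names RunMode and starting_timestep are hypothetical and not part of the simulator, and starting Initial mode at timestep 0 is an assumption of this sketch.

```python
from enum import Enum
from typing import Optional


class RunMode(Enum):
    INITIAL = "Initial"
    EXTEND = "Extend"
    REPLAY = "Replay"
    EXTRACT = "Extract"


def starting_timestep(mode: RunMode, last_saved: int,
                      replay_timestep: int) -> Optional[int]:
    """Return the timestep to start from, or None if nothing is run."""
    if mode is RunMode.INITIAL:
        # Built from scratch using the configuration file (0 is assumed).
        return 0
    if mode is RunMode.EXTEND:
        # Continue from the last checkpoint in the checkpointing file.
        return last_saved
    if mode is RunMode.REPLAY:
        # Start from the timestep given by the replay timestep parameter.
        return replay_timestep
    # Extract only writes out the configuration files; no simulation runs.
    return None
```

For example, starting_timestep(RunMode.REPLAY, last_saved=50, replay_timestep=20) returns 20, while Extract mode returns None because the simulator is not run.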

HDF5 file format

Table structure

/Configuration/configuration rank 1
  dims 1
  dtype ConfigDatatype
/amt_timesteps rank 1
  dims 1
  dtype H5T_NATIVE_UINT
/last_timestep rank 1
  dims 1
  dtype H5T_NATIVE_UINT
/person_time_independent rank 1
  dims amt_persons
  dtype PersonTIDatatype
/Timestep_n/randomgen rank 1
  dims 1
  dtype StrType
/Timestep_n/person_time_dependent rank 1
  dims amt_persons
  dtype PersonTDDatatype
/Timestep_n/calendar rank 1
  dims 1
  dtype CalendarDatatype
/Timestep_n/travellers rank 1
  dims amt_travellers
  dtype TravellerDataType
/Timestep_n/household_clusters rank 1
  dims amt_persons
  dtype H5T_NATIVE_UINT
/Timestep_n/primary_community_clusters rank 1
  dims amt_persons
  dtype H5T_NATIVE_UINT
/Timestep_n/secondary_community_clusters rank 1
  dims amt_persons
  dtype H5T_NATIVE_UINT
/Timestep_n/work_clusters rank 1
  dims amt_workers
  dtype H5T_NATIVE_UINT
/Timestep_n/school_clusters rank 1
  dims amt_students
  dtype H5T_NATIVE_UINT
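
The layout above separates datasets that are stored once per file from datasets that are stored once per saved timestep. The Python sketch below lists both groups and builds the dataset paths for a given timestep. The helper timestep_paths is hypothetical, and writing n as a plain decimal number without padding is an assumption of this sketch.

```python
from typing import List

# Datasets stored once per file (names taken from the table structure).
GLOBAL_DATASETS = [
    "/Configuration/configuration",
    "/amt_timesteps",
    "/last_timestep",
    "/person_time_independent",
]

# Datasets stored once per saved timestep, under /Timestep_n.
PER_TIMESTEP_DATASETS = [
    "randomgen",
    "person_time_dependent",
    "calendar",
    "travellers",
    "household_clusters",
    "primary_community_clusters",
    "secondary_community_clusters",
    "work_clusters",
    "school_clusters",
]


def timestep_paths(n: int) -> List[str]:
    """Return the dataset paths for saved timestep n."""
    # Assumption: n appears as a plain decimal number, e.g. /Timestep_5.
    return [f"/Timestep_{n}/{name}" for name in PER_TIMESTEP_DATASETS]
```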

Custom datatypes

ConfigDatatype

  • StrType - config_content
  • StrType - disease_content
  • StrType - holidays_content
  • StrType - age_contact_content

PersonTIDatatype (time independent)

  • H5T_NATIVE_UINT - ID
  • H5T_NATIVE_DOUBLE - age
  • H5T_NATIVE_CHAR - gender
  • H5T_NATIVE_UINT - household_ID
  • H5T_NATIVE_UINT - school_ID
  • H5T_NATIVE_UINT - work_ID
  • H5T_NATIVE_UINT - prim_comm_ID
  • H5T_NATIVE_UINT - sec_comm_ID
  • H5T_NATIVE_UINT - start_infectiousness
  • H5T_NATIVE_UINT - time_infectiousness
  • H5T_NATIVE_UINT - start_symptomatic
  • H5T_NATIVE_UINT - time_symptomatic

PersonTDDatatype (time dependent)

  • H5T_NATIVE_HBOOL - participant
  • H5T_NATIVE_UINT - health_status
  • H5T_NATIVE_UINT - disease_counter
  • H5T_NATIVE_UINT - on_vacation

CalendarDatatype

  • H5T_NATIVE_HSIZE - day
  • StrType - date

TravellerDataType

This type consists of person data from the original simulator, as well as data from the new simulator. Person data that is the same in both simulators (such as gender) is only saved once.

In addition, the data type contains the following metadata:

  • H5T_NATIVE_VARIABLE - home_sim_name
  • H5T_NATIVE_VARIABLE - dest_sim_name
  • H5T_NATIVE_UINT - home_sim_index
  • H5T_NATIVE_UINT - dest_sim_index
  • H5T_NATIVE_UINT - days_left

Elaboration

First of all, the configuration files are saved. This allows later simulations to run independently, using the stored configurations.

For person data, the time-independent data is saved once, while the time-dependent data is stored at each save.
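
The split between the two kinds of person data can be sketched with a small in-memory stand-in for the checkpoint file. The class CheckpointStore and its save method are hypothetical, and plain dictionaries stand in for the HDF5 records.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CheckpointStore:
    """Hypothetical in-memory stand-in for the checkpoint file."""
    person_time_independent: List[dict] = field(default_factory=list)
    timesteps: Dict[int, List[dict]] = field(default_factory=dict)

    def save(self, timestep: int, static_records: List[dict],
             dynamic_records: List[dict]) -> None:
        # Time-independent data is written only on the first save.
        if not self.person_time_independent:
            self.person_time_independent = list(static_records)
        # Time-dependent data is written at every save.
        self.timesteps[timestep] = list(dynamic_records)


store = CheckpointStore()
static = [{"ID": 0, "age": 34.0}, {"ID": 1, "age": 8.0}]
store.save(1, static, [{"health_status": 0}, {"health_status": 0}])
store.save(2, static, [{"health_status": 1}, {"health_status": 0}])
```

After two saves the store holds one copy of the time-independent records and one set of time-dependent records per saved timestep.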

The order of person IDs in the different cluster types is saved as well. Combined with the saved RNG state, this guarantees that a run can be resumed in exactly the state in which it was saved. It also means that a resumed run produces exactly the same end results, provided the simulation is run without multithreading.
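
The reason both the generator state and the person order are needed can be illustrated with Python's standard random module, used here only as a stand-in for the simulator's random generator. The run function and the person order are hypothetical.

```python
import random


def run(generator, persons, steps):
    """Draw one random number per person per step, in the given order."""
    draws = []
    for _ in range(steps):
        for person in persons:
            draws.append((person, generator.random()))
    return draws


rng = random.Random(42)
order = [3, 0, 2, 1]  # hypothetical order of person IDs in a cluster

# Advance the generator a little before checkpointing.
for _ in range(5):
    rng.random()

# Checkpoint: store the generator state and the person order.
saved_state = rng.getstate()
saved_order = list(order)

# Continue the original run.
original = run(rng, order, 3)

# Resume from the checkpoint with the same state and the same order.
resumed_rng = random.Random()
resumed_rng.setstate(saved_state)
resumed = run(resumed_rng, saved_order, 3)

# Same state but a different order: the draws go to different persons.
reordered_rng = random.Random()
reordered_rng.setstate(saved_state)
reordered = run(reordered_rng, sorted(saved_order), 3)
```

Here original and resumed are identical, while reordered differs even though the generator state is the same, because each random draw is paired with a different person.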

As part of the multi-region extension, travellers are saved too. This allows a simulation to be reconstructed with its multi-region travellers present.