Type MonitoredSession
Namespace tensorflow.train
Parent _MonitoredSession
Interfaces IMonitoredSession
Session-like object that handles initialization, recovery and hooks. Example usage:
Initialization: At creation time the monitored session does following things
in given order: * calls `hook.begin()` for each given hook
* finalizes the graph via `scaffold.finalize()`
* create session
* initializes the model via initialization ops provided by `Scaffold`
* restores variables if a checkpoint exists
* launches queue runners
* calls `hook.after_create_session()` Run: When `run()` is called, the monitored session does following things: * calls `hook.before_run()`
* calls TensorFlow `session.run()` with merged fetches and feed_dict
* calls `hook.after_run()`
* returns result of `session.run()` asked by user
* if `AbortedError` or `UnavailableError` occurs, it recovers or
reinitializes the session before executing the run() call again Exit: At the `close()`, the monitored session does following things in order: * calls `hook.end()`
* closes the queue runners and the session
* suppresses `OutOfRange` error which indicates that all inputs have been
processed if the monitored_session is used as a context How to set `tf.compat.v1.Session` arguments: * In most cases you can set session arguments as follows:
* In distributed setting for a non-chief worker, you can use following:
See `MonitoredTrainingSession` for an example usage based on chief or worker. Note: This is not a `tf.compat.v1.Session`. For example, it cannot do
following: * it cannot be set as default session.
* it cannot be sent to saver.save.
* it cannot be sent to tf.train.start_queue_runners.
Show Example
saver_hook = CheckpointSaverHook(...) summary_hook = SummarySaverHook(...) with MonitoredSession(session_creator=ChiefSessionCreator(...), hooks=[saver_hook, summary_hook]) as sess: while not sess.should_stop(): sess.run(train_op)