Recently, DeepMind open-sourced their IMPALA algorithm. I studied their implementation and summarized some TensorFlow features that are well suited to describing an RL scenario. The slides are here.