Two-step Planning Task Code

In the two-step task, subjects learn a model of the dynamical structure of the environment, and use this model to plan their choices. The rat two-step task was adapted from work with human subjects (Daw et al., 2011), and introduced in Miller, Botvinick, and Brody (2017), where it is described in more detail. The task software runs within the bControl environment. Using this code, training a naive rat to perform the two-step task takes about two months. If you have any trouble running this code or training your rats, please contact Kevin Miller (even if you aren’t having trouble, if you’re using this task I’d love to hear from you!).
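
If it helps to see the trial structure concretely, here is a minimal sketch (in Python, not part of the bControl code) of a single trial: the rat chooses a side port, a common or rare transition determines which outcome port lights up, and reward is delivered probabilistically according to the current block. The values and the simplified logic below are illustrative assumptions, chosen only to mirror the parameters described later on this page.

    import random

    # Illustrative sketch of one two-step trial; not part of the bControl protocol.
    p_congruent = 0.8                            # probability of the common transition
    reward_probs = {"left": 0.2, "right": 0.8}   # current block's outcome-port reward probabilities

    def run_trial(choice):
        """Choice port -> transition -> outcome port -> probabilistic reward."""
        common = random.random() < p_congruent
        # In the congruent condition, the common transition links the left choice
        # port to the left outcome port (and right to right).
        outcome = choice if common else ("right" if choice == "left" else "left")
        rewarded = random.random() < reward_probs[outcome]
        return outcome, common, rewarded

    # A rat that has learned the transition structure should prefer the choice whose
    # common outcome port currently has the higher reward probability.
    print(run_trial("right"))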

Two Step Task Training Code

After downloading this file, unzip it. You will find two folders and two files. Each folder contains code for a bControl protocol, and each file is a settings file suitable for use with that protocol. Place these folders and files in the appropriate directories for your installation of bControl, and change the “experimenter” and “ratname” fields in each settings file to match your experimenter name and rat name.

To train a new rat:

  1. In this phase, your rat will learn to enter the outcome ports when they illuminate to receive reward. Begin with the “Classical” protocol and the appropriate settings file. Train your rat with this protocol until he is completing ~200 trials per day.
  2. In this phase, your rat will learn to enter all of the ports needed for the full task, in the correct order. Use the “TwoStep6” protocol with the settings file provided. Decide whether this rat will be in the “congruent” condition (common transitions link the left choice port to the left outcome port and right to right) or the “incongruent” condition (common transitions link left to right and right to left). Set the p_congruent parameter in ParamsSection to 0.8 or 0.2, accordingly (a sketch of how these parameters shape each trial appears after this list). Ensure that p_forceRight and p_forceLeft are both set to 50% (half of trials will be forced-choice to each side) and that left_reward_prob and right_reward_prob are both set to 1 (all trials will be rewarded). Train your rat with these settings until he is doing ~200 trials per day.
  3. In this phase, your rat will learn to make choices between the choice ports, and to adapt those choices to changing task contingencies. Set the p_forceRight and p_forceLeft parameters in ParamsSection to 0.1 (80% of trials will be free choice). Set the left_reward_prob parameter to 0 (in the first block, only visits to the right outcome port will be rewarded), and ensure that in RewardProbsSection the toggle buttons read “Performance-triggered flips” and “flips enabled” (blocks will flip based on the rat’s choices; the flip logic is sketched after this list). Train your rat with these settings until he is earning 3-4 block switches per session for a few sessions in a row.
  4. In this phase, we begin making the task harder. Set left_reward_prob to 0.1 and right_reward_prob to 0.9 (or vice versa). Train your rat with these settings until he is earning 3-4 block switches per session for a few sessions in a row.
  5. Continue making the task harder. Set left_reward_prob to 0.2 and right_reward_prob to 0.8 (or vice versa). Train your rat with these settings until he is earning 2-3 block switches per session for a few sessions in a row.
  6. Final task! Turn off the performance-triggering of the block switches by toggling the “performance-triggered flips” button in RewardProbsSection to read “nonperformance flips”. Set nTrials_for_flip (the minimum block length) to 10, and flip_prob_if_ready (the per-trial chance of a block change once the minimum is reached) to 0.02; blocks will then average roughly 10 + 1/0.02 = 60 trials. Monitor your rat for several days to make sure his performance remains good.
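
For step 2, here is a hedged sketch of how I read the relevant parameters: p_congruent controls whether the common transitions keep the choice and outcome ports on the same side (congruent, 0.8) or cross them (incongruent, 0.2), and p_forceRight / p_forceLeft control how many trials are forced to each side. The parameter names follow the protocol GUI described above, but the logic shown is an illustration, not the actual bControl implementation.

    import random

    # Step-2 settings: congruent condition, all trials forced, all trials rewarded.
    p_congruent = 0.8    # use 0.2 instead for the incongruent condition
    p_forceRight = 0.5   # half of trials forced to the right...
    p_forceLeft = 0.5    # ...and half forced to the left (no free-choice trials yet)

    def draw_trial_type():
        """Decide whether this trial is forced to one side or free choice."""
        r = random.random()
        if r < p_forceRight:
            return "force_right"
        if r < p_forceRight + p_forceLeft:
            return "force_left"
        return "free_choice"   # reachable only once the force probabilities are lowered (step 3)

    def common_outcome(choice):
        """Map the chosen side to an outcome port given the transition structure."""
        same_side = random.random() < p_congruent
        return choice if same_side else ("right" if choice == "left" else "left")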
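
And a sketch of the two block-flip schemes used in steps 3-6: during training, flips are triggered by the rat's performance; in the final task, they occur after a minimum block length with a fixed per-trial probability. Only nTrials_for_flip and flip_prob_if_ready are real protocol parameters; the performance criterion shown here (a recent-correct-rate threshold over a window of trials) is purely an assumption for illustration.

    import random

    nTrials_for_flip = 10      # minimum block length (step 6)
    flip_prob_if_ready = 0.02  # per-trial flip probability once the minimum is reached (step 6)

    def nonperformance_flip(trials_in_block):
        """Final task: after the minimum block length, flip with a fixed per-trial probability."""
        return trials_in_block >= nTrials_for_flip and random.random() < flip_prob_if_ready

    def performance_triggered_flip(recent_choices, good_side, window=20, threshold=0.8):
        """Training stages: flip once the rat reliably chooses the currently better side.
        The window/threshold rule here is illustrative; the protocol's actual criterion may differ."""
        if len(recent_choices) < window:
            return False
        rate = sum(c == good_side for c in recent_choices[-window:]) / window
        return rate >= threshold

    # With the final-task settings, blocks last on average about
    # nTrials_for_flip + 1/flip_prob_if_ready = 10 + 50 = 60 trials.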