class: center, middle, inverse, title-slide

# Lecture 5
## Dynamics theory review
### Ivan Rudik
### AEM 7130

---

# Roadmap

1. Review Markov chains and dynamic models
2. Review theory for numerical methods

---

# Building a dynamic economic model

We need 5 things for a dynamic economic model

--

1. Controls: What variables are we optimizing, what decisions do the economic agents make?

--

2. States: What are the variables that change over time and interact with the controls?

--

3. Payoff: What is the single-period payoff function? What's our reward?

--

4. Transition equations: How do the state variables evolve over time?

--

5. Planning horizon: When does our problem terminate? Never? 100 years?

---

# Two types of solutions

Dynamic problems can generally be solved in two ways

--

.hi-blue[Open-loop:] treat the model as a sequence of static optimization problems solved simultaneously

--

- Transitions act as constraints
- Ends up being just a potentially giant (but simple) non-linear optimization problem
- Drawback: solutions will just be a function of time, so we can't introduce uncertainty, strategic behavior, etc.

---

# Two types of solutions

.hi-blue[Feedback:] treat the model as a single-period optimization problem with the immediate payoff and the *continuation value*

--

- Yields a solution that is a function of the states
- Permits uncertainty, game structures
- Drawback: need to solve for the continuation value function

---

# Markov chains

Dynamic models in economics are typically .hi-blue[Markovian]

--

A stochastic process `\(\{x_t\}\)` is said to have the .hi-blue[Markov property] if for all `\(k\geq1\)` and all `\(t\)`
`$$\text{Prob}(x_{t+1}|x_t,x_{t-1},...,x_{t-k}) = \text{Prob}(x_{t+1}|x_t)$$`

--

The distribution of the next vector in the sequence (i.e. the distribution of next period's state) is a function of only the current vector (state)

--

The Markov property is necessary for the feedback representation

---

# Markov chains

We characterize stochastic state transitions with .hi-blue[Markov chains]

--

A Markov chain is characterized by:

1. An `\(n\)`-dimensional state space with vectors `\(e_i\)`, `\(i=1,...,n\)`, where `\(e_i\)` is an `\(n \times 1\)` unit vector whose `\(i\)`th entry is 1 and all others are 0
2. An `\(n \times n\)` *transition matrix* `\(P\)`, which captures the probability of transitioning from one point of the state space to another point of the state space next period
3. An `\(n \times 1\)` vector `\(\pi_0\)` whose `\(i\)`th value is the probability of being in state `\(i\)` at time 0: `\(\pi_{0i} = \text{Prob}(x_0 = e_i)\)`

---

# Markov chains

`\(P\)` is given by
`$$P_{ij} = \text{Prob}(x_{t+1} = e_j|x_t = e_i)$$`

--

We need one assumption:

- For `\(i=1,...,n\)`, `\(\sum_{j=1}^n P_{ij} = 1\)`, and `\(\pi_0\)` satisfies `\(\sum_{i=1}^n \pi_{0i} = 1\)`

---

# Markov chains

Nice property of Markov chains: we can use `\(P\)` to determine the probability of moving to another state in *two* periods via `\(P^2\)`, since
`\begin{align} &\text{Prob}(x_{t+2} = e_j|x_t = e_i) \notag\\ &= \sum_{h=1}^n \text{Prob}(x_{t+2} = e_j|x_{t+1} = e_h)\text{Prob}(x_{t+1} = e_h|x_t = e_i) \notag\\ &= \sum_{h=1}^n P_{ih}P_{hj} = (P^2)_{ij} \notag \end{align}`

--

Iterate on this to show that
`$$\text{Prob}(x_{t+k}=e_j|x_t=e_i) = (P^k)_{ij}$$`
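---

# Markov chains: a numerical check

A minimal Python/NumPy sketch (with a made-up `\(P\)` and `\(\pi_0\)`, purely illustrative) of the row-sum assumption and the `\(k\)`-step property `\(\text{Prob}(x_{t+k}=e_j|x_t=e_i) = (P^k)_{ij}\)`:

```python
import numpy as np

# Hypothetical 3-state transition matrix; each row is a probability distribution
P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.0, 0.3, 0.7]])
pi0 = np.array([1.0, 0.0, 0.0])          # start in state 1 with certainty

assert np.allclose(P.sum(axis=1), 1.0)   # rows of P sum to one
assert np.isclose(pi0.sum(), 1.0)        # pi_0 sums to one

# k-step transition probabilities are the entries of P^k
k = 10
Pk = np.linalg.matrix_power(P, k)
print(Pk[0, :])    # Prob(x_{t+k} = e_j | x_t = e_1) for each j

# Distribution over states k periods ahead: pi_k' = pi_0' P^k
print(pi0 @ Pk)
```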
---

# Dynamic programming

Start with a general sequential problem to set up the basic recursive/feedback dynamic optimization problem

--

Let `\(\beta \in (0,1)\)`; the economic agent selects a sequence of controls, `\(\{u_t\}_{t=0}^\infty\)`, to maximize
`$$\sum_{t=0}^\infty \beta^t r(x_t,u_t)$$`
subject to `\(x_{t+1} = g(x_t,u_t)\)`, with `\(x_0\)` given

---

# Dynamic programming

Assume `\(r\)` is concave and continuously differentiable, and that the state space is convex and compact

--

We want to recover a *policy function* `\(h\)`, which maps the current state `\(x_t\)` into the current control `\(u_t\)`, such that the sequence `\(\{u_s\}_{s=0}^\infty\)` generated by iterating
`\begin{gather} u_t = h(x_t) \notag\\ x_{t+1} = g(x_t,u_t), \notag \end{gather}`
starting from `\(x_0\)`, solves our original optimization problem

---

# Value functions

Consider a function `\(V(x)\)`, the .hi-blue[continuation value function], where
`$$V(x_0) = \max_{\{u_s\}_{s=0}^\infty} \sum_{t=0}^\infty \beta^t r(x_t,u_t)$$`
subject to the transition equation `\(x_{t+1} = g(x_t,u_t)\)`

The value function gives the maximum value of our original problem as a function of the state

---

# Value functions

Suppose we know `\(V(x)\)`; then we can solve for the policy function `\(h\)` by solving, for each `\(x \in X\)`,
`$$\max_u r(x,u) + \beta V(x')$$`
where `\(x' = g(x,u)\)` and primes on state variables indicate next period

--

Conditional on having `\(V(x)\)`, we can solve our dynamic programming problem

--

Instead of solving for an infinite-dimensional sequence of policies, we find the `\(V(x)\)` and `\(h\)` that solve a continuum of maximization problems, one for each `\(x\)`

--

This is often easier

---

# Bellman equations

.hi-blue[Issue:] how do we know `\(V(x)\)` when it depends on future (optimized) actions?

--

Define the .hi-blue[Bellman equation]
`$$V(x) = \max_u r(x,u) + \beta V[g(x,u)]$$`

--

`\(h(x)\)` maximizes the right-hand side of the Bellman equation

---

# Bellman equations

The policy function satisfies
`$$V(x) = r[x,h(x)] + \beta V\{g[x,h(x)]\}$$`

--

Solving the problem yields a solution that is a function, `\(V(x)\)`

--

This is a recursive problem: the unknown value function appears on both sides of the equation, which can be hard to think about at first

--

One of the workhorse solution methods exploits this recursion and the contraction mapping properties of the Bellman operator to solve for `\(V(x)\)`

---

# Solution properties

Under standard assumptions we have that

1. The solution to the Bellman equation, `\(V(x)\)`, is strictly concave
2. The solution is approached in the limit as `\(j \rightarrow \infty\)` by iterations on `\(V_{j+1}(x) = \max_{u} r(x,u) + \beta V_j(x')\)`, given any bounded and continuous `\(V_0\)` and our transition equation
3. There exists a unique and time-invariant optimal policy function `\(u_t = h(x_t)\)`, where `\(h\)` maximizes the right-hand side of the Bellman equation
4. The value function `\(V(x)\)` is differentiable
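---

# Solution properties: value function iteration sketch

Property 2 is the basis of value function iteration. Here is a minimal Python/NumPy sketch for a made-up growth problem (log payoff, next period's capital chosen on a grid, hypothetical parameters), purely illustrative:

```python
import numpy as np

# Made-up problem: payoff r(k, k') = log(k^alpha - k'), control is next period's capital k'
alpha, beta = 0.3, 0.95
kgrid = np.linspace(0.05, 0.5, 200)                 # discretized state space
C = kgrid[:, None] ** alpha - kgrid[None, :]        # consumption implied by each (k, k') pair
R = np.where(C > 0, np.log(np.maximum(C, 1e-12)), -1e10)  # payoff; infeasible choices get a huge penalty

V = np.zeros(len(kgrid))                            # arbitrary bounded initial guess V_0
for j in range(1000):
    V_new = np.max(R + beta * V[None, :], axis=1)   # V_{j+1}(k) = max_{k'} r(k,k') + beta V_j(k')
    if np.max(np.abs(V_new - V)) < 1e-8:            # stop when the sup-norm change is tiny
        V = V_new
        break
    V = V_new

policy = kgrid[np.argmax(R + beta * V[None, :], axis=1)]  # h(k): optimal k' at each grid point
```

Starting from the arbitrary bounded guess `\(V_0 = 0\)`, the iterates converge, and the maximizer at the limit gives the time-invariant policy `\(h\)`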
---

# Euler equations

.hi-blue[Euler equations] are dynamic efficiency conditions: they equalize the marginal effects of an optimal policy over time

E.g. equate the current marginal benefit (energy from burning fossil fuels) with the future marginal cost (global warming)

--

Example:

1. We have a stock of capital `\(K_t\)` that depreciates at rate `\(\delta \in (0,1)\)`
2. We can invest `\(I_t\)` to increase our future capital, at cost `\(c(I_t)\)` and with effectiveness `\(\gamma \in (0,1]\)`
3. Per-period payoff `\(u(Y_t)\)` from consuming output `\(Y_t = f(K_t) = K_t\)`
4. Discount factor is `\(\beta \in (0,1)\)`

---

# Euler equations

The Bellman equation is
`\begin{align} V(K_t) &= \max_{I_t} \left\{ u(K_t) - c(I_t) + \beta V(K_{t+1}) \right\} \notag \\ &\text{subject to: } \,\,\,\, K_{t+1} = (1 - \delta) K_t + \gamma I_t \notag \end{align}`

--

The FOC with respect to investment is
`$$c_I(I_t) = \beta \, \gamma \, V_K(K_{t+1})$$`

--

The envelope theorem gives us (since `\(\partial K_{t+1}/\partial K_t = 1-\delta\)`)
`$$V_K(K_t) = u_K(K_t) + \beta \, (1-\delta) \, V_K(K_{t+1})$$`

---

# Euler equations

The FOC with respect to investment is
`$$c_I(I_t) = \beta \, \gamma \, V_K(K_{t+1})$$`

The envelope theorem gives us
`$$V_K(K_t) = u_K(K_t) + \beta \, (1-\delta) \, V_K(K_{t+1})$$`

--

Advance both by one period, since they must hold for all `\(t\)`:
`\begin{gather} c_I(I_{t+1}) = \beta \, \gamma \, V_K(K_{t+2}) \notag\\ V_K(K_{t+1}) = u_K(K_{t+1}) + \beta \, (1-\delta) \, V_K(K_{t+2}) \notag \end{gather}`

---

# Euler equations

Substitute the time `\(t\)` and time `\(t+1\)` FOCs into our time `\(t+1\)` envelope condition:
`$$\frac{c'(I_t)}{\beta \, \gamma} = u'(K_{t+1}) + \beta \, (1-\delta) \frac{c'(I_{t+1})}{\beta \, \gamma}$$`
`$$\Rightarrow c'(I_t) = \beta \left[ \gamma \, u'(K_{t+1}) + (1-\delta) \, c'(I_{t+1}) \right]$$`

--

LHS is the marginal cost of investment, RHS is the marginal benefit of investment .hi-blue[along an optimal path]

---

# Euler equations

`$$c'(I_t) = \beta \left[ \gamma \, u'(K_{t+1}) + (1-\delta) \, c'(I_{t+1}) \right]$$`

LHS: marginal cost of investment

RHS: marginal benefit of higher utility from more future output, and lower future investment cost because of the higher capital stock

---

# Euler equations: no-arbitrage

Euler equations are .hi-blue[no-arbitrage conditions]

Suppose we're on the optimal capital path and want to deviate by cutting back investment

--

This yields a marginal benefit today: we save some investment cost

--

There are two costs associated with it:

--

1. Lower utility tomorrow because we will have a smaller capital stock
2. Greater investment cost tomorrow to return to the optimal capital trajectory

---

# Euler equations: no-arbitrage

If this deviation (or deviating by investing more today) were profitable, we would do it `\(\rightarrow\)` the optimal policy must leave zero additional profit opportunities: this is what the Euler equation defines
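---

# Euler equations: a steady state check

A small Python sketch of the Euler equation at a steady state. In a steady state `\(K_{t+1} = K_t\)`, so investment must satisfy `\(I^* = \delta K^*/\gamma\)`; with made-up functional forms `\(u(K) = \ln K\)` and `\(c(I) = I^2/2\)` and hypothetical parameters (none of these are from the lecture), the Euler equation pins down `\(K^*\)`:

```python
import numpy as np
from scipy.optimize import brentq

# Hypothetical primitives: u(K) = log(K) => u'(K) = 1/K;  c(I) = I^2/2 => c'(I) = I
beta, gamma, delta = 0.95, 0.8, 0.1

def euler_residual(K):
    """c'(I*) - beta*[gamma*u'(K) + (1-delta)*c'(I*)] with I* = delta*K/gamma."""
    I = delta * K / gamma
    return I - beta * (gamma / K + (1 - delta) * I)

K_star = brentq(euler_residual, 1e-3, 100.0)   # root of the steady-state Euler residual
K_closed = gamma * np.sqrt(beta / (delta * (1 - beta * (1 - delta))))
print(K_star, K_closed)                        # the numerical root matches the closed form
```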
---

# Basic theory

Here we finish up the basic theory pieces we need

We will focus on deterministic problems, but this easily ports to stochastic problems

--

Consider an infinite horizon problem for an economic agent

1. The payoff `\(r(s_t,u_t)\)` in some period `\(t\)` is a function of the state vector `\(s_t\)` and control vector `\(u_t\)`
2. Transition equations are `\(s_{t+1} = g(s_t,u_t)\)`
3. Assume that `\(u \in U\)` and `\(s \in S\)`
4. The payoff `\(r(s_t,u_t)\)` is bounded

---

# Basic theory

Here the current state vector *completely* summarizes all the information of the past and is all the information the agent needs to make a forward-looking decision `\(\rightarrow\)` our problem has the Markov property

--

Final two pieces:

1. Stationarity: the problem does not depend explicitly on time
2. Discounting: `\(\beta \in (0,1)\)`, the future matters but not as much as today

Discounting and bounded payoffs ensure that total value is bounded

---

# Basic theory

Represent this payoff as
`$$\sum_{t=0}^\infty \beta^t r(s_t,u_t)$$`

--

The value of the maximized discounted stream of payoffs is
`\begin{gather} V(s_0) = \max_{u_0 \in U(s_0)} r(s_0,u_0) + \beta \left[\max_{\{u_t\}_{t=1}^\infty} \sum_{t=1}^\infty \beta^{t-1} r(s_t,u_t)\right] \notag \\ \text{subject to: } s_{t+1} = g(s_t,u_t) \notag \end{gather}`

--

The term inside the square brackets is the maximized discounted stream of payoffs beginning at state `\(s_1\)`

---

# Basic theory

This means the problem can be written recursively as
`\begin{gather} V(s_0) = \max_{u_0 \in U(s_0)} r(s_0,u_0) + \beta\,V(s_1) \\ \text{subject to: } s_1 = g(s_0,u_0) \end{gather}`
which is our Bellman equation (we just exploited Bellman's principle of optimality)

---

# Value function existence and uniqueness

Reformulate the problem as
`$$V(s) = \max_{s' \in \Gamma(s)} r(s,s') + \beta\,V(s'), \,\,\, \forall s \in S$$`
where `\(\Gamma(s)\)` is our set of feasible states next period

--

There exists a solution to the Bellman under a (particular) set of sufficient conditions:

--

If the following are true

--

1. `\(r(s,s')\)` is real-valued, continuous, and bounded
2. `\(\beta \in (0,1)\)`
3. the correspondence of feasible states for next period, `\(\Gamma(s)\)`, is non-empty, compact-valued, and continuous

--

then there exists a unique value function `\(V(s)\)` that solves the Bellman equation

---

# Intuitive sketch of the proof

Define an operator `\(T\)` as
`$$T(W)(s) = \max_{s' \in \Gamma(s)} r(s,s') + \beta\,W(s'), \,\,\, \forall s \in S$$`

--

This operator takes some value function `\(W(s)\)`, performs the maximization, and returns another function, `\(T(W)(s)\)`

--

It is easy to see that any `\(V(s)\)` that satisfies `\(V(s) = T(V)(s) \,\,\, \forall s \in S\)` solves the Bellman equation

--

Therefore we simply search for the .hi-blue[fixed point] of `\(T(W)\)` to solve our dynamic problem, but how do we find the fixed point?

--

First we must show that a way exists by showing that `\(T(W)\)` is a .hi-blue[contraction]: as we iterate using the `\(T\)` operator, we will get closer and closer to the fixed point

---

# Intuitive sketch of the proof

Blackwell's sufficient conditions for a contraction are

--

1. Monotonicity: if `\(W(s) \geq Q(s) \,\,\, \forall s \in S\)`, then `\(T(W)(s) \geq T(Q)(s) \,\,\, \forall s \in S\)`

--

2. Discounting: there exists a `\(\beta \in (0,1)\)` such that `\(T(W+k)(s) \leq T(W)(s) + \beta\,k\)` for any constant `\(k \geq 0\)`

--

Monotonicity holds under our maximization

--

Discounting reflects that we must be discounting the future

--

If these two conditions hold then we have a contraction with modulus `\(\beta\)`

--

Why do we care that this is a contraction?
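---

# Intuitive sketch of the proof: checking the contraction

A quick numerical sanity check (Python/NumPy, with a made-up bounded payoff matrix standing in for `\(r(s,s')\)`): applying the Bellman operator to two arbitrary bounded functions shrinks their sup-norm distance by at least the factor `\(\beta\)`:

```python
import numpy as np

# Made-up discretized problem: R[i, j] stands in for r(s_i, s'_j)
rng = np.random.default_rng(0)
beta, n = 0.95, 50
R = rng.normal(size=(n, n))        # bounded (finite) payoffs

def T(W):
    """Bellman operator on the grid: T(W)(s_i) = max_j R[i, j] + beta * W[j]."""
    return np.max(R + beta * W[None, :], axis=1)

W = rng.normal(size=n)             # two arbitrary bounded "value functions"
Q = rng.normal(size=n)

lhs = np.max(np.abs(T(W) - T(Q)))  # sup-norm distance after one application of T
rhs = beta * np.max(np.abs(W - Q)) # beta times the original sup-norm distance
print(lhs <= rhs + 1e-12)          # True: T is a contraction with modulus beta
```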
---

# Intuitive sketch of the proof

So we can take advantage of the contraction mapping theorem, which states:

--

1. `\(T\)` has a unique fixed point
2. `\(T(V^*) = V^*\)`
3. We can start from any arbitrary initial function `\(W\)`, iterate using `\(T\)`, and reach the fixed point

---

# Next up

## Next: numerical methods for discrete time dynamics