1 Consumer and Producer Theory
This section is still in development.
The definitions and theorems below can also be used for courses on:
- calculus,
- linear algebra,
- optimization, and
- real analysis.
1.1 Linear Algebra
1.1.1 Basic Notions of Linear Algebra
Vectors are always columns, but it is customary to write them as rows to save space.
Definition 1.1 (Homogeneous System) Consider a linear system \(Ax = b\), where \(A\) is \(m\times n\), \(x \in \mathscr{R}^n\) and \(b \in \mathscr{R}^m\). If \(b = 0_{m\times 1}\), the system is said to be homogeneous, \[ \text{i.e.}\quad Ax = 0, \] and there is at least one solution, given by \(x = 0_{n\times 1}\).
Proposition 1.1 If \(b \neq 0\), there need not exist \(x\) such that \(Ax = b\).
Definition 1.2 (Linear Combination) \(y \in \mathscr{R}^m\) is a linear combination of \(x_1, \dots, x_n \in \mathscr{R}^m\) if there exist \(\alpha_1, \dots, \alpha_n \in \mathscr{R}\) such that \(\displaystyle y = \sum_{i = 1}^n \alpha_i x_i\).
Example 1.1 Consider three matrices \(A_{m\times n}\), \(B_{k\times m}\) and \(C_{k\times n}\) such that \(BA = C\).
Observe that:
- rows of \(C\) are linear combinations of rows of \(A\), and
- columns of \(C\) are linear combinations of columns of \(B\).
These statements are in fact equivalent since \(BA = C\) if and only if \(A^T B^T = C^T\).
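The observation in Example 1.1 is easy to confirm numerically. Below is a minimal sketch, assuming Python with numpy is available (the matrices are arbitrary illustrations, not part of the notes):

```python
# For C = BA: row i of C is the combination of the rows of A with weights
# B[i, :], and column j of C is the combination of the columns of B with
# weights A[:, j].
import numpy as np

rng = np.random.default_rng(0)
B = rng.integers(-3, 4, size=(2, 3))   # B is k x m
A = rng.integers(-3, 4, size=(3, 4))   # A is m x n
C = B @ A                              # C is k x n

for i in range(C.shape[0]):
    combo = sum(B[i, j] * A[j, :] for j in range(A.shape[0]))
    assert np.array_equal(C[i, :], combo)   # rows of C from rows of A

for j in range(C.shape[1]):
    combo = sum(A[l, j] * B[:, l] for l in range(A.shape[0]))
    assert np.array_equal(C[:, j], combo)   # columns of C from columns of B
```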
1.1.2 Elementary Row Operations and RREF
Definition 1.3 (Elementary Row Operations) Elementary row operations are simple operations that allow one to transform a system of linear equations into an equivalent system.
There are three elementary row operations:
- switching rows,
- multiplying a row with a non-zero constant, and
- replacing a row with the sum of that row and another row.
Theorem 1.1 Let \(E_{n\times n}\) be an elementary matrix, i.e., a matrix obtained from \(I_n\) by a single elementary row operation. Then
- \(E\) is invertible.
- There exist elementary matrices \(E_1, \dots, E_k\) such that \(E^{-1} = E_1 \cdots E_k\).
Definition 1.4 (Row Reduced Echelon Form) Applying the three elementary row operations sequentially to any given matrix, we can find its row reduced echelon form (RREF).
The RREF of a matrix has the following properties:
- all the zero rows are at the bottom,
- the first non-zero entry of each non-zero row is 1,
- if the first non-zero entry of a row occurs on column \(j\) and if the first non-zero entry of the next row occurs on column \(j'\), then \(j < j'\), and
- if the first non-zero entry of a row occurs on column \(j\), then all the earlier rows have zeros on column \(j\).
Theorem 1.2 Every matrix has a unique RREF.
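Assuming sympy is available, `Matrix.rref()` computes the RREF of Definition 1.4 together with the pivotal columns; by Theorem 1.2 the output is unique. A small sketch (the matrix is an arbitrary illustration):

```python
from sympy import Matrix

A = Matrix([[1, 2, 1, 1],
            [2, 4, 0, 6],
            [1, 2, 2, -1]])

R, pivots = A.rref()
print(R)       # the RREF A' of A: [[1, 2, 0, 3], [0, 0, 1, -2], [0, 0, 0, 0]]
print(pivots)  # indices of the pivotal columns, here (0, 2)
```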
1.1.3 Rank
Definition 1.5 (Rank) The rank of \(A_{m\times n}\) is the number of non-zero rows in its RREF \(A'\).
Definition 1.6 (Pivotal Column) A pivotal column is one which contains the first non-zero entry of some row.
Corollary 1.1 If \(A\) is \(m\times n\), then \(\text{rk}A = \text{rk}A' \leq \min\{m, n\}\).
Also, for a given \(A_{m\times n}\):
- \(\text{rk}A = m\) if and only if \(A'\) has no zero rows.
- \(\text{rk}A = n\) if and only if all columns of \(A'\) are pivotal.
Theorem 1.3 Let \(A\) be \(m \times n\). Then \(\text{rk}A = m\) if and only if, \(\forall b \in \mathscr{R}^m\), \(\exists x \in \mathscr{R}^n\) such that \(Ax = b\). Namely, there is at least one solution to the system.
Theorem 1.4 Let \(A\) be \(m \times n\). Then \(\text{rk}A = n\) if and only if, \(\forall b \in \mathscr{R}^m\), if \(Ax = Ay = b\), then \(x = y\). Namely, there is at most one solution to the system.
Corollary 1.2 Let \(A\) be \(m\times n\). For each \(b \in \mathscr{R}^m\), \(Ax = b\) has a unique solution in \(\mathscr{R}^n\) if and only if \(\text{rk}A = m = n\).
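The rank conditions of Theorems 1.3 and 1.4 can be seen numerically. A sketch assuming numpy (the matrix and vectors are illustrative):

```python
# A is 3 x 2 with rk A = 2 = n < m: at most one solution for each b,
# but existence can fail for some b.
import numpy as np

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(np.linalg.matrix_rank(A))               # 2

b = A @ np.array([2.0, -1.0])                 # b chosen inside R(A)
x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(A @ x, b))                  # True: the (unique) solution

b_bad = np.array([1.0, 1.0, 0.0])             # rk A < m: not every b works
x_bad, *_ = np.linalg.lstsq(A, b_bad, rcond=None)
print(np.allclose(A @ x_bad, b_bad))          # False: no exact solution
```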
1.1.4 Subspaces
Definition 1.7 (Subspace) A non-empty set \(S \subseteq \mathscr{R}^m\) is a subspace of \(\mathscr{R}^m\) if \(\alpha x + \beta y \in S\) whenever \(x, y \in S\) and \(\alpha, \beta \in \mathscr{R}\).
Proposition 1.2 The intersection of members of an arbitrary family of subspaces is a subspace. The union of subspaces need not be a subspace.
Definition 1.8 (Range) The range of \(A_{m\times n}\) is the set \[ R(A) = \left\{ b \in \mathscr{R}^m : \exists x \in \mathscr{R}^n,\ Ax = b \right\}. \]
Definition 1.9 (Null Space) The null space of \(A_{m\times n}\) is the set \[ N(A) = \left\{ x \in \mathscr{R}^n : Ax = 0 \right\}. \]
Proposition 1.3 For any \(A_{m\times n}\), the following propositions hold:
- \(N(A) = N(A')\),
- \(R(A)\) need not equal \(R(A')\),
- \(N(A) \subseteq N(BA)\),
- \(R(BA) \subseteq R(B)\),
- \(R(A) = \mathscr{R}^m \iff \text{rk}A = m\), and
- \(N(A) = \{0\} \iff \text{rk}A = n\).
Definition 1.10 (Linear Span) Vectors \(a_1, \dots, a_n \in S\) span \(S\) if, for every \(x \in S\), there exist \(\alpha_1, \dots, \alpha_n \in \mathscr{R}\) such that \(\displaystyle x = \sum_{i = 1}^n \alpha_i a_i\).
Namely, each \(x\) is a linear combination of \(a_1, \dots, a_n\).
Definition 1.11 (Linear Independence) Vectors \(a_1, \dots, a_n\) in \(\mathscr{R}^m\) are linearly independent if, for all \(\alpha_1, \dots, \alpha_n \in \mathscr{R}\), \[ \sum_{i=1}^n \alpha_i a_i = 0 \implies \alpha_1 = \cdots = \alpha_n = 0. \]
A linearly independent set cannot contain \(0\).
Proposition 1.4 For any given matrix, the following propositions hold:
- the non-zero rows of any matrix in RREF are linearly independent,
- all subsets of a linearly independent set are also linearly independent, and
- a set \(\{a_1, \dots, a_n\}\) in \(\mathscr{R}^m\) is linearly independent if and only if \(N(A_{m\times n}) = \{0_{n\times 1}\}\), where \(\displaystyle A = \left[ a_1 \mid \cdots \mid a_n \right]_{m\times n}\) is formed by merging members of \(\{a_1, \dots, a_n\}\) as columns.
Lemma 1.1 Suppose \(a_1 \neq 0\). Then \(\{a_1, \dots, a_n\} \subset \mathscr{R}^m\) is linearly dependent if and only if, for some \(k \in \{2, \dots, n\}\), \(a_k\) is a linear combination of \(a_1, \dots, a_{k-1}\).
Definition 1.12 (Basis) Let \(S\) be a subspace of \(\mathscr{R}^m\). A set \(\{a_1, \dots, a_n\} \subseteq S\) is a basis for \(S\) if it spans \(S\) and is linearly independent.
Theorem 1.5 Let \(S\) be a subspace of \(\mathscr{R}^m\) with basis \(\{a_1, \dots, a_n\}\). If \(\{b_1, \dots, b_r\} \subset S\) is linearly independent, then \(r \leq n\).
Corollary 1.3 Any \(m + 1\) vectors in \(\mathscr{R}^m\) are linearly dependent.
Corollary 1.4 If \(\{a_1, \dots, a_n\}\) and \(\{b_1, \dots, b_r\}\) are two bases for \(S\), then \(n = r\).
Definition 1.13 (Dimension) The dimension of a subspace \(S\) is the number of elements in any basis for \(S\).
Theorem 1.6 (Fundamental Theorem of Linear Algebra I) Let \(A\) be \(m\times n\). Then
- \(\dim R(A^T) = \text{rk}A\),
- \(\text{rk}A = \text{rk}A^T\), and
- \(\dim N(A) = n - \text{rk}A\).
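Theorem 1.6 is easy to verify numerically. A sketch assuming numpy and scipy (the matrix is illustrative):

```python
# rk A = rk A^T, and dim N(A) = n - rk A (rank-nullity).
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])   # 3 x 3 with rank 2

r = np.linalg.matrix_rank(A)
print(r, np.linalg.matrix_rank(A.T))           # 2 2: rk A = rk A^T
print(null_space(A).shape[1], A.shape[1] - r)  # 1 1: dim N(A) = n - rk A
```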
1.1.5 Orthogonality
Definition 1.14 (Orthogonal Complement) Let \(S\) be a subspace of \(\mathscr{R}^m\). The orthogonal complement of \(S\) is the set \[ S^\perp = \left\{ y \in \mathscr{R}^m : y \cdot x = 0,\ \forall x \in S \right\}. \]
Lemma 1.2 For any subspace \(S\) of \(\mathscr{R}^m\), \(S \cap S^\perp = \{0\}\).
Theorem 1.7 (Fundamental Theorem of Linear Algebra II) Let \(A\) be \(m\times n\). Then
- \(R(A) = N(A^T)^\perp\), and
- \(R(A)^\perp = N(A^T)\).
Corollary 1.5 For any subspace \(S\) of \(\mathscr{R}^m\), \((S^\perp)^\perp = S\).
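Theorem 1.7 also admits a quick numerical check (a sketch, assuming numpy and scipy; the matrix is illustrative): every element of \(N(A^T)\) is orthogonal to every element of \(R(A)\).

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
Z = null_space(A.T)                      # orthonormal basis of N(A^T)

x = np.random.default_rng(1).standard_normal(2)
print(np.allclose(Z.T @ (A @ x), 0.0))   # True: N(A^T) is orthogonal to R(A)
```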
1.1.6 Inverse
Theorem 1.8 Let \(A\) and \(C\) be both \(n\times n\). If \(AC = I\), then \(\text{rk}A = \text{rk}C = n\).
Theorem 1.9 Let \(A\) and \(C\) be both \(n\times n\). If \(AC = I\), then \(CA = I\).
Definition 1.15 (Invertible Matrix) \(A_{n\times n}\) is invertible if there exists \(C_{n\times n}\) such that \(AC = CA = I\).
Theorem 1.10 If \(AC = CA = I\) and if \(AB = BA = I\), then \(C = B\).
Remark. If \(A\) and \(B\) are invertible, then \(AB\) is invertible as well.
Remark. If \(A\) is invertible, then so is \(A^T\) and \((A^T)^{-1} = (A^{-1})^T\).
1.2 Real Analysis
1.2.1 Basic Notions of Topology
Definition 1.16 (Metric) Let \(X\) be a set. A function \(d : X \times X \to \mathscr{R}\) is a metric on \(X\) if, for all \(x, y, z \in X\), the following conditions hold:
- \(d(x,y) \geq 0\),
- \(d(x,y) = 0 \iff x = y\),
- \(d(x, y) = d(y, x)\), and
- \(d(x, y) \leq d(x, z) + d(z, y)\).
Definition 1.17 (Metric Space) A metric space is a pair \((X, d)\) where \(X\) is a set and \(d\) is a metric on \(X\).
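The axioms of Definition 1.16 can be checked numerically for the Euclidean metric. A minimal sketch, assuming Python with numpy (the sample points are arbitrary):

```python
import numpy as np

def d(x, y):
    return np.linalg.norm(x - y)   # the Euclidean metric on R^3

rng = np.random.default_rng(0)
for _ in range(100):
    x, y, z = rng.standard_normal((3, 3))
    assert d(x, y) >= 0                              # non-negativity
    assert np.isclose(d(x, x), 0.0)                  # d(x, x) = 0
    assert np.isclose(d(x, y), d(y, x))              # symmetry
    assert d(x, y) <= d(x, z) + d(z, y) + 1e-12      # triangle inequality
```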
Definition 1.18 (Open Ball) Let \((X, d)\) be a metric space. For any \(x \in X\) and \(\varepsilon > 0\), \(B(x, \varepsilon) = \{ y \in X : d(x, y) < \varepsilon \}\) is the open ball centered at \(x\) with radius \(\varepsilon\).
Definition 1.19 (Interior) Let \(S \subseteq X\). \(x\) is an interior point of \(S\) if \(B(x, \varepsilon) \subset S\) for some \(\varepsilon > 0\). The set of all interior points of \(S\) is the interior of \(S\).
Remark. \[ \text{int}(S) \subseteq S \]
Definition 1.20 (Open Set) \(S\) is open if \(S = \text{int}(S)\).
Remark. \(\varnothing\), \(\text{int}(S)\) and \(B(x, \varepsilon)\) are open.
Definition 1.21 (Closure) Let \(S \subseteq X\). \(x\) is a closure point of \(S\) if \(B(x, \varepsilon) \cap S \neq \varnothing\) for every \(\varepsilon > 0\). The set of all closure points of \(S\) is the closure of \(S\).
Remark. \[ S \subseteq \text{cl}(S) \]
Definition 1.22 (Closed Set) \(S\) is closed if \(S = \text{cl}(S)\).
Remark. \(\varnothing\) and \(\text{cl}(S)\) are closed.
Theorem 1.11 \(S\) is closed if and only if \(X \setminus S\) is open.
Definition 1.23 (Boundary) Let \(S \subseteq X\). \(x\) is a boundary point of \(S\) if \(B(x, \varepsilon) \cap S \neq \varnothing\) and \(B(x, \varepsilon) \cap (X \setminus S) \neq \varnothing\) for every \(\varepsilon > 0\). The set of all boundary points of \(S\) is the boundary of \(S\).
Remark.
- \(\partial S = \partial(X \setminus S)\).
- \(\partial S \subseteq \text{cl}(S)\).
- \(S\) is closed if and only if \(\partial S \subseteq S\).
- \(\partial S\) is a closed set.
Definition 1.24 (Metric Subspace) Let \((X, d)\) be a metric space and \(S \subseteq X\). \(d\) is a metric on \(S\) and therefore \((S, d)\) is a metric space as well, usually referred to as a metric subspace of \((X, d)\).
Theorem 1.12 Let \((S, d)\) be a metric subspace of \((X, d)\).
- \(A \subseteq S\) is open in \((S, d)\) if and only if there exists an open set \(U\) in \((X, d)\) such that \(A = S \cap U\).
- \(A \subseteq S\) is closed in \((S, d)\) if and only if there exists a closed set \(U\) in \((X, d)\) such that \(A = S \cap U\).
1.2.2 Sequences
Definition 1.25 (Sequence) Let \((X, d)\) be a metric space. A sequence in \((X, d)\) is a map \(f : \mathscr{N}\to X\).
Definition 1.26 (Convergent Sequence) A sequence \(\{x_k\}\) is said to be convergent if the following property holds: there exists some \(x \in X\) such that, for every \(\varepsilon > 0\), there exists some \(l \in \mathscr{N}\) such that, for every \(k > l\), \(x_k \in B(x, \varepsilon)\). Such \(x\) is called the limit of \(\{x_k\}\).
Theorem 1.13 If \(\lim x_k\) exists, it is unique.
Definition 1.27 (Subsequence) Let \(\{x_k\}\) be a sequence. A subsequence of \(\{x_k\}\) is a sequence obtained by deleting some (possibly none, possibly infinitely many) members of \(\{x_k\}\). Namely, let \(\{k_n\}\) be a strictly increasing sequence of indices. Then \(\{x_{k_{n}}\}\) is a subsequence of \(\{x_k\}\).
Theorem 1.14 \(\{x_k\}\) is convergent with a limit \(x\) if and only if every subsequence of \(\{x_k\}\) is convergent with limit \(x\).
Theorem 1.15 Let \((X, d)\) be a metric space and \(S \subseteq X\). \(x \in \text{cl}(S)\) if and only if there exists a sequence \(\{x_k\}\) such that \(x_k \in S\) for all \(k\), and \(\lim x_k = x\).
Theorem 1.16 \(S\) is closed if and only if every convergent sequence in \(S\) converges to an element of \(S\).
Definition 1.28 (Bounded Sequence) Let \((X, d)\) be a metric space. A sequence \(\{x_k\}\) in \(X\) is bounded if it is contained in a ball with finite radius, i.e., if for some \(\varepsilon > 0\) and \(y \in X\), \(x_k \in B(y, \varepsilon)\) for every \(k\).
Theorem 1.17 (Bolzano–Weierstrass) Every bounded sequence in \(\mathscr{R}^m\) has a convergent subsequence.
Definition 1.29 (Cauchy Sequence) Let \((X, d)\) be a metric space. A sequence \(\{x_k\}\) in \(X\) is a Cauchy sequence if, for every \(\varepsilon > 0\), there exists a positive integer \(k_\varepsilon\) such that, whenever \(k, l > k_\varepsilon\), \(d(x_k, x_l) < \varepsilon\).
Theorem 1.18 If \(\{x_k\}\) is convergent, then \(\{x_k\}\) is Cauchy. If \(\{x_k\}\) is Cauchy, then \(\{x_k\}\) is bounded.
Definition 1.30 (Completeness) Let \((X, d)\) be a metric space. \((X, d)\) is complete if every Cauchy sequence in \(X\) converges to an element of \(X\).
Theorem 1.19 Let \((X, d)\) be complete and \(S \subseteq X\). \((S, d)\) is complete if and only if \(S\) is a closed subset of \(X\).
Definition 1.31 (Open Cover) Let \((X, d)\) be a metric space and \(S \subseteq X\). A collection of open sets \(O_i\), \(i \in I\), is an open cover for \(S\) if \(\displaystyle S \subset \bigcup_{i \in I} O_i\).
Definition 1.32 (Finite Subcover) Let \(O_i\), \(i \in I\), be an open cover for \(S\). If \(I_0\) is a finite subset of \(I\) and \(\displaystyle S \subset \bigcup_{i \in I_0} O_i\), then the collection \(O_i\), \(i \in I_0\), is a finite subcover of \(S\).
Definition 1.33 (Compactness) \(S\) is compact if every open cover of \(S\) has a finite subcover.
Theorem 1.20 (Compactness) A compact set is bounded and closed.
Theorem 1.21 (Sequential Compactness) \(S\) is compact if and only if every sequence in \(S\) has a subsequence that converges to a point in \(S\).
Theorem 1.22 (Heine-Borel) A subset of \(\mathscr{R}^m\) is compact if and only if it is closed and bounded.
1.2.3 Continuity
Definition 1.34 (Continuity) Let \((X, d)\) and \((Y, \sigma)\) be metric spaces. A function \(f : X \to Y\) is continuous at \(x \in X\) if, for every \(\varepsilon > 0\), there exists \(\delta > 0\) such that, whenever \(x' \in B(x, \delta)\), \(f(x') \in B(f(x), \varepsilon)\).
Definition 1.35 (Global Continuity) \(f\) is continuous if it is continuous at \(x\) for every \(x \in X\).
Theorem 1.23 \(f\) is continuous at \(x\) if and only if for every sequence \(\{x_k\}\) in \(X\), whenever \(\lim x_k = x\), \(\lim f(x_k) = f(x)\).
Definition 1.36 (Image) Let \((X, d)\) and \((Y, \sigma)\) be metric spaces and \(f : X \to Y\). If \(S \subseteq X\), then \(f(S) = \{ f(x) \in Y : x \in S \}\).
Definition 1.37 (Inverse Image) Let \((X, d)\) and \((Y, \sigma)\) be metric spaces and \(f : X \to Y\). If \(S \subseteq Y\), then \(f^{-1}(S) = \{ x \in X : f(x) \in S \}\).
Theorem 1.24 The following are equivalent:
- \(f\) is continuous.
- \(f^{-1}(S)\) is open if \(S\) is open.
- \(f^{-1}(S)\) is closed if \(S\) is closed.
Definition 1.38 (Upper Semicontinuous) Let \((X, d)\) be a metric space. A function \(f : X \to \mathscr{R}\) is upper semicontinuous if \(\{x \in X : f(x) \geq \alpha \}\) is closed for every \(\alpha \in \mathscr{R}\).
Definition 1.39 (Lower Semicontinuous) Let \((X, d)\) be a metric space. A function \(f : X \to \mathscr{R}\) is lower semicontinuous if \(\{x \in X : f(x) \leq \alpha \}\) is closed for every \(\alpha \in \mathscr{R}\).
Theorem 1.25 \(f : X \to \mathscr{R}\) is continuous if and only if \(f\) is upper semicontinuous and lower semicontinuous.
Definition 1.40 (Maximizer / Minimizer) Let \((X, d)\) be a metric space and \(S \subseteq X\). A function \(f : S \to \mathscr{R}\) has a maximizer if for some \(x^* \in S\), \(f(x) \leq f(x^*)\) for every \(x \in S\). Similarly, \(f\) has a minimizer if for some \(x_* \in S\), \(f(x) \geq f(x_*)\) for every \(x \in S\).
Theorem 1.26 (Weierstrass) Let \(S\) be compact and \(f : S \to \mathscr{R}\) be continuous. Then \(f\) has a maximizer and a minimizer.
Theorem 1.27 Let \(S\) be compact and \(f : S \to \mathscr{R}\) be upper semicontinuous. Then \(f\) has a maximizer.
Theorem 1.28 Let \(S\) be compact and \(f : S \to \mathscr{R}\) be lower semicontinuous. Then \(f\) has a minimizer.
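A crude numerical sketch of Theorem 1.26, assuming numpy (the function is an arbitrary illustration): a continuous function on the compact interval \([0, 4]\) attains its extrema, and a fine grid locates them approximately.

```python
import numpy as np

f = lambda x: np.sin(3 * x) + 0.5 * x

xs = np.linspace(0.0, 4.0, 100_001)     # fine grid on the compact set [0, 4]
vals = f(xs)
print(xs[vals.argmin()], vals.min())    # approximate minimizer x_* in [0, 4]
print(xs[vals.argmax()], vals.max())    # approximate maximizer x^* in [0, 4]
```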
Theorem 1.29 (Separation) Let \((X, d)\) be a metric space. If \(S\) and \(T\) are disjoint and closed subsets of \(X\), then there exist disjoint open sets \(U\) and \(V\) such that \(U \supset S\) and \(V \supset T\).
1.3 Differentiable Functions
Definition 1.41 (Differentiability) Let \(S \subseteq \mathscr{R}\), \(x \in S\) and \(f : S \to \mathscr{R}\). Then \(f\) is differentiable at \(x\) if \[ \lim_{u \to 0} \frac{f(x + u) - f(x)}{u} \] exists.
Namely, \(f\) is differentiable at \(x\) if there is some \(y \in \mathscr{R}\) such that, for every \(\varepsilon > 0\), there exists some \(\delta > 0\) such that, whenever \(x + u \in B_\delta(x) \cap S\) and \(u \neq 0\), \[ \left\lvert \frac{f(x + u) - f(x)}{u} - y \right\rvert < \varepsilon. \] In that case, we write \(y = f'(x)\).
Definition 1.42 (Partial Derivative) Let \(S \subseteq \mathscr{R}^n\), \(x \in S\) and \(f : S \to \mathscr{R}\). The \(j\)th partial derivative of \(f\) at \(x\) is \[ D_j f(x) := \lim_{u \to 0} \frac{f(x + ue_j) - f(x)}{u} \] if this limit exists.
Definition 1.43 (Gradient) If \(D_j f(x)\) exists for all \(j = 1, \dots, n\), then the gradient of \(f\) at \(x\) is the vector \[ \nabla f(x) := \begin{bmatrix} D_1 f(x) \\ \vdots \\ D_n f(x) \end{bmatrix} \in \mathscr{R}^n. \]
Definition 1.44 (Jacobian) Let \(S \subseteq \mathscr{R}^n\), \(x \in S\) and \(f : S \to \mathscr{R}^m\). Suppose that \(D_j f_i(x)\) exists for all \(i\) and \(j\), then the Jacobian of \(f\) at \(x\) is the \(m\times n\) matrix \[ J_f(x) := \begin{bmatrix} D_1 f_1(x) & \cdots & D_n f_1(x) \\ \vdots & \ddots & \vdots \\ D_1 f_m(x) & \cdots & D_n f_m(x) \end{bmatrix}_{m\times n} = \begin{bmatrix} \nabla f_1(x)^T \\ \vdots \\ \nabla f_m(x)^T \end{bmatrix}. \]
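Partial derivatives and the Jacobian can be approximated by finite differences. A sketch assuming numpy (the function `f` and the step size are illustrative choices):

```python
# Forward-difference Jacobian per Definition 1.44, checked against the
# analytic Jacobian of f(x) = (x1*x2, x1 + x2^2).
import numpy as np

def f(x):
    return np.array([x[0] * x[1], x[0] + x[1] ** 2])

def jacobian_fd(f, x, h=1e-6):
    m, n = f(x).size, x.size
    J = np.empty((m, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (f(x + e) - f(x)) / h   # column j holds D_j f_i(x)
    return J

x = np.array([2.0, 3.0])
J_true = np.array([[x[1], x[0]], [1.0, 2 * x[1]]])
print(np.allclose(jacobian_fd(f, x), J_true, atol=1e-4))  # True
```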
Definition 1.45 (Differentiability) Let \(S \subseteq \mathscr{R}^n\). Then \(f : S \to \mathscr{R}^m\) is differentiable at \(x \in S\) if the following two conditions hold:
- \(D_j f_i (x)\) exists for all \(i\) and \(j\),
- we have \[ \lim_{u \to 0} \frac{\lVert f(x + u) - f(x) - J_f(x)u \rVert}{\lVert u \rVert} = 0. \]
The second condition holds if and only if, for every \(\varepsilon > 0\), there is some \(\delta > 0\) such that, whenever \(x + u \in B_\delta(x) \cap S\) and \(u \neq 0_{n\times 1}\), we have
\[ \frac{\lVert f(x + u) - f(x) - J_f(x)u \rVert}{\lVert u \rVert} < \varepsilon. \]
Theorem 1.30 Suppose \(S\subseteq\mathscr{R}^n\), \(x \in S\) and \(f : S \to \mathscr{R}^m\). Suppose that, for each \(i\) and \(j\), \(D_j f_i(x)\) exists and that \(x \mapsto D_j f_i(x)\) is continuous on \(S\). Then \(f\) is differentiable at \(x\).
Theorem 1.31 (Implicit Function) Let \(F : \mathscr{R}^2 \to \mathscr{R}\). Suppose that \(D_1F(x,y)\) and \(D_2F(x,y)\) exist and are continuous in an open \(U\) containing \((a, b) \in \mathscr{R}^2\). Suppose that \(F(a, b) = 0\) and that \(D_2F(a, b) \neq 0\). Then there exists \(\varepsilon > 0\) and \(g : B_\varepsilon (a) \to \mathscr{R}\) such that \(g(a) = b\), \(F(x, g(x)) = 0\) for all \(x \in B_\varepsilon (a)\) and \[ g'(a) = -\frac{D_1F(a,b)}{D_2F(a,b)}. \]
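A numerical sketch of Theorem 1.31, assuming numpy (the function \(F\) and the point are a hypothetical illustration): for \(F(x, y) = x^2 + y^2 - 1\) at \((a, b) = (0.6, 0.8)\), the implicit function is \(g(x) = \sqrt{1 - x^2}\) and the theorem predicts \(g'(a) = -D_1F(a,b)/D_2F(a,b) = -(2a)/(2b) = -0.75\).

```python
import numpy as np

a, b = 0.6, 0.8
g = lambda x: np.sqrt(1.0 - x ** 2)     # the implicit function on B_eps(a)

h = 1e-6
g_prime_fd = (g(a + h) - g(a - h)) / (2 * h)   # central difference
g_prime_ift = -(2 * a) / (2 * b)               # implicit function formula

print(g_prime_fd, g_prime_ift)                 # both approximately -0.75
```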
Theorem 1.32 (Implicit Function) Suppose \(F : \mathscr{R}^{n + m} \to \mathscr{R}^m\) is differentiable in an open set \(U\) containing \((a, b) \in \mathscr{R}^{n + m}\) and suppose that the entries of \(J_F (x,y)\) are continuous at each \((x,y) \in U\). Let \(A_{m\times n}\) and \(B_{m\times m}\) be defined by \(J_F (a, b)\) as \[ A = \begin{bmatrix} D_1 F_1(a,b) & \cdots & D_n F_1(a,b) \\ \vdots & \ddots & \vdots \\ D_1 F_m(a,b) & \cdots & D_n F_m(a,b) \end{bmatrix}_{m\times n} \\ \\ B = \begin{bmatrix} D_{n+1} F_1(a,b) & \cdots & D_{n+m} F_1(a,b) \\ \vdots & \ddots & \vdots \\ D_{n+1} F_m(a,b) & \cdots & D_{n+m} F_m(a,b) \end{bmatrix}_{m\times m} \] and suppose that \(F(a,b) = 0\) and \(B\) is non-singular. Then there exists \(\varepsilon > 0\) and \(g : B_\varepsilon (a) \to \mathscr{R}^m\) such that \(g(a) = b\), \(F(x, g(x)) = 0\) for all \(x \in B_\varepsilon (a)\) and \(J_g (a) = -B^{-1}_{m\times m} A_{m\times n}\).
Theorem 1.33 (Inverse Function) Suppose that \(f : \mathscr{R}^m \to \mathscr{R}^m\) is differentiable in an open set \(U\) containing \(b \in \mathscr{R}^m\). Suppose that the partials of \(f\) are continuous on \(U\). Suppose that \(f(b) = a\) and \(J_f(b)\) is non-singular. Then there exists some \(\varepsilon > 0\) and some \(g : B_\varepsilon (a) \to \mathscr{R}^m\) such that \(g(a) = b\), \(f(g(x)) = x\) for all \(x \in B_\varepsilon (a)\), and \(g\) is differentiable at \(a\) with \(J_g (a) = J_f (b)^{-1}\).
Theorem 1.34 (Weierstrass) Let \(S \subseteq \mathscr{R}^n\) be compact and let \(f : S \to \mathscr{R}\) be continuous. Then there exist \(x'\) and \(x''\) in \(S\) such that, for all \(x \in S\), \(f(x') \leq f(x) \leq f(x'')\).
Theorem 1.35 Suppose \(f : \mathscr{R}^n \to \mathscr{R}\) attains its maximum or minimum on an open set \(U\) at some \(x \in U\). If \(D_i f(x)\) exists, then \(D_i f(x) = 0\).
Lemma 1.3 ("Rolle's) Suppose \(f : [a, b] \to \mathscr{R}\) is continuous at each \(x \in [a, b]\) and \(f\) is differentiable at each \(x \in (a, b)\). Furthermore, suppose that \(f(a) = 0 = f(b)\). Then there exists some \(c \in (a, b)\) such that \(f'(c) = 0\).
Theorem 1.36 (Mean Value) Suppose that \(f : [a, b] \to \mathscr{R}\) is continuous at each \(x \in [a, b]\) and that \(f\) is differentiable at each \(x \in (a, b)\). Then there exists \(c \in (a, b)\) such that \[ f'(c) = \frac{f(b) - f(a)}{b - a}. \]
Theorem 1.37 (Mean Value) Let \(U\) be an open set in \(\mathscr{R}^n\) containing \(a\) and \(b\). Suppose that \((1 - t)a + tb \in U\) whenever \(t \in [0, 1]\). Suppose that \(f : U \to \mathscr{R}\) is differentiable at each \(x \in U\). Then there exists \(t' \in (0,1)\) such that \(\nabla f(c') \cdot (b - a) = f(b) - f(a)\) where \(c' = (1 - t')a + t'b\).
Definition 1.46 (Hessian Matrix) Let \(U\) be open set in \(\mathscr{R}^n\) and let \(f : U \to \mathscr{R}\). The Hessian of \(f\) at \(x \in U\) is the matrix \[ H_f (x) := \begin{bmatrix} D_1 D_1 f(x) & \cdots & D_n D_1 f(x) \\ \vdots & \ddots & \vdots \\ D_1 D_n f(x) & \cdots & D_n D_n f(x) \end{bmatrix}_{n \times n} \]
Theorem 1.38 Suppose \(U\) is open in \(\mathscr{R}^n\), \(f : U \to \mathscr{R}\) and \(x \mapsto D_i D_j f(x)\) is continuous at each \(x \in U\) for every \(i\) and \(j\). Then \(H_f (x)\) is symmetric for each \(x \in U\).
Theorem 1.39 Suppose \(U \subseteq \mathscr{R}^n\) is an open set containing \(a\) and \(b\) such that \(f : U \to \mathscr{R}\). Furthermore, suppose that \(U\) contains \((1 - t) a + tb\) for each \(t \in [0, 1]\). Suppose that both \(f\) and \(\nabla f\) are differentiable on \(U\). Then there is some \(t' \in (0, 1)\) such that \[ f(b) = f(a) + \nabla f(a) \cdot (b - a) + \frac{1}{2} (b - a)^T H_f (c') (b - a) \] where \(c' = (1 - t')a + t'b\).
Definition 1.47 (Convex Function) Let \(S \subseteq \mathscr{R}^n\) be convex. A function \(f : S \to \mathscr{R}\) is convex if, for all \(t \in [0,1]\) and for all \(x, x' \in S\), we have \(f((1 - t)x + tx') \leq (1 - t) f(x) + tf(x')\). A function \(f : S \to \mathscr{R}\) is strictly convex if, for all \(t \in (0,1)\) and for all \(x, x' \in S\) such that \(x \neq x'\), we have \(f((1 - t)x + tx') < (1 - t) f(x) + tf(x')\).
Theorem 1.40 Suppose that \(U \subseteq \mathscr{R}^n\) is open and convex and that \(f : U \to \mathscr{R}\) is differentiable. Then
- \(f\) is convex if and only if \(f(x') \geq f(x) + \nabla f(x) \cdot (x' - x)\) for each \(x, x' \in U\).
- \(f\) is strictly convex if and only if \(f(x') > f(x) + \nabla f(x) \cdot (x' - x)\) for each \(x, x' \in U\) such that \(x \neq x'\).
1.4 Correspondences
Definition 1.48 (Correspondence) A correspondence from \(X\) to \(Y\) (both Euclidean) is a map \(\varphi : X \rightrightarrows Y\) which associates with every \(x \in X\) a non-empty set \(\varphi (x) \subseteq Y\). Namely, a correspondence from \(X\) to \(Y\) is a function from \(X\) to \(\{A \subseteq Y : A \neq \varnothing\}\).
Definition 1.49 (Closed-Valued / Open-Valued / Compact-Valued / ... ) A correspondence \(\varphi\) is closed-valued if \(\varphi(x)\) is a closed subset of \(Y\) for every \(x \in X\). We similarly define correspondences which are open-valued, compact-valued, and so on.
Definition 1.50 (Graph) The graph of \(\varphi\) is the set \[ G_\varphi := \{ (x, y) \in X \times Y : y \in \varphi(x) \} . \]
Proposition 1.5 \(\varphi\) is closed / open if \(G_\varphi\) is a closed / open subset of \(X \times Y\).
Definition 1.51 (Inverses) Let \(U \subseteq Y\). The strong and weak inverses of \(U\) under \(\varphi\) are defined respectively as \[ \varphi_s^{-1} (U) := \{x \in X : \varphi(x) \subseteq U\} \\ \varphi_w^{-1} (U) := \{x \in X : \varphi(x) \cap U \neq \varnothing\} \]
Proposition 1.6
- \(\varphi_s^{-1} (U) \subseteq \varphi_w^{-1} (U)\).
- If \(\varphi\) is such that \(\lvert \varphi (x) \rvert = 1\) for every \(x\), then \(\varphi_s^{-1} (U) = \varphi_w^{-1} (U)\).
- In general, \(\varphi_w^{-1} (U) = \left(\varphi_s^{-1} \left(U^C\right)\right)^C\).
Definition 1.52 (Upper Hemi-Continuity) A correspondence \(\varphi : X \rightrightarrows Y\) is upper hemi-continuous (UHC) at \(x \in X\) if, whenever \(U\) is an open subset of \(Y\) and \(x \in \varphi_s^{-1} (U)\), \(B(x, \delta) \subseteq \varphi_s^{-1} (U)\) for some \(\delta > 0\). \(\varphi\) is UHC if it is UHC at every \(x \in X\).
Proposition 1.7 \(\varphi\) is UHC if and only if the strong inverse of any open set in \(Y\) under \(\varphi\) is open in \(X\).
Definition 1.53 (Lower Hemi-Continuity) A correspondence \(\varphi : X \rightrightarrows Y\) is lower hemi-continuous (LHC) at \(x \in X\) if, whenever \(U\) is an open subset of \(Y\) and \(x \in \varphi_w^{-1} (U)\), \(B(x, \delta) \subseteq \varphi_w^{-1} (U)\) for some \(\delta > 0\). \(\varphi\) is LHC if it is LHC at every \(x \in X\).
Proposition 1.8 \(\varphi\) is LHC if and only if the weak inverse of any open set in \(Y\) under \(\varphi\) is open in \(X\).
Definition 1.54 (Continuity) A correspondence is continuous if it is LHC and UHC.
Theorem 1.41 Let \(f : X \to Y\) be a function and let \(\varphi : X \rightrightarrows Y\) be defined as \(\varphi (x) = \{f(x)\}\) for every \(x\). Then the following are equivalent:
- \(\varphi\) is UHC.
- \(\varphi\) is LHC.
- \(f\) is continuous.
Proposition 1.9 If the function \(f : X \to Y\) is upper / lower semicontinuous, the correspondence \(\varphi : X \rightrightarrows Y\) defined by \(\varphi (x) = \{f(x)\}\) need not be upper / lower hemicontinuous.
Remark. Closedness and UHC are not nested.
Theorem 1.42 If \(\varphi : X \rightrightarrows Y\) is UHC and closed-valued, then it is also closed.
Theorem 1.43 If \(\varphi : X \rightrightarrows Y\) is closed and \(Y\) is compact, then \(\varphi\) is UHC.
Theorem 1.44 If \(\varphi\) is open, then it is LHC.
Proposition 1.10 For every \(x \in X\), every sequence \(\{x_n\}\) converging to \(x\) and every sequence \(\{y_n\}\) such that \(y_n \in \varphi(x_n)\) for every \(n\), there exists a convergent subsequence of \(\{y_n\}\) with limit in \(\varphi(x)\).
Proposition 1.11 For every \(x \in X\), every sequence \(\{x_n\}\) converging to \(x\) and every \(y \in \varphi(x)\), there exists a sequence \(\{y_n\}\) converging to \(y\) such that \(y_n \in \varphi(x_n)\) for every \(n\).
Theorem 1.45
- If \(\varphi\) satisfies proposition 1.10, then \(\varphi\) is UHC.
- If \(\varphi\) is UHC and compact-valued, then \(\varphi\) satisfies proposition 1.10.
Theorem 1.46 \(\varphi\) satisfies proposition 1.11 if and only if \(\varphi\) satisfies LHC.
Corollary 1.6 If \(\varphi : X \rightrightarrows Y\) is UHC and compact-valued and \(K \subseteq X\) is compact, then \(\displaystyle \varphi(K) := \bigcup_{x \in K} \varphi (x)\) is compact.
Remark. A subset \(S\) of a metric space is compact if and only if every sequence in \(S\) has a subsequence that converges to a point in \(S\).
Theorem 1.47 (The Maximum) Let \(T\) and \(X\) be Euclidean sets. Let \(\varphi : T \rightrightarrows X\) and \(f : X \times T \to \mathscr{R}\). For every \(t \in T\), consider the problem \[ \max_{x \in \varphi(t)} f(x, t). \] Define \(\displaystyle \mu (t) = \mathop{\mathrm{\arg\!\max}}_{x \in \varphi(t)} f(x, t)\). Suppose that \(\mu(t) \neq \varnothing\) for every \(t\).
We can define a function \(g : T \to \mathscr{R}\) as \(g(t) = f(x, t)\) for some \(x \in \mu(t)\); \(g\) is well-defined because every \(x \in \mu(t)\) attains the same maximal value.
If \(\varphi\) is compact-valued, UHC and LHC and if \(f\) is continuous, then
- \(\mu : T \rightrightarrows X\) is compact-valued and UHC, and
- \(g : T \to \mathscr{R}\) is continuous.
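A concrete illustration of Theorem 1.47 (a numerical sketch assuming numpy; the specific \(\varphi\) and \(f\) are hypothetical choices): take the constant correspondence \(\varphi(t) = [0, 1]\), which is compact-valued and continuous, and \(f(x, t) = tx - x^2\). Then \(\mu(t) = \{t/2\}\) and \(g(t) = t^2/4\) for \(t \in [0, 2]\), both continuous as the theorem predicts.

```python
import numpy as np

xs = np.linspace(0.0, 1.0, 10_001)             # the (fixed) constraint set

def mu_and_g(t):
    vals = t * xs - xs ** 2
    k = vals.argmax()
    return xs[k], vals[k]                      # approximate maximizer, value

for t in [0.5, 1.0, 1.5]:
    x_star, g_t = mu_and_g(t)
    print(t, x_star, t / 2, g_t, t ** 2 / 4)   # x_star = t/2, g(t) = t^2/4
```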
1.5 Convexity
Definition 1.55 (Convex Set) A non-empty set \(S \subseteq \mathscr{R}^m\) is convex if, for every \(x, x' \in S\) and every \(t \in [0, 1]\), \(tx + (1 - t)x' \in S\).
Definition 1.56 (Convex Combination) A convex combination of vectors \(x_1, \dots, x_n\) is a vector of the form \(\displaystyle \sum_{i = 1}^n \alpha_i x_i\) where \(\alpha_1, \dots, \alpha_n\) are non-negative numbers which add up to 1.
Theorem 1.48 \(S \subseteq \mathscr{R}^m\) is convex if and only if \(S = \tilde{S}\) where \[ \tilde{S} := \left\{ \sum_{i = 1}^n \alpha_i x_i : n \in \mathscr{N},\ x_i \in S,\ \alpha_i \geq 0,\ \forall i,\ \sum_{i = 1}^n \alpha_i = 1 \right\}; \] i.e., if and only if it contains all convex combinations of its elements.
Proposition 1.12
- Arbitrary intersections of convex sets are convex.
- \(S + T := \{ s + t : s \in S, t \in T\}\) is convex if \(S\) and \(T\) are convex.
- For every scalar \(\lambda \geq 0\), \(\lambda S := \{\lambda s : s \in S\}\) is convex if \(S\) is convex.
- The closure and interior of a convex set are convex using the Euclidean metric.
Definition 1.57 (Convex Hull) The convex hull of a set \(S \subseteq \mathscr{R}^m\) is the “smallest” convex superset of \(S\). Namely, \[ \text{co}S := \bigcap \left\{ G \subseteq \mathscr{R}^m : S \subseteq G \wedge G\ \text{is convex} \right\}. \]
Proposition 1.13 \(\text{co}S\) is convex and \(S \subseteq \text{co}S\).
Theorem 1.49 \[ \text{co}S = \tilde{S} \]
Proposition 1.14 \(S\) is convex if and only if \(\text{co}S \subseteq S\).
Theorem 1.50 (Carathéodory) Let \(S \subseteq \mathscr{R}^m\) be non-empty. If \(x \in \text{co}S\), then \(x\) can be written as a convex combination of no more than \(m + 1\) members of \(S\), i.e., there exists \(x_1, x_2, \dots, x_{m+1} \in S\) and \(\alpha_1, \alpha_2, \dots, \alpha_{m+1} \geq 0\) with \(\displaystyle \sum_{i = 1}^{m + 1} \alpha_i = 1\) such that \(\displaystyle x = \sum_{i = 1}^{m + 1} \alpha_i x_i\).
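A sketch of Carathéodory's theorem via linear programming, assuming scipy is available (the points and target are illustrative). Writing \(x\) as a convex combination of points of \(S\) is a feasibility problem with \(m + 1\) equality constraints; a basic feasible solution, which simplex-type solvers such as HiGHS typically return, has at most \(m + 1\) positive weights, matching the bound. Note that the bound here comes from the solver returning a basic solution, not from the theorem itself.

```python
import numpy as np
from scipy.optimize import linprog

S = np.array([[0, 0], [4, 0], [0, 4], [4, 4], [2, 1]], dtype=float)  # in R^2
x = np.array([2.0, 2.0])                                             # in co S

# Constraints: sum_i alpha_i * S[i] = x and sum_i alpha_i = 1, alpha >= 0.
A_eq = np.vstack([S.T, np.ones(len(S))])       # (m + 1) x 5
b_eq = np.append(x, 1.0)
res = linprog(c=np.zeros(len(S)), A_eq=A_eq, b_eq=b_eq, method="highs")

alpha = res.x
print(alpha)                                   # weights, summing to 1
print((alpha > 1e-9).sum())                    # at most m + 1 = 3 positive
```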
Proposition 1.15
- \(\displaystyle \text{co}\left( \sum_{i = 1}^n S_i \right) = \sum_{i = 1}^n \text{co}S_i\).
- If \(A \subset \mathscr{R}^m\) is open, then \(\text{co}A\) is open.
- The convex hull of a closed set in \(\mathscr{R}^m\) need not be closed. But if \(K \subset \mathscr{R}^m\) is compact, then \(\text{co}K\) is compact.
Theorem 1.51 (Shapley-Folkman) Let \(S_i \subseteq \mathscr{R}^m\) for every \(i = 1, \dots, n\), and let \(\displaystyle x \in \text{co}\sum_{i = 1}^n S_i\). Then there exists \(x_1, \dots, x_n\) such that
- \(x_i \in \text{co}S_i\) for every \(i\),
- \(\displaystyle x = \sum_{i = 1}^n x_i\), and
- \(\# \{i : x_i \notin S_i\} \leq m\).
Remark. Suppose \(x \in \mathscr{R}^m\) can be written as \(\displaystyle x = \sum_{i = 1}^k \alpha_i x_i\), where \(\alpha_1, \dots, \alpha_k \in \mathscr{R}_+\) and \(x_1, \dots, x_k \in \mathscr{R}^m\). If \(k > m\), then there exist \(\beta_1, \dots, \beta_k \in \mathscr{R}_+\) with \(\# \{i : \beta_i > 0\} \leq m\) such that \(\displaystyle x = \sum_{i = 1}^k \beta_i x_i\).
Scalars \(\alpha_i\) and \(\beta_i\) do not need to add up to 1.
Definition 1.58 (Hyperplane) Fix \(p \in \mathscr{R}^m\) and \(\alpha \in \mathscr{R}\). The hyperplane formed by \(p\) and \(\alpha\) is \[ H(p; \alpha) := \left\{ x \in \mathscr{R}^m : p \cdot x = \alpha \right\} \] where \(p\) is called the normal vector of \(H(p; \alpha)\).
Theorem 1.52 (Minkowski) Let \(S \subseteq \mathscr{R}^m\) be non-empty, convex and closed, and let \(\bar{x} \notin S\). There exist \(p \in \mathscr{R}^m \setminus \{0\}\) and \(x_0 \in S\) such that \(p \cdot \bar{x} > p \cdot x_0 \geq p \cdot x\) for every \(x \in S\).
Theorem 1.53 Suppose that \(S \subseteq \mathscr{R}^m\) is non-empty and convex, and that \(x_n\) is a sequence in \(\mathscr{R}^m\setminus \text{cl}S\). If \(x_n \to \bar{x}\), then there exists \(p \in \mathscr{R}^m \setminus \{0\}\) such that, for every \(x \in S\), \(p \cdot x \leq p \cdot \bar{x}\).
Theorem 1.54 (Supporting Hyperplane) Suppose that \(S \subseteq \mathscr{R}^m\) is non-empty and convex. If \(\bar{x} \in \partial S\), then there exists \(p \in \mathscr{R}^m \setminus \{0\}\) such that \(p \cdot \bar{x} \geq p \cdot x\) for every \(x \in S\).
Theorem 1.55 Suppose that \(S \subseteq \mathscr{R}^m\) is a non-empty and convex set, and let \(\bar{x} \notin S\). Then there exists \(p \in \mathscr{R}^m \setminus \{0\}\) such that \(p \cdot x \leq p \cdot \bar{x}\) for every \(x \in S\).
Theorem 1.56 (Separating Hyperplane) Let \(S\) and \(T\) be disjoint and convex subsets of \(\mathscr{R}^m\). There exists \(p \in \mathscr{R}^m \setminus \{0\}\) such that \(p \cdot s \leq p \cdot t\) for every \((s, t) \in S \times T\).
Definition 1.59 (Convex Function) Let \(S \subseteq \mathscr{R}^n\) be convex. A function \(f : S \to \mathscr{R}\) is convex if \(f(tx + (1 - t)y) \leq tf(x) + (1 - t)f(y)\), \(\forall x, y \in S\), \(t \in [0, 1]\). Similarly, \(f\) is said to be strictly convex if \(f(tx + (1 - t)y) < tf(x) + (1 - t)f(y)\), \(\forall x, y \in S\), \(t \in (0, 1)\).
Definition 1.60 (Concave Function) Let \(S \subseteq \mathscr{R}^n\) be convex. A function \(f : S \to \mathscr{R}\) is concave if \(f(tx + (1 - t)y) \geq tf(x) + (1 - t)f(y)\), \(\forall x, y \in S\), \(t \in [0, 1]\). Similarly, \(f\) is said to be strictly concave if \(f(tx + (1 - t)y) > tf(x) + (1 - t)f(y)\), \(\forall x, y \in S\), \(t \in (0, 1)\).
Definition 1.61 (Subgradient) Suppose \(S \subseteq \mathscr{R}^n\) is convex and \(f : S \to \mathscr{R}\). A vector \(p \in \mathscr{R}^n\) is a subgradient for \(f\) at \(x \in S\) if \[ f(y) \geq f(x) + p \cdot (y - x), \] for all \(y \in S\).
Definition 1.62 (Supergradient) Suppose \(S \subseteq \mathscr{R}^n\) is convex and \(f : S \to \mathscr{R}\). A vector \(p \in \mathscr{R}^n\) is a supergradient for \(f\) at \(x \in S\) if \[ f(y) \leq f(x) + p \cdot (y - x), \] for all \(y \in S\).
Theorem 1.57 Suppose that \(S \subseteq \mathscr{R}^n\) is convex and \(f : S \to \mathscr{R}\). If \(f\) has a subgradient at every \(x \in S\), then \(f\) is convex.
Theorem 1.58 Suppose that \(S \subseteq \mathscr{R}^n\) is convex, \(f : S \to \mathscr{R}\) is convex, and \(x \in \text{int}(S)\). Then \(f\) is continuous at \(x\).
Theorem 1.59 Suppose that \(S \subseteq \mathscr{R}^n\) is convex, \(f : S \to \mathscr{R}\) is convex, and \(x \in \text{int}(S)\). Then \(f\) has a subgradient at \(x\).
Theorem 1.60 Suppose that \(S \subseteq \mathscr{R}^n\) is convex, \(f : S \to \mathscr{R}\) is convex, \(x \in \text{int}(S)\), and \(f\) is differentiable at \(x\). Then \(\nabla f(x)\) is the unique subgradient of \(f\) at \(x\).
Theorem 1.61 Suppose that \(S \subseteq \mathscr{R}^n\) is convex and \(f : S \to \mathscr{R}\). If \(p_1\) is a subgradient of \(f\) at \(x_1\) and \(p_2\) is a subgradient of \(f\) at \(x_2\), then \((p_1 - p_2) \cdot (x_1 - x_2) \geq 0\).
Theorem 1.62 Suppose that \(S \subseteq \mathscr{R}^n\) is open and convex, and \(f : S \to \mathscr{R}\) is differentiable and convex. Then \(\left(\nabla f(x_1) - \nabla f(x_2)\right) \cdot (x_1 - x_2) \geq 0\) for every \(x_1, x_2 \in S\).
1.6 Support Functions
1.6.1 Maximization
Definition 1.63 (Barrier Cone) For any non-empty \(S \subseteq \mathscr{R}^n\), the barrier cone of \(S\) is defined as \[ b(S) := \left\{ p \in \mathscr{R}^n : \max_{x \in S} p \cdot x\ \text{is well-defined} \right\}. \]
Definition 1.64 (Support Function) For any non-empty \(S \subseteq \mathscr{R}^n\), the support function of \(S\) is a function \(\sigma_S : b(S) \to \mathscr{R}\) such that \[ \sigma_S (p) := \max_{x \in S} p \cdot x. \]
Remark (Cone). If \(p \in b(S)\) and \(\lambda \in \mathscr{R}_+\), then \(\lambda p \in b(S)\) as well. Hence \(b(S)\) is a cone and, in particular, \(0 \in b(S)\).
Proposition 1.16 \(b(S)\) need not be convex even if \(S\) is.
Theorem 1.63 Let \(S \subseteq \mathscr{R}^n\) be non-empty.
- If \(x_1\) solves \(\displaystyle \max_{x \in S} p_1 \cdot x\) and \(x_2\) solves \(\displaystyle \max_{x \in S} p_2 \cdot x\), then \((x_1 - x_2)\cdot(p_1 - p_2) \geq 0\). This is known as monotonicity of solutions.
- \(\sigma_S(tp) = t\sigma_S(p)\) for every \(p \in b(S)\) and \(t \geq 0\). Namely, homogeneity of degree 1.
- If \(b(S)\) is convex, then \(\sigma_S\) is convex.
- Suppose \(S \subseteq \mathscr{R}^n\) is closed and convex, and \(p \in b(S)\). \(x_p\) solves \(\displaystyle \max_{x \in S} p \cdot x\) if and only if \(x_p\) is a subgradient of \(\sigma_S\) at \(p\).
- Suppose \(S \subseteq \mathscr{R}^n\) is closed and convex, \(p \in \text{int}(b(S))\) and \(\sigma_S\) is differentiable at \(p\). Then \(x_p\) solves \(\displaystyle \max_{x \in S} p \cdot x\) if and only if \(x_p = \nabla \sigma_S (p)\). Also known as Hotelling’s Lemma in profit maximization.
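For a polytope \(S = \text{co}\{v_1, \dots, v_k\}\), the support function is simply \(\sigma_S(p) = \max_i p \cdot v_i\), and the gradient property above can be checked by finite differences. A sketch assuming numpy (the vertices and price vector are illustrative):

```python
import numpy as np

V = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0], [2.0, 1.0]])  # vertices

def sigma(p):
    return (V @ p).max()           # support function of the polytope

def maximizer(p):
    return V[(V @ p).argmax()]     # a solution x_p of max_{x in S} p.x

p = np.array([1.0, 0.5])
x_p = maximizer(p)                 # unique here, so sigma is differentiable

h = 1e-7
grad = np.array([(sigma(p + h * e) - sigma(p - h * e)) / (2 * h)
                 for e in np.eye(2)])
print(grad, x_p)                   # both approximately (2, 1): Hotelling

print(np.isclose(sigma(3.0 * p), 3.0 * sigma(p)))  # degree-1 homogeneity
```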
1.6.2 Minimization
Definition 1.65 (Barrier Cone) For any non-empty \(S \subseteq \mathscr{R}^n\), the barrier cone of \(S\) is defined as \[ b^-(S) := \left\{ p \in \mathscr{R}^n : \min_{x \in S} p \cdot x\ \text{is well-defined} \right\}. \]
Definition 1.66 (Support Function) For any non-empty \(S \subseteq \mathscr{R}^n\), the support function of \(S\) is a function \(\tau_S : b^-(S) \to \mathscr{R}\) such that \[ \tau_S (p) := \min_{x \in S} p \cdot x. \]
Remark (Cone). Note that \(b^-(S) = -b(S) = \left\{ p \in \mathscr{R}^n : -p \in b(S) \right\}\). Furthermore, \(\tau_S (p) = -\sigma_S (-p)\).
Theorem 1.64 Let \(S \subseteq \mathscr{R}^n\) be non-empty.
- If \(x_1\) solves \(\displaystyle \min_{x \in S} p_1 \cdot x\) and \(x_2\) solves \(\displaystyle \min_{x \in S} p_2 \cdot x\), then \((x_1 - x_2)\cdot(p_1 - p_2) \leq 0\). This is known as monotonicity of solutions.
- \(\tau_S(tp) = t\tau_S(p)\) for every \(p \in b^-(S)\) and \(t \geq 0\). Namely, homogeneity of degree 1.
- If \(b^-(S)\) is convex, then \(\tau_S\) is concave.
- Suppose \(S \subseteq \mathscr{R}^n\) is closed and convex, and \(p \in b^-(S)\). \(x_p\) solves \(\displaystyle \min_{x \in S} p \cdot x\) if and only if \(x_p\) is a supergradient of \(\tau_S\) at \(p\).
- Suppose \(S \subseteq \mathscr{R}^n\) is closed and convex, \(p \in \text{int}(b^-(S))\) and \(\tau_S\) is differentiable at \(p\). Then \(x_p\) solves \(\displaystyle \min_{x \in S} p \cdot x\) if and only if \(x_p = \nabla \tau_S (p)\). Also known as Shephard’s Lemma in cost minimization.
1.7 Nonlinear Programming
1.7.1 Minimization Problems
Definition 1.67 (Nonlinear Programming Problem) Let \(f : \mathscr{R}^n \to \mathscr{R}\) and \(g_1, \dots, g_m : \mathscr{R}^n \to \mathscr{R}\) be differentiable functions. The NLP problem is
\[ \begin{align} \min_{x \in \mathscr{R}^n} \quad & f(x) \\ \text{subject to} \quad & g_1(x) \geq 0, \\ & \quad \vdots \\ & g_m(x) \geq 0. \end{align} \]
1.7.1.1 Kuhn-Tucker Necessity Conditions
Definition 1.68 (Kuhn-Tucker Pair for NLP) A pair \(\left(\bar{x}, \bar{\lambda}\right) \in \mathscr{R}^{n + m}\) is a Kuhn-Tucker pair for NLP if
- \(g_i\left(\bar{x}\right) \geq 0\) for all \(i = 1, \dots, m\),
- \(\bar{\lambda}_i \geq 0\) for all \(i = 1, \dots, m\),
- \(\displaystyle \nabla f\left(\bar{x}\right) = \sum_{i = 1}^m \nabla g_i\left(\bar{x}\right)\bar{\lambda}_i\), and
- \(\bar{\lambda}_i g_i\left(\bar{x}\right) = 0\) for all \(i = 1, \dots, m\).
We will say that \(\bar{x}\) satisfies the Kuhn-Tucker conditions for NLP if there exists some \(\bar{\lambda}\) such that \(\left( \bar{x}, \bar{\lambda} \right)\) is a Kuhn-Tucker pair for NLP.
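A concrete Kuhn-Tucker pair can be verified directly against the four conditions of Definition 1.68. A sketch assuming numpy (the problem is a hypothetical illustration): for \(\min f(x) = x_1^2 + x_2^2\) subject to \(g(x) = x_1 + x_2 - 1 \geq 0\), the pair \(\bar{x} = (1/2, 1/2)\), \(\bar{\lambda} = 1\) works.

```python
import numpy as np

x_bar = np.array([0.5, 0.5])
lam_bar = 1.0

g = lambda x: x[0] + x[1] - 1.0
grad_f = lambda x: 2.0 * x                 # gradient of f(x) = x1^2 + x2^2
grad_g = np.array([1.0, 1.0])              # gradient of g (constant)

print(g(x_bar) >= 0)                                   # primal feasibility
print(lam_bar >= 0)                                    # dual feasibility
print(np.allclose(grad_f(x_bar), lam_bar * grad_g))    # stationarity
print(np.isclose(lam_bar * g(x_bar), 0.0))             # compl. slackness
```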
Definition 1.69 (Active Constraints) For any \(x \in \mathscr{R}^n\), let \(I(x) = \left\{ i = 1, \dots, m : g_i(x) = 0 \right\}\). Hence \(I(x)\) is the set of active constraints at \(x\).
Lemma 1.4 Suppose that \(\left( \bar{x}, \bar{\lambda} \right)\) is a Kuhn-Tucker pair for NLP. If \(I\left(\bar{x}\right) = \varnothing\), then \(\nabla f\left(\bar{x}\right) = 0\). If \(I\left(\bar{x}\right) \neq \varnothing\), then \(\displaystyle \nabla f\left(\bar{x}\right) = \sum_{i \in I\left(\bar{x}\right)} \nabla g_i\left(\bar{x}\right)\bar{\lambda}_i\).
Lemma 1.5 Suppose that
- \(g_i\left(\bar{x}\right) \geq 0\) for all \(i = 1, \dots, m\), and
- \(\left\{ \bar{\mu}_i \right\}_{i \in I\left(\bar{x}\right)}\) is a collection of non-negative numbers satisfying \(\displaystyle \nabla f\left(\bar{x}\right) = \sum_{i \in I\left(\bar{x}\right)} \nabla g_i\left(\bar{x}\right)\bar{\mu}_i\).
If \(\bar{\lambda}_i = \bar{\mu}_i\) for all \(i \in I\left(\bar{x}\right)\) and \(\bar{\lambda}_i = 0\) for all \(i \notin I\left(\bar{x}\right)\), then \(\left( \bar{x}, \bar{\lambda} \right)\) is a Kuhn-Tucker pair for NLP.
Definition 1.70 (Linear Programming Problem) For \(a_1, \dots, a_m, c \in \mathscr{R}^n\) and \(b_1, \dots, b_m \in \mathscr{R}\), consider the following LP problem:
\[ \begin{align} \min_{x \in \mathscr{R}^n} \quad & c \cdot x \\ \text{subject to} \quad & a_1 \cdot x \geq b_1, \\ & \qquad \vdots \\ & a_m \cdot x \geq b_m. \end{align} \] LP is a special case of NLP where \(f(x) = c \cdot x\) and \(g_i(x) = a_i \cdot x - b_i\).
Definition 1.71 (Kuhn-Tucker Pair for LP) A pair \(\left(\bar{x}, \bar{\lambda}\right) \in \mathscr{R}^{n + m}\) is a Kuhn-Tucker pair for LP if
- \(a_i \cdot \bar{x} \geq b_i\) for all \(i = 1, \dots, m\),
- \(\bar{\lambda}_i \geq 0\) for all \(i = 1, \dots, m\),
- \(\displaystyle c = \sum_{i = 1}^m a_i\bar{\lambda}_i\), and
- \(\bar{\lambda}_i \left( a_i \cdot \bar{x} - b_i \right) = 0\) for all \(i = 1, \dots, m\).
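LP problems of this form can be solved numerically and the Kuhn-Tucker conditions checked against the solver's dual values. A sketch assuming a recent scipy: `linprog` uses the \(A_{ub} x \leq b_{ub}\) convention, so each \(a_i \cdot x \geq b_i\) is passed as \(-a_i \cdot x \leq -b_i\); the HiGHS backend exposes duals as `res.ineqlin.marginals`, and the sign convention used below (the negatives of the marginals act as \(\bar{\lambda}\)) is an assumption worth checking against the installed version.

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([3.0, 1.0])
A = np.array([[1.0, 1.0],      # x1 + x2 >= 2
              [1.0, 0.0],      # x1      >= 0
              [0.0, 1.0]])     #      x2 >= 0
b = np.array([2.0, 0.0, 0.0])

res = linprog(c, A_ub=-A, b_ub=-b, bounds=[(None, None)] * 2, method="highs")
lam = -res.ineqlin.marginals           # candidate multipliers lambda_bar
print(res.x)                           # x_bar = (0, 2)
print(lam)                             # (1, 2, 0), all >= 0
print(np.allclose(c, A.T @ lam))       # c = sum_i a_i * lambda_i
print(np.allclose(lam * (A @ res.x - b), 0.0))  # complementary slackness
```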
Theorem 1.65 Suppose that \(A = \left[ a_1 \mid \cdots \mid a_m \right]\) and \(a_i \in \mathscr{R}^n\) for all \(i = 1, \dots, m\). Then the set
\[ \text{pos}A := \left\{ Au : u \in \mathscr{R}^m_+ \right\} \]
is a non-empty, closed and convex cone.
Theorem 1.66 If \(\bar{x}\) solves the LP problem, then there exists some \(\bar{\lambda}\) such that \(\left(\bar{x}, \bar{\lambda}\right) \in \mathscr{R}^{n + m}\) is a Kuhn-Tucker pair for LP.
Definition 1.72 (General Constraint Qualification) A solution \(\bar{x}\) to the NLP problem satisfies the general constraint qualification (GCQ) condition if it solves the following LP problem:
\[ \begin{align} \min_{x \in \mathscr{R}^n} \quad & \nabla f\left(\bar{x}\right) \cdot x \\ \text{subject to} \quad & g_i\left(\bar{x}\right) + \nabla g_i\left(\bar{x}\right) \cdot \left(x - \bar{x}\right) \geq 0,\ \forall i = 1, \dots, m. \end{align} \]
Theorem 1.67 If \(\bar{x}\) solves the NLP problem and satisfies the GCQ condition, then it satisfies the KT conditions for NLP.
Definition 1.73 (Cottle Constraint Qualification) A solution \(\bar{x}\) to the NLP problem satisfies the Cottle constraint qualification (CCQ) condition if there exists some \(z \in \mathscr{R}^n\) such that \(\nabla g_i \left(\bar{x}\right) \cdot z > 0\) for all \(i \in I\left(\bar{x}\right)\).
Theorem 1.68 Suppose \(\bar{x}\) solves the NLP problem. If \(\bar{x}\) satisfies the CCQ condition, then it satisfies the GCQ condition.
Definition 1.74 (Linear Independence Constraint Qualification) A solution \(\bar{x}\) to the NLP problem satisfies the linear independence constraint qualification (LICQ) condition if the collection \(\left\{ \nabla g_i \left(\bar{x}\right) \right\}_{i \in I\left(\bar{x}\right)}\) is linearly independent.
Theorem 1.69 Suppose \(\bar{x}\) solves the NLP problem. If \(\bar{x}\) satisfies the LICQ condition, then it satisfies the CCQ condition.
Corollary 1.7 Suppose \(\bar{x}\) solves the NLP problem. If \(\bar{x}\) satisfies the LICQ condition, then it satisfies the KT conditions for NLP.
1.7.1.2 Kuhn-Tucker Sufficiency Conditions
Theorem 1.70 Let \(f : \mathscr{R}^n \to \mathscr{R}\) be differentiable. Then
- \(f\) is convex if and only if \(f(y) - f(x) \geq \nabla f(x) \cdot (y - x)\), and
- \(f\) is concave if and only if \(f(y) - f(x) \leq \nabla f(x) \cdot (y - x)\),
for every \(x\) and \(y\).
Theorem 1.71 Suppose \(\bar{x} \in \mathscr{R}^n\) satisfies the KT conditions for NLP. If \(f\) is convex and each \(g_i\) is concave, then \(\bar{x}\) solves the NLP problem.
Definition 1.75 (Quasi-convex / Quasi-concave) A function \(f : \mathscr{R}^n \to \mathscr{R}\) is quasi-convex / quasi-concave if the lower-contour set \(\left\{ x \in \mathscr{R}^n : f(x) \leq \alpha \right\}\) / the upper-contour set \(\left\{ x \in \mathscr{R}^n : f(x) \geq \alpha \right\}\) is convex for every \(\alpha \in \mathscr{R}\).
Theorem 1.72 Let \(f : \mathscr{R}^n \to \mathscr{R}\) be differentiable. Then
- \(f\) is quasi-concave if and only if \(f(y) - f(x) \geq 0 \implies \nabla f(x) \cdot (y - x) \geq 0\), and
- \(f\) is quasi-convex if and only if \(f(y) - f(x) \leq 0 \implies \nabla f(x) \cdot (y - x) \leq 0\),
for every \(x\) and \(y\).
Definition 1.76 (Pseudo-convex / Pseudo-concave) A differentiable function \(f : \mathscr{R}^n \to \mathscr{R}\) is pseudo-convex if, for every \(x\) and \(y\), \(\nabla f(x) \cdot (y - x) \geq 0\) implies \(f(y) - f(x) \geq 0\). A differentiable function \(f : \mathscr{R}^n \to \mathscr{R}\) is pseudo-concave if, for every \(x\) and \(y\), \(\nabla f(x) \cdot (y - x) \leq 0\) implies \(f(y) - f(x) \leq 0\).
Theorem 1.73 Suppose \(\left(\bar{x}, \bar{\lambda}\right)\) is a KT pair for NLP. If \(f\) is pseudo-convex and every \(g_i\) is quasi-concave, then \(\bar{x}\) solves the NLP problem.
Theorem 1.74 Suppose \(\bar{x}\) solves the NLP problem. If each \(g_i\) is concave and if there exists \(x^*\) such that \(g_i(x^*) > 0\) for all \(i\), then \(\bar{x}\) satisfies the CCQ condition.
Remark. For an NLP problem with concave constraints, the existence of an \(x^*\) such that \(g_i(x^*) > 0\) for all \(i\) is called the Slater constraint qualification condition.
1.7.1.3 Comparative Statics
Consider the problem \(\displaystyle \min_{x \in \mathscr{R}^n} F(x, \alpha)\) subject to \(G_i (x, \alpha) \geq 0\) for all \(i = 1, \dots, m\), where \(\alpha \in \mathscr{R}^p\) and all functions are differentiable. Call this problem \(P_\alpha\). Let \(v(\alpha)\) be the value of \(P_\alpha\).
Definition 1.77 (Regular Solution) A vector \(\bar{x} \in \mathscr{R}^n\) is a regular solution to \(P_\bar{\alpha}\) if there exist \(\varepsilon > 0\) and \(\hat{x} : B_\varepsilon (\bar{\alpha}) \to \mathscr{R}^n\) such that
- \(\bar{x}\) satisfies the KT conditions,
- \(\hat{x} (\alpha)\) solves the problem \(P_\alpha\) for every \(\alpha \in B_\varepsilon (\bar{\alpha})\) and \(\hat{x}(\bar{\alpha}) = \bar{x}\),
- \(I(\hat{x}(\alpha)) = I(\bar{x})\) for all \(\alpha \in B_\varepsilon (\bar{\alpha})\), and
- \(\hat{x}\) is differentiable at \(\bar{\alpha}\).
Theorem 1.75 Suppose that \(\bar{x}\) is a regular solution to \(P_\bar{\alpha}\) with associated multiplier vector \(\bar{\lambda}\). Then
\[ \nabla v (\bar{\alpha}) = \nabla_\alpha F (\bar{x}, \bar{\alpha}) - \sum_{i = 1}^m \bar{\lambda}_i \nabla_\alpha G_i (\bar{x}, \bar{\alpha}). \]
Remark. Suppose that \(\bar{x}\) is a regular solution to \(P_\bar{\alpha}\) with associated multiplier vector \(\bar{\lambda}\). If \(m = 0\), then \(\nabla v(\bar{\alpha}) = \nabla_\alpha F(\bar{x}, \bar{\alpha})\). If \(p = m\), \(G_i (x, \alpha) = g_i(x) - \alpha_i\), and \(F(x, \alpha) = f(x)\), then \(\nabla v(\bar{\alpha}) = \bar{\lambda}\).
1.7.2 Maximization Problems
Consider the problem \(\max f(x)\) subject to \(g_i (x) \leq 0\) for all \(i = 1, \dots, m\), where all functions are differentiable. Let \(I(x) = \left\{ i : g_i(x) = 0 \right\}\).
Theorem 1.76 If \(\bar{x}\) solves the maximization problem above and \(\displaystyle \left\{\nabla g_i (\bar{x})\right\}_{i \in I(\bar{x})}\) is a linearly independent set, then there exists \(\bar{\lambda} \in \mathscr{R}^m_+\) such that \(\displaystyle \nabla f(\bar{x}) = \sum_{i=1}^m \bar{\lambda}_i \nabla g_i (\bar{x})\) and \(\bar{\lambda}_i g_i (\bar{x}) = 0\) for all \(i\).
Theorem 1.77 Suppose that
- \(g_i(\bar{x}) \leq 0\) for all \(i = 1, \dots, m\),
- there exists a set \(\displaystyle \left\{\bar{\lambda}_i\right\}_{i \in I(\bar{x})}\) of positive numbers such that \(\displaystyle \nabla f(\bar{x}) = \sum_{i\in I(\bar{x})} \bar{\lambda}_i \nabla g_i (\bar{x})\),
- \(f\) is pseudo-concave, and
- each \(g_i\) is quasi-convex.
Then \(\bar{x}\) solves the maximization problem above.
1.7.3 Consumer Theory
1.7.3.1 Utility Maximization
Consider the following utility maximization problem
\[ \begin{align} \max_{x \in \mathscr{R}^n} \quad & u(x) \\ \text{subject to} \quad & p \cdot x \leq y, \\ & x \geq 0. \end{align} \]
The KT conditions are
\[ \begin{array}{ccc} \nabla u(x) \leq \lambda p & x \geq 0 & \left[ \nabla u(x) - \lambda p \right] \cdot x = 0 \\ p \cdot x \leq y & \lambda \geq 0 & \left( p \cdot x - y \right) \lambda = 0 \end{array} \]
Remark. \(\lambda\) is interpreted as the marginal utility of income. Call the solutions \(x(p,y)\) the demand functions and \(v(p,y) := u(x(p,y))\) the indirect utility function.
Remark (Indirect Utility Function Properties).
- \(v\) is homogeneous of degree zero in \((p, y)\).
- \(v\) is non-decreasing in \(y\), and non-increasing in \(p\).
- \(v\) is quasi-convex in \((p,y)\).
1.7.3.2 Expenditure Minimization
Consider the following expenditure minimization problem
\[ \begin{align} \min_{x \in \mathscr{R}^n} \quad & p \cdot x \\ \text{subject to} \quad & u(x) \geq \bar{u}, \\ & x \geq 0. \end{align} \]
Remark. Let \(h(p, \bar{u})\) be the solution and \(e(p,\bar{u}) := p \cdot h(p, \bar{u})\). The expenditure function \(e\) is mathematically identical to the cost function.
Remark (Expenditure Function Properties).
- \(e(\cdot, \bar{u})\) is 1-homogeneous.
- \(e(\cdot, \bar{u})\) is non-decreasing.
- \(e(\cdot, \bar{u})\) is concave.
- By Shephard’s Lemma, \(\displaystyle \frac{\partial e}{\partial p_i} = h_i (p, \bar{u})\).
Theorem 1.78 (Roy's Identity) \[ x_i(p,y) = -\frac{\displaystyle \frac{\partial v}{\partial p_i}}{\displaystyle \frac{\partial v}{\partial y}}. \]
Remark (Relationships between utility maximization and expenditure minimization).
- \(h(p,v(p,y)) = x(p,y)\)
- \(x(p,e(p,\bar{u})) = h(p,\bar{u})\)
- \(e(p,v(p,y)) = y\)
- \(v(p,e(p,\bar{u})) = \bar{u}\)
Theorem 1.79 (Slutsky) \[ \frac{\partial x_i}{\partial p_j} (p,y) = \frac{\partial h_i}{\partial p_j} (p,v(p,y)) - x_j \frac{\partial x_i}{\partial y} (p,y),\ \forall i, j. \]
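Roy's identity and the Slutsky equation can be verified numerically for a concrete utility function. A sketch assuming numpy; the Cobb-Douglas specification \(u(x) = x_1^a x_2^{1-a}\) and all closed forms below are a hypothetical illustration, not part of the notes: demand is \(x_1 = ay/p_1\), indirect utility is \(v = (a/p_1)^a ((1-a)/p_2)^{1-a} y\), and expenditure is \(e(p, u) = u (p_1/a)^a (p_2/(1-a))^{1-a}\).

```python
import numpy as np

a = 0.3

def x1(p1, p2, y):            # Marshallian demand for good 1
    return a * y / p1

def v(p1, p2, y):             # indirect utility
    return (a / p1) ** a * ((1 - a) / p2) ** (1 - a) * y

def e(p1, p2, u):             # expenditure function
    return u * (p1 / a) ** a * (p2 / (1 - a)) ** (1 - a)

def h1(p1, p2, u):            # Hicksian demand via Shephard's Lemma: de/dp1
    return a * e(p1, p2, u) / p1

p1, p2, y, step = 2.0, 3.0, 10.0, 1e-6
u = v(p1, p2, y)

# Roy's identity: x1 = -(dv/dp1)/(dv/dy).
dv_dp1 = (v(p1 + step, p2, y) - v(p1 - step, p2, y)) / (2 * step)
dv_dy = (v(p1, p2, y + step) - v(p1, p2, y - step)) / (2 * step)
print(-dv_dp1 / dv_dy, x1(p1, p2, y))           # both 1.5

# Slutsky: dx1/dp2 = dh1/dp2 - x2 * dx1/dy.
x2 = (1 - a) * y / p2
dx1_dp2 = 0.0                                   # zero cross-price effect here
dh1_dp2 = (h1(p1, p2 + step, u) - h1(p1, p2 - step, u)) / (2 * step)
dx1_dy = a / p1
print(dx1_dp2, dh1_dp2 - x2 * dx1_dy)           # both approximately 0
```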
1.8 Preferences and Utility Representation
1.8.1 Binary Relations
Definition 1.78 (Binary Relation) Let \(X\) be a set and \(X^2 = \left\{(x, y) : x, y \in X\right\}\). A binary relation on \(X\) is any subset of \(X^2\). We will typically denote binary relations by \(P\), \(R\), \(I\) or \(\succ\), \(\succsim\), \(\sim\).
Let \(P\) be a binary relation. Instead of \((x, y) \in P\), we write \(xPy\). Similarly, instead of \((x, y) \notin P\), we write \(x {\not\negmedspace P}y\).
Definition 1.79 (Strict Preference) Let \(P\) be a binary relation. Suppose we would like \(xPy\) to mean \(x\) is strictly better than \(y\). Here are some properties which we might like to impose on \(P\).
- Irreflexivity: \(xPx\) for no \(x \in X\).
- Asymmetry: for any \(x, y \in X\), if \(xPy\), then \(y {\not\negmedspace P}x\).
- Acyclicity: for every \(n\) and for every \(x_1, \dots, x_n \in X\), if \(x_i P x_{i+1}\) for every \(i < n\), then \(x_n {\not\negmedspace P}x_1\).
- Transitivity: if \(xPy\) and \(yPz\), then \(xPz\).
- Negative transitivity: if \(xPy\) and \(z \in X\), then either \(xPz\) or \(zPy\).
- Connectedness: if \(x \neq y\), then either \(xPy\) or \(yPx\).
Remark.
- \(P\) is a (strict) linear order if it is connected, asymmetric and negatively transitive.
- \(P\) is a (strict) weak order if it is asymmetric and negatively transitive.
- \(P\) is a (strict) partial order if it is irreflexive and transitive.
Definition 1.80 (Weak Preference) Let \(R\) be a binary relation. Suppose we would like \(xRy\) to mean \(x\) is at least as good as \(y\). Here are some properties which we might like to impose on \(R\).
- Reflexivity: \(xRx\) for all \(x \in X\).
- Completeness: for all \(x, y \in X\), \(xRy\) or \(yRx\).
- Transitivity: if \(xRy\) and \(yRz\), then \(xRz\).
- Antisymmetry: if \(xRy\) and \(yRx\), then \(x = y\).
Remark.
- \(R\) is a (weak) linear order if it is antisymmetric, complete and transitive.
- \(R\) is a (weak) weak order if it is complete and transitive.
- \(R\) is a (weak) partial order if it is reflexive and transitive.
Definition 1.81 (Indifference) Let \(I\) be a binary relation. Suppose we would like \(xIy\) to mean that \(x\) and \(y\) are equally good. Here are some properties which we might like to impose on \(I\).
- Reflexivity: \(xIx\) for all \(x \in X\).
- Symmetry: if \(xIy\), then \(yIx\).
- Transitivity: if \(xIy\) and \(yIz\), then \(xIz\).
Remark. Call \(I\) an equivalence relation if it is reflexive, symmetric and transitive.
Proposition 1.17 Suppose that \(P\) is a (strict) weak order and that binary relations \(R\) and \(I\) are defined, using \(P\), as follows
\[ \begin{align} xRy && \iff && y {\not\negmedspace P}x \\ xIy && \iff && x {\not\negmedspace P}y && \land && y {\not\negmedspace P}x \end{align} \]
Then \(R\) is a (weak) weak order and \(I\) is an equivalence relation. If, in particular, \(P\) is a (strict) linear order, then \(R\) is a (weak) linear order.
Proposition 1.18 Suppose that \(R\) is a (weak) weak order and that binary relations \(P\) and \(I\) are defined, using \(R\), as follows
\[ \begin{align} xPy && \iff && xRy && \land && y {\not\negmedspace R}x \\ xIy && \iff && xRy && \land && yRx \end{align} \]
Then \(P\) is a (strict) weak order and \(I\) is an equivalence relation. If, in particular, \(R\) is a (weak) linear order, then \(P\) is a (strict) linear order.
1.8.2 Maximization
Definition 1.82 (Menu) Let \(2^X = \left\{ A \subseteq X : A \neq \varnothing\right\}\). Any member of \(2^X\) is a menu.
Remark. For any binary relation \(T\) and any menu \(A\), let \[ \begin{align} m(A,T) &= \left\{ x \in A : yTx,\ \text{for no}\ y \in A \right\}, \\ M(A,T) &= \left\{ x \in A : xTy,\ \forall y \in A \right\}. \end{align} \] In general, these sets may be empty and they need not coincide.
Proposition 1.19 Suppose \(X\) is finite. \(P\) is acyclic if and only if \(m(A,P) \neq \varnothing\) for all \(A\).
Proposition 1.20 Suppose \(R\) is a (weak) weak order and define \(P\) using \(R\) as follows: \[ xPy \iff xRy \land y {\not\negmedspace R}x. \] Then \(m(A,P) = M(A,R)\) for all \(A\). If \(R\) is a (weak) linear order, then \(M(A,R)\) is a singleton.
Theorem 1.80 Let \(P\) be a binary relation on a countable set \(X\). \(P\) admits a utility representation (i.e., there exists \(u : X \to \mathscr{R}\) such that \(xPy\) if and only if \(u(x) > u(y)\)) if and only if \(P\) is a weak order.
Remark. In general, there exist weak orders (even linear orders) which do not have utility representations.
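On a finite set, the representation in Theorem 1.80 can be built explicitly: if \(P\) is a weak order, then \(u(x) = \#\{y : xPy\}\) represents it. A sketch in plain Python (the relation below is a hypothetical example):

```python
X = ["a", "b", "c", "d"]
# A weak order: a and b are tied at the top, both beat c, and c beats d.
P = {("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d")}

# Count how many alternatives each x strictly beats.
u = {x: sum((x, y) in P for y in X) for x in X}
print(u)   # {'a': 2, 'b': 2, 'c': 1, 'd': 0}

# Check the representation: x P y iff u(x) > u(y), on every ordered pair.
ok = all(((x, y) in P) == (u[x] > u[y]) for x in X for y in X)
print(ok)  # True
```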
Theorem 1.81 Let \(P\) be a linear order on an arbitrary set \(X\). \(P\) admits a utility representation if and only if \(X\) contains a countable \(P\)-dense subset.
Theorem 1.82 A binary relation \(P\) on an arbitrary set \(X\) admits a utility representation if and only if \(P\) is a weak order on \(X\) and \(X^*\) contains a countable \(P^*\)-dense subset.
Lemma 1.6 Given \(P\), say that the pair \((a,b)\) is a gap if \(aPb\) but there exists no \(c\) such that \(aPc\) and \(cPb\). Let \(G_1 = \left\{ a : (a,b)\ \text{is a gap for some}\ b \right\}\), \(G_2 = \left\{ b : (a,b)\ \text{is a gap for some}\ a \right\}\) and \(G = G_1 \cup G_2\).
Suppose \(P\) is a linear order on an arbitrary set \(X\). \(G\) is countable if one of the following two conditions holds:
- \(X\) contains a countable \(P\)-dense subset.
- \(P\) admits a utility representation.
1.8.3 A Topological Approach
Corollary 1.8 Let \(R\) be a complete and transitive binary relation (i.e., a weak order) on \(X \subseteq \mathscr{R}^n\). Define, as usual, \(xPy\) if and only if \(xRy\) and \(y {\not\negmedspace R}x\). Recall that such \(P\) is asymmetric and negatively transitive. Endow \(X\) with the Euclidean metric. We say that \(R\) admits a continuous utility representation if there exists a continuous function \(u : X \to \mathscr{R}\) such that \(xRy\) if and only if \(u(x) \geq u(y)\).
Theorem 1.83 Let \(X \subseteq \mathscr{R}^n\) be convex and \(R\) be a binary relation on \(X\). \(R\) is a continuous weak order (i.e., a weak order such that, for every \(x \in X\), the sets \(\{y \in X : yRx\}\) and \(\{y \in X : xRy\}\) are closed) if and only if it admits a continuous utility representation.
Proposition 1.21 Let \(e = (1, \dots, 1) \in \mathscr{R}^n\). If \(R\) is a strictly monotone weak order and \(\alpha\) and \(\beta\) are scalars, then \[ \alpha \geq \beta \iff (\alpha e) R (\beta e). \]
Theorem 1.84 Suppose \(R\) is a strictly monotone and continuous weak order on \(X = \mathscr{R}^n\). Then \(R\) admits a continuous utility representation.