Integration by Substitution

Current UK exam textbooks pass over proofs and mathematical discussions in a hurry to show the ‘how to’ of exam questions.

Integration by substitution is a little more than just backwards chain-rule and deserves a fuller treatment.

Try this,


y=\displaystyle \int \textnormal{f}(x) \ \textnormal{d}x



Suppose that there exists a function g, of another variable u, such that x=\textnormal{g}(u), and let \textnormal{f}(x)=\textnormal{f}(\textnormal{g}(u))=\textnormal{F}(u), so that

\dfrac{\textnormal{d}y}{\textnormal{d}x}=\textnormal{f}(x)=\textnormal{F}(u)

Now, by the chain rule,

\dfrac{\textnormal{d}y}{\textnormal{d}u}=\dfrac{\textnormal{d}y}{\textnormal{d}x}\times \dfrac{\textnormal{d}x}{\textnormal{d}u}=\textnormal{F}(u)\dfrac{\textnormal{d}x}{\textnormal{d}u}


Integrating both sides with respect to u,

y=\displaystyle \int \dfrac{\textnormal{d}y}{\textnormal{d}u} \ \textnormal{d}u=\displaystyle \int \textnormal{F}(u)\dfrac{\textnormal{d}x}{\textnormal{d}u} \ \textnormal{d}u


Hence,

y=\displaystyle \int \textnormal{f}(x) \ \textnormal{d}x=\displaystyle \int \textnormal{F}(u)\dfrac{\textnormal{d}x}{\textnormal{d}u} \ \textnormal{d}u
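The result can be checked numerically. The sketch below is an illustration only: the integrand \sqrt{1-x^{2}}, the substitution x=\sin u and the helper name midpoint_rule are assumptions chosen for the example, and it uses a definite integral, so the limits 0 \leqslant x \leqslant 1 are changed to 0 \leqslant u \leqslant \pi/2.

```python
import math


def midpoint_rule(func, lower, upper, steps=100_000):
    """Approximate the integral of func over [lower, upper] by the midpoint rule."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


def f(x):
    # Example integrand f(x) = sqrt(1 - x^2), a quarter of the unit circle.
    return math.sqrt(1.0 - x * x)


def transformed(u):
    # With x = g(u) = sin(u): F(u) = f(g(u)) and dx/du = cos(u).
    return f(math.sin(u)) * math.cos(u)


direct = midpoint_rule(f, 0.0, 1.0)                          # integral of f(x) dx
substituted = midpoint_rule(transformed, 0.0, math.pi / 2)   # integral of F(u) dx/du du

print("direct      :", direct)
print("substituted :", substituted)
print("pi / 4      :", math.pi / 4)
```

Both estimates should agree with each other and with the quarter-circle area \pi/4, up to the error of the numerical rule.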

Differentiation From First Principles

The gradient of a smooth curve, \textnormal{f}(x), at a point x is the gradient of the tangent to the curve at the point x. Point P is on the curve and Q is a neighbouring point whose x value is displaced a small quantity, \delta x.

The idea behind differentiation is that as \delta x becomes very small, the gradient of PQ tends towards the gradient of the curve. In the limit as \delta x becomes infinitesimally close to zero, the gradient of PQ becomes the gradient of the curve.

We write:

\textnormal{gradient f}(x)=\dfrac{\textnormal{d}y}{\textnormal{d}x}=\lim_{\delta x \rightarrow 0}\left(\dfrac{\delta y}{\delta x}\right)=\lim_{\delta x \rightarrow 0}\left(\dfrac{\textnormal{f}(x+\delta x)-\textnormal{f}(x)}{\delta x}\right)

There is a fair bit of analytic work missing here (it is covered in higher education) to make these ideas sound.
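The limiting process can be watched numerically. The sketch below assumes the example curve \textnormal{f}(x)=x^{3} and the point x=2, neither of which comes from the text above, and the function name gradient_of_chord is hypothetical. It prints the gradient of the chord PQ for a shrinking \delta x.

```python
def f(x):
    # Example curve, chosen for illustration: f(x) = x^3.
    return x ** 3


def gradient_of_chord(func, x, delta_x):
    """Gradient of the chord PQ, where P is at x and Q is at x + delta_x."""
    return (func(x + delta_x) - func(x)) / delta_x


x = 2.0
chord_gradients = [gradient_of_chord(f, x, 10.0 ** -k) for k in range(1, 7)]

for k, gradient in enumerate(chord_gradients, start=1):
    print(f"delta_x = 1e-{k}: gradient of PQ = {gradient:.6f}")
```

For this curve the chord gradient works out algebraically to 12+6\delta x+(\delta x)^{2}, so the printed values should settle towards 12, the gradient of the tangent at x=2.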

We also write:



Standard results can be proved for different functions.

If \textnormal{f}(x)=x^{n} then

\dfrac{\textnormal{d}y}{\textnormal{d}x}=nx^{n-1}

If \textnormal{f}(x)=\sin x, then we need the small angle approximations (if \delta x radians is very small, then \sin \delta x\approx\delta x and \cos \delta x \approx 1) together with the compound angle formula for \sin(x+\delta x), from which it follows that

\dfrac{\textnormal{d}y}{\textnormal{d}x}=\cos x
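A sketch of the working for \sin x, using only the compound angle formula and the two small angle approximations, might run as follows. It is an informal argument at the level of this discussion, not a rigorous proof.

```latex
\sin(x+\delta x)=\sin x\cos \delta x+\cos x\sin \delta x\approx \sin x+\delta x\cos x

\dfrac{\delta y}{\delta x}=\dfrac{\sin(x+\delta x)-\sin x}{\delta x}\approx\dfrac{\delta x\cos x}{\delta x}=\cos x
```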

The differentiation process described above is linear and extends to more complicated functions. That is to say that if y=a\textnormal{f}(x)+b\textnormal{g}(x), where a,b \in \mathbb{R}, then

\dfrac{\textnormal{d}y}{\textnormal{d}x}=a\textnormal{f}'(x)+b\textnormal{g}'(x)

The Fundamental Theorem of Calculus

Integration is introduced as the reversal of differentiation i.e. in solving a differential equation, \dfrac{\textnormal{d}y}{\textnormal{d}x}=\textnormal{g}(x). The link between integration and area is often passed over and is the subject of the Fundamental Theorem of Calculus. [The following discussion can be adapted for a decreasing function or, piece-wise, a function which successively increases or decreases.]

Consider an area function, A(x), defined as the area under \textnormal{f}(x) between a and a general point, x. A small increment, \delta x, applied to x gives a small element of area, \delta A. Now, for an increasing function,

\textnormal{f}(x)\delta x \leqslant \delta A \leqslant \textnormal{f}(x+\delta x)\delta x

dividing through by \delta x gives,

\textnormal{f}(x) \leqslant \dfrac{\delta A}{\delta x} \leqslant \textnormal{f}(x+\delta x),

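a limit sandwich where, as \delta x \rightarrow 0, \textnormal{f}(x+\delta x) \rightarrow \textnormal{f}(x) and so

\dfrac{\textnormal{d}A}{\textnormal{d}x}=\lim_{\delta x \rightarrow 0}\left(\dfrac{\delta A}{\delta x}\right)=\textnormal{f}(x)

The curve function, \textnormal{f}(x), is the derivative of the area function; hence the area function is an anti-derivative of the curve function and,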

\displaystyle\int \textnormal{f}(x) \textnormal{d}x=A(x).
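The sandwich can be seen at work numerically. In the sketch below, the increasing curve \textnormal{f}(x)=x^{2}, the lower limit a=0, the point x=1.5 and the helper name area are all assumptions made for illustration. The area function is estimated by the midpoint rule, and \delta A/\delta x is compared with the two bounds.

```python
def f(x):
    # Example curve, increasing for x >= 0: f(x) = x^2.
    return x ** 2


def area(x, a=0.0, steps=20_000):
    """A(x): midpoint-rule estimate of the area under f between a and x."""
    width = (x - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))


x = 1.5
results = []
for delta_x in (0.1, 0.01, 0.001):
    delta_A = area(x + delta_x) - area(x)   # small element of area
    ratio = delta_A / delta_x
    results.append((f(x), ratio, f(x + delta_x)))
    print(f"delta_x = {delta_x}: {f(x):.6f} <= {ratio:.6f} <= {f(x + delta_x):.6f}")
```

As \delta x shrinks, the upper bound closes in on the lower bound and the ratio \delta A/\delta x is squeezed towards \textnormal{f}(1.5).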


The Ghosts of Departed Quantities and the Difficulties of Teaching and Learning Calculus

In The Analyst (1734), subtitled "A Discourse Addressed to an Infidel Mathematician", Bishop Berkeley attacks the apparently supernatural reasoning involved in calculus. The infidel was probably Halley (of comet fame) or Newton.

If pupils find the subject difficult to understand at school, and teachers find it difficult to teach, then the reason may be articulated in this book by the great man.


Quotes include:

“Now to conceive a Quantity infinitely small, that is, infinitely less than any sensible or imaginable Quantity, or any the least finite Magnitude, is, I confess, above my Capacity. But to conceive a Part of such infinitely small Quantity, that shall be still infinitely less than it, and consequently though multiply’d infinitely shall never equal the minutest finite Quantity, is, I suspect, an infinite Difficulty to any Man whatsoever”


“They are neither finite Quantities nor Quantities infinitely small, nor yet nothing. May we not call them the Ghosts of departed Quantities?”

Some sympathy for the thesis is gained by Berkeley’s examination of tangent reasoning:

“Therefore the two errors being equal and contrary destroy each other; the first error of defect being corrected by a second error of excess. ……. If you had committed only one error, you would not have come at a true Solution of the Problem. But by virtue of a twofold mistake you arrive, though not at Science, yet at Truth. For Science it cannot be called, when you proceed blindfold, and arrive at the Truth not knowing how or by what means.”

The student of sixth form level mathematics who is eager to see how this is resolved must continue their path to mathematical enlightenment by studying the \epsilon-\delta Analysis of Cauchy, Riemann and Weierstrass.





Euler and the Properties of the Second Derivative

The summer examination season sees pupils searching for maximums and minimums on their test papers.

This long-standing pursuit was initiated by the likes of Newton and Leibniz in their calculus.

It can be all too difficult to think about for some:

“our modern Analysts are not content to consider only the Differences of finite Quantities: they also consider the Differences of those Differences, and the Differences of the Differences of the first Differences. And so on ad infinitum.”

Bishop Berkeley, The Analyst, 1734

The example above is one in which Euler demonstrates the geometrical significance of the first and second derivatives.

Note that the Point of Inflection is where f''(x)=\dfrac{d^{2}y}{dx^{2}}=0 and is a change from upward to downward convex curvature, or vice versa.  There is no need for f'(x)=\dfrac{dy}{dx}=0 too.
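A small check of this last remark is sketched below. The cubic y=x^{3}-3x^{2} is a hypothetical example chosen for illustration and is not Euler's own. Its second derivative is zero at x=1 and changes sign there, yet the first derivative at that point is not zero.

```python
def f(x):
    # Hypothetical example curve: y = x^3 - 3x^2.
    return x ** 3 - 3 * x ** 2


def first_derivative(x):
    # dy/dx = 3x^2 - 6x
    return 3 * x ** 2 - 6 * x


def second_derivative(x):
    # d2y/dx2 = 6x - 6
    return 6 * x - 6


x_inflection = 1.0

print("second derivative at x = 1  :", second_derivative(x_inflection))
print("second derivative either side:", second_derivative(0.9), second_derivative(1.1))
print("first derivative at x = 1   :", first_derivative(x_inflection))
```

The sign change in the second derivative either side of x=1 marks the change in curvature, while the non-zero first derivative shows that the curve is not stationary there.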





Edexcel Formula Sheet Lays Pooh Traps for Further Mathematics Pupils

FP3 integration standard forms could lead unwary pupils and teachers into a Pooh Trap. The following issue arises each year as the finer points of this course get sharpened.

When consulting the Edexcel formula sheet to find the integral,

\displaystyle \int \dfrac{1}{\sqrt{a^{2}+x^{2}}} \textnormal{d}x

two anti-derivatives are given,

\textnormal{arsinh}\left(\dfrac{x}{a}\right)\ \ \ \textnormal{and}\ \ \ \ln\{x+\sqrt{x^{2}+a^{2}}\}

Some might be tempted to infer that these functions are equal. This is the Pooh Trap: whilst they both differentiate to \dfrac{1}{\sqrt{a^{2}+x^{2}}}, they are not equal.

Play with exponential functions and quadratic equations gives us the logarithmic form of \textnormal{arsinh}\ x, which is also represented in formula books.

\textnormal{arsinh}\ x=\ln\{x+\sqrt{x^{2}+1}\}

\textnormal{arsinh} \left(\dfrac{x}{a}\right)=\ln\left\{\dfrac{x}{a}+\sqrt{\left(\dfrac{x}{a}\right)^{2}+1}\right\}=\ln\left\{\dfrac{x+\sqrt{x^{2}+a^{2}}}{a}\right\}=\ln\left\{x+\sqrt{x^{2}+a^{2}}\right\}-\ln a

The two anti-derivatives differ by the constant \ln a.

In a definite integration, this constant is added and then taken away, so it makes no difference which anti-derivative the student uses. In a particular solution to a differential equation, evaluating the constant of integration from a boundary condition would give a different constant value for each anti-derivative.

The worst case is that a student takes the integration standard form and reads it as the logarithmic form of the inverse of \sinh x. I have seen this happen.
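Both points can be checked with a few lines of code. The sketch below assumes a=3 and a handful of sample values of x, chosen only for illustration, and the function names are hypothetical. It compares the two anti-derivatives from the formula sheet and then uses each to evaluate the same definite integral from 0 to 4.

```python
import math

a = 3.0  # assumed positive constant, chosen for illustration


def antiderivative_arsinh(x):
    # First form on the formula sheet: arsinh(x / a).
    return math.asinh(x / a)


def antiderivative_log(x):
    # Second form on the formula sheet: ln(x + sqrt(x^2 + a^2)).
    return math.log(x + math.sqrt(x * x + a * a))


# The difference between the two forms should be ln(a) for every x.
differences = [antiderivative_log(x) - antiderivative_arsinh(x)
               for x in (-2.0, 0.0, 1.0, 5.0)]
print("differences:", differences)
print("ln(a)      :", math.log(a))

# In a definite integral from 0 to 4 the constant cancels.
definite_arsinh = antiderivative_arsinh(4.0) - antiderivative_arsinh(0.0)
definite_log = antiderivative_log(4.0) - antiderivative_log(0.0)
print("definite integral via arsinh:", definite_arsinh)
print("definite integral via ln    :", definite_log)
```

The differences should all match \ln a, and the two definite integrals should agree with each other.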