**Definition of a quantitle**:

Let the cdf of \(y\) be \(F_Y=P(Y\leq y)\). The \(q\)th quantile of Y is then \(Q_Y(q)=F_Y^{-1}(q)=inf\{y:F_Y(y)\leq q\}\). I.e. it is the smallest value in the distribution such that the cdf is greater than the quantile. E.g. the 90% quantile is the first value of the distribution where the cdf exceeds 0.9.

In frequentist statistics, the parameters are fixed. The following is taken from the `bayesQR`

paper, https://www.jstatsoft.org/article/view/v076i07.

Consider the standard linear model, \[y_i=x_i^T\beta+\epsilon_i\] The conditional mean model is obtained by assuming that \(E(\epsilon|x)=0\). The likelihood (how likely parameter values are given the data) is then, \[l(x_i,y_i,\beta, \epsilon, \sigma)=\Pi_{i=1}^{n} \dfrac{1}{2\pi \sigma^2}exp\left(-\dfrac{(y_i-x_i^T\beta)^2}{\sigma^2}\right)\] and the log likelihood is, \[L(x_i,y_i,\beta, \epsilon, \sigma)\propto -\sum_{i=1}^n(y_i-x_i^T\beta)^2,\]

So we see that to maximise the log likelihood (a negative number), we can convert it to a minimisation problem - in which case the regression coefficients can be obtained by solving \[\hat\beta=argmin_{\beta}\sum_{i=1}^n (y_i-x_i^T\beta)^2.\]

If we consider QR, then the optimisation problem needs to depend on \(q\). To do this, we introduce the check function, \(p_q(.)\),

\[ p_q = \begin{cases} q &\quad\text{if residuals}\geq 0 \implies y_i\geq x_i^T\beta\\ 1-q &\quad\text{if residuals}< 0 \implies y_i<x_i^T \beta \\ \end{cases} \]

So that now the optimisation problem is, \[\hat\beta=argmin_{\beta}\sum_{i=1}^n p_q(y_i-x_i^T\beta)=argmin_{\beta}\left(\sum_{i: y_i \geq x_i^T\beta}q|y_i-x_i^T\beta|+\sum_{i: y_i < x_i^T\beta}(1-q)|y_i-x_i^T\beta| \right)\] Note that if \(q=0.5\) (median regression) then this reduces to \(\hat\beta=argmin_{\beta}\sum|y_i-x_i^T\beta|\).

The `quantreg`

R package uses frequentist methodology to estimate quantile regression coefficients. It offers 3 methods to solve the above minimisation problem:

Simplex method (default) using

`method=br`

: Uses the fact that solutions are focused on the vertices of the constraint set.Frisch-Newton interior point method using

`method="fn"`

or`method="pfn"`

for very large problems: Instead of focusing on the vertices, it traverses the interior of the feasible region.Sparse regression quantile fitting using

`method="sfn"`

or`rq.fit.sfn`

: Sparse version of the above, it is efficient when the design matrix has many zeros (e.g. if the predictors contain several factors).

See Yu et al. and APTS notes that explain Bayesian quantile regression whereby a likelihood function that is based on the asymmetric Laplace distribution (ALD) is used.

The Laplace distribution with density \(f(z)=\frac{1}{2\sigma} exp\left(-\frac{|z-\mu|}{\sigma}\right)\) has the nice property that the MLE of \(\mu\) is the sample median. The asymmetric Laplace distribution is a generalisation of the Laplace distribution whereby the MLE of \(\mu\) is now the sampled quantile of \(z\).

The AL distribution is similar to the normal distribution but is “spikier” and is allowed to be asymmetric. The pdf for the ALD is, \[f(y | x_i^T\beta, \sigma, q)=\frac{q(1-q)}{\sigma} \exp \left(-p_q\left(\frac{y- x_i^T\beta}{\sigma}\right)\right).\] Notice that this contains the check function \(p_q(.)\).

If we assume place an ALD prior on the residuals with \(\sigma=1\) then the likelihood of our model becomes \[L(y|\beta)=q^n(1-q)^n exp\left(-\sum_{i=1}^n p_q(y_i-x_i^T\beta) \right).\]

We can then combine this with a prior to get our posterior, and use the standard methods to approximate this.

\[\psi(\beta,\sigma|y,x,q)\propto \pi(\beta,\sigma)\Pi_{i=1}^n \text{ALD}(y_i | x_i^T\beta, \sigma,q).\]

Bayesian quantile regression can be performed in R using `bayesQR`

, which places weak priors on \(\beta\) and \(\sigma\) and using a Gibbs sampler to estimate the model parameters. Users need to specify the number of MCMC draws, and there is an option to use adaptive lasso variable selection.

Coefficient estimation: Marginal change in \(q\)th quantile due to marginal change in \(x\).