## Intro to Over

On previous pages we looked at the utilisation of a single aggregation function in a `SELECT`

statement.
Suppose we’d like to wrap an aggregation function around another aggregation function like so: `SELECT SUM(COUNT(*))`

.
In SQL, this is unfortunately **forbidden**.

We *could* instead rely on subqueries (CTEs in particular) to get around this limitation.

However, there are some instances where we need not result to such verbosity.
We now introduce the `OVER()`

clause, which makes such expressions possible in a single statement.

## Questions

First, let’s assume we need to answer the following questions:

- Where do our users come from?
- How do we express the aggregation as a percentage?

## Returning the Count

### SQL

First, let’s write a query that returns the actual figures.

```
SELECT
Country
, COUNT(*) AS Cnt
FROM music.Users
GROUP BY
Country
ORDER BY
COUNT(*) DESC;
```

### Output

## Computing the Total

To compute the percentage, we first need to get the total number of users in `music.Users`

.
We can augment that above query such that the total is the sum of every value in `Cnt`

.
That is: `SUM(COUNT(*))`

.

If we include this expression in our query, we will get an error:

`Cannot perform an aggregate function on an expression containing an aggregate or a subquery.`

That is, we cannot use an aggregate function on another aggregate function. Why? It kind of boils down to a matter of sequencing.

To rephrase the problem, we want to sum all the values of `Cnt`

in the output above.
That is, we need to invoke `SUM()`

after `COUNT(*)`

has given us its result set.
Therefore, we need to explicitly tell SQL to use `SUM()`

*only* once the above result set has been computed.

We can do this with `OVER()`

.
This allows us to define the intial result set from the first aggregation as a *window*.
So instead of aggregating on an aggregate, we aggregate on the *result set* via a window which interposes itself between the two aggregation steps.

Now we write: `SUM(COUNT(*)) OVER()`

In this case the window is the entire result set returned by the above query.
This is because have not specified any *partitioning*.
We are summming all values of `Cnt`

.

### SQL

```
SELECT
Country
, COUNT(*) AS Cnt
, SUM(COUNT(*)) OVER() AS Total
FROM music.Users
GROUP BY
Country
ORDER BY
COUNT(*) DESC;
```

### Output

Note: `Cnt`

in the output table does not sum to 100 as I’m not displaying every row in the result set.

## Expressing the Percentage

Now we can simply write an expression that divides `Cnt`

by `Total`

and multiplies by 100.0 to get the percentage.

### SQL

```
SELECT
Country
, COUNT(*) AS Cnt
, SUM(COUNT(*)) OVER() AS Total
, COUNT(*)*100.0/SUM(COUNT(*)) OVER() AS Pcnt
FROM music.Users;
GROUP BY
Country
ORDER BY
COUNT(*) DESC;
```

Note: when multiplying or dividing an integer by another integer, our result will also be an integer.
If we expect a non-integer result, we can cast an `int`

to `float`

by rewriting the integer as a decimal.
This is known as **implicit conversion**.
To illustrate this, compare these two expressions: `SELECT 1/2`

and `SELECT 1/2.0`

.

### Output

Voila! We have computed the percentage distribution of an aggregation in SQL using a single query. We could have used a subquery to return the total, but that would have been verbose.

### Verbose Example

```
WITH cteTotal AS (
SELECT
COUNT(*) AS Total
FROM music.Users;
)
SELECT
Country
, COUNT(*)*100.0/(SELECT Total FROM cteTotal)
FROM music.Users
GROUP BY
Country
ORDER BY COUNT(*) DESC;
```