-
Notifications
You must be signed in to change notification settings - Fork 3.3k
Description
With #22957 done, we can now add translations for additional aggregate operators.
| Name | SQL Server | PostgreSQL | MySQL | Oracle |
|---|---|---|---|---|
| Standard deviation (sample) | STDEV | stddev/stddev_samp | STDDEV_SAMP | STDDEV/STDDEV_SAMP1 |
| Standard deviation (population) | STDEVP | stdddev_pop | STDDEV/STDDEV_POP | STDDEV_POP |
| Variance (sample) | VAR | variance/var_samp | VAR_SAMP | VARIANCE/VAR_SAMP |
| Variance (population) | VARP | var_pop | VARIANCE/VAR_POP | VAR_POP |
- SQLite is the only database that doesn't have built-in standard deviation and variance.
- Unfortunately, these functions return different types based on their argument type across databases:
- SQL Server always returns float
- PG returns double precision for real or double precision, otherwise numeric.
- MariaDB always returns double, with different precision for float/double than for int/decimal (though it doesn't seem to matter)
- As a result, we probably shouldn't have .NET functions in relational, but add them to each provider separately (signatures are different).
- When there are no input values, these functions return NULL on all databases.
- Standard deviation was already discussed in #11850, but that issue now tracks aggregate UDFs.
- Old EF6 supported translating to the SQL Server aggregate statistical functions (docs), so this is ef6-parity.
Detailed investigation across databases
SQL Server
DROP TABLE IF EXISTS data;
CREATE TABLE data
(
int INT,
real REAL,
float FLOAT,
decimal DECIMAL
);
INSERT INTO data (int, real, float, decimal) VALUES (1, 1, 1, 1), (10, 10, 10, 10), (NULL, NULL, NULL, NULL);
-- Output types:
SELECT SQL_VARIANT_PROPERTY(STDEV(int), 'BaseType') FROM data; -- float
SELECT SQL_VARIANT_PROPERTY(STDEV(real), 'BaseType') FROM data; -- float
SELECT SQL_VARIANT_PROPERTY(STDEV(float), 'BaseType') FROM data; -- float
SELECT SQL_VARIANT_PROPERTY(STDEV(decimal), 'BaseType') FROM data; -- float
-- No input values:
SELECT STDEV(decimal) FROM data WHERE decimal < 0; -- NULLPostgreSQL
DROP TABLE IF EXISTS data;
CREATE TABLE data
(
int INT,
real REAL,
double DOUBLE PRECISION,
decimal DECIMAL
);
INSERT INTO data (int, real, double, decimal) VALUES (1, 1, 1, 1), (10, 10, 10, 10), (NULL, NULL, NULL, NULL);
-- Output types:
SELECT pg_typeof(stddev(int)) FROM data; -- numeric
SELECT pg_typeof(stddev(real)) FROM data; -- double
SELECT pg_typeof(stddev(double)) FROM data; -- double
SELECT pg_typeof(stddev(decimal)) FROM data; -- numeric
-- No input values:
SELECT stddev(decimal) FROM data WHERE decimal < 0; -- NULLMariaDB
DROP TABLE IF EXISTS data;
CREATE TABLE data
(
intc INT,
floatc FLOAT,
doublec DOUBLE,
decimalc DECIMAL
);
INSERT INTO data (intc, floatc, doublec, decimalc) VALUES (1, 1, 1, 1), (10, 10, 10, 10), (NULL, NULL, NULL, NULL);
-- Output types:
DROP TEMPORARY TABLE IF EXISTS foo;
CREATE TEMPORARY TABLE foo SELECT STDDEV_SAMP(intc) FROM data;
DESC foo; -- double(26,4)
DROP TEMPORARY TABLE IF EXISTS foo;
CREATE TEMPORARY TABLE foo SELECT STDDEV_SAMP(floatc) FROM data;
DESC foo; -- double
DROP TEMPORARY TABLE IF EXISTS foo;
CREATE TEMPORARY TABLE foo SELECT STDDEV_SAMP(doublec) FROM data;
DESC foo; -- double
DROP TEMPORARY TABLE IF EXISTS foo;
CREATE TEMPORARY TABLE foo SELECT STDDEV_SAMP(decimalc) FROM data;
DESC foo; -- double(26,4)
-- No input values:
SELECT STDDEV_SAMP(decimalc) FROM data WHERE decimalc < 0; -- NULL