Skip to content

SQL Server: Translate statistics aggregate functions #28104

@roji

Description

@roji

With #22957 done, we can now add translations for additional aggregate operators.

Name SQL Server PostgreSQL MySQL Oracle
Standard deviation (sample) STDEV stddev/stddev_samp STDDEV_SAMP STDDEV/STDDEV_SAMP1
Standard deviation (population) STDEVP stdddev_pop STDDEV/STDDEV_POP STDDEV_POP
Variance (sample) VAR variance/var_samp VAR_SAMP VARIANCE/VAR_SAMP
Variance (population) VARP var_pop VARIANCE/VAR_POP VAR_POP
  • SQLite is the only database that doesn't have built-in standard deviation and variance.
  • Unfortunately, these functions return different types based on their argument type across databases:
    • SQL Server always returns float
    • PG returns double precision for real or double precision, otherwise numeric.
    • MariaDB always returns double, with different precision for float/double than for int/decimal (though it doesn't seem to matter)
  • As a result, we probably shouldn't have .NET functions in relational, but add them to each provider separately (signatures are different).
  • When there are no input values, these functions return NULL on all databases.
  • Standard deviation was already discussed in #11850, but that issue now tracks aggregate UDFs.
  • Old EF6 supported translating to the SQL Server aggregate statistical functions (docs), so this is ef6-parity.
Detailed investigation across databases

SQL Server

DROP TABLE IF EXISTS data;
CREATE TABLE data
(
    int INT,
    real REAL,
    float FLOAT,
    decimal DECIMAL
);
INSERT INTO data (int, real, float, decimal) VALUES (1, 1, 1, 1), (10, 10, 10, 10), (NULL, NULL, NULL, NULL);

-- Output types:
SELECT SQL_VARIANT_PROPERTY(STDEV(int), 'BaseType') FROM data; -- float
SELECT SQL_VARIANT_PROPERTY(STDEV(real), 'BaseType') FROM data; -- float
SELECT SQL_VARIANT_PROPERTY(STDEV(float), 'BaseType') FROM data; -- float
SELECT SQL_VARIANT_PROPERTY(STDEV(decimal), 'BaseType') FROM data; -- float

-- No input values:
SELECT STDEV(decimal) FROM data WHERE decimal < 0; -- NULL

PostgreSQL

DROP TABLE IF EXISTS data;
CREATE TABLE data
(
    int INT,
    real REAL,
    double DOUBLE PRECISION,
    decimal DECIMAL
);
INSERT INTO data (int, real, double, decimal) VALUES (1, 1, 1, 1), (10, 10, 10, 10), (NULL, NULL, NULL, NULL);

-- Output types:
SELECT pg_typeof(stddev(int)) FROM data; -- numeric
SELECT pg_typeof(stddev(real)) FROM data; -- double
SELECT pg_typeof(stddev(double)) FROM data; -- double
SELECT pg_typeof(stddev(decimal)) FROM data; -- numeric

-- No input values:
SELECT stddev(decimal) FROM data WHERE decimal < 0; -- NULL

MariaDB

DROP TABLE IF EXISTS data;
CREATE TABLE data
(
    intc INT,
    floatc FLOAT,
    doublec DOUBLE,
    decimalc DECIMAL
);
INSERT INTO data (intc, floatc, doublec, decimalc) VALUES (1, 1, 1, 1), (10, 10, 10, 10), (NULL, NULL, NULL, NULL);

-- Output types:
DROP TEMPORARY TABLE IF EXISTS foo;
CREATE TEMPORARY TABLE foo SELECT STDDEV_SAMP(intc) FROM data;
DESC foo; -- double(26,4)

DROP TEMPORARY TABLE IF EXISTS foo;
CREATE TEMPORARY TABLE foo SELECT STDDEV_SAMP(floatc) FROM data;
DESC foo; -- double

DROP TEMPORARY TABLE IF EXISTS foo;
CREATE TEMPORARY TABLE foo SELECT STDDEV_SAMP(doublec) FROM data;
DESC foo; -- double

DROP TEMPORARY TABLE IF EXISTS foo;
CREATE TEMPORARY TABLE foo SELECT STDDEV_SAMP(decimalc) FROM data;
DESC foo; -- double(26,4)

-- No input values:
SELECT STDDEV_SAMP(decimalc) FROM data WHERE decimalc < 0; -- NULL

Metadata

Metadata

Assignees

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions