Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 6 of 6 total
Thread GROUP BY / subselect / WHERE problem
Mon, Oct 28 2019 5:55 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

I'm just working out a bit of SQL to summarise income by source of funds for a project  . It works well until I comment out the line indicated. With that left in I get two rows with the right values, takle it out and I get one row with the wrong information.

SELECT
(SELECT _fkNames FROM Transactions WHERE Transactions._ID = Analysis._fkTransaction),
(SELECT _Name FROM Names WHERE Names._ID = (SELECT _fkNames FROM Transactions WHERE Transactions._ID = Analysis._fkTransaction)),
SUM(_Income)
,* <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
FROM Analysis
WHERE
_fkProjects = 'One'
AND
_Income IS NOT NULL
GROUP BY
(SELECT _fkNames FROM Transactions WHERE Transactions._ID = Analysis._fkTransaction)

If I add _fkProjects to the SELECT clause again I get the right information.

So I have a workround but I don't understand what's going on. Any theories?

Roy Lambert
Tue, Oct 29 2019 10:40 PMPermanent Link

Richard Harding

Wise Nutrition Coaching

Roy,

How about posting the CREATE SQL for the tables, showing only the relevant columns?

Richard
Wed, Oct 30 2019 4:03 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Richard



CREATE TABLE "Analysis"
(
"_ID" INTEGER GENERATED ALWAYS AS IDENTITY (START WITH 0, INCREMENT BY 1) NOT NULL,
"_fkTransaction" INTEGER NOT NULL,
"_fkCoding" VARCHAR(10) COLLATE "ANSI_CI",
"_fkProjects" VARCHAR(10) COLLATE "ANSI_CI",
"_InOut" VARCHAR(1) COLLATE "ANSI_CI" DEFAULT '?' NOT NULL,
"_Income" DECIMAL(19,2),
"_Expenditure" DECIMAL(19,2),
"_Posted" TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
CONSTRAINT "PK" PRIMARY KEY ("_ID")
)

CREATE TABLE "Names"
(
"_ID" INTEGER GENERATED ALWAYS AS IDENTITY (START WITH 0, INCREMENT BY 1) NOT NULL,
"_Name" VARCHAR(50) COLLATE "ANSI_CI" NOT NULL,
"_Type" VARCHAR(10) COLLATE "ANSI_CI" NOT NULL,
"_Phone" VARCHAR(15) COLLATE "ANSI_CI",
"_EMail" VARCHAR(60) COLLATE "ANSI_CI",
"_Active" BOOLEAN DEFAULT TRUE NOT NULL,
"_fkCodingI" VARCHAR(10) COLLATE "ANSI_CI" DESCRIPTION 'Income default analysis code',
"_fkCodingE" VARCHAR(10) COLLATE "ANSI_CI" DESCRIPTION 'Expense default analysis code',
"_fkProjects" VARCHAR(10) COLLATE "ANSI_CI" DESCRIPTION 'Expense default project',
"_Address" CLOB COLLATE "ANSI_CI",
"_Notes" CLOB COLLATE "ANSI_CI",
CONSTRAINT "PK" PRIMARY KEY ("_ID")
)


CREATE TABLE "Transactions"
(
"_ID" INTEGER GENERATED ALWAYS AS IDENTITY (START WITH 0, INCREMENT BY 1) NOT NULL,
"_InOut" VARCHAR(1) COLLATE "ANSI" NOT NULL,
"_AnalysisLevel" INTEGER DEFAULT 3 NOT NULL,
"_SplitLink" INTEGER,
"_DatePosted" DATE DEFAULT CURRENT_DATE NOT NULL,
"_Date" DATE DEFAULT CURRENT_DATE NOT NULL,
"_fkNames" INTEGER NOT NULL,
"_fkBanks" INTEGER DEFAULT -1 NOT NULL,
"_fkTransactions" INTEGER,
"_Method" VARCHAR(10) COLLATE "ANSI_CI" DEFAULT 'Cheque' NOT NULL,
"_Document" VARCHAR(15) COLLATE "ANSI_CI" DEFAULT 'Invoice',
"_Reference" VARCHAR(20) COLLATE "ANSI_CI",
"_fkCheques" INTEGER,
"_fkSlips" INTEGER,
"_Income" DECIMAL(19,2),
"_Expenditure" DECIMAL(19,2),
"_DatePaid" DATE,
"_DateReconciled" DATE,
"_xReconciled" BOOLEAN COMPUTED ALWAYS AS _DateReconciled IS NOT NULL,
"_xPaidIn" BOOLEAN COMPUTED ALWAYS AS (_fkSlips IS NOT NULL) OR (_Document = 'Statement'),
"_xPaid" BOOLEAN COMPUTED ALWAYS AS _DatePaid IS NOT NULL,
"_xSplit" BOOLEAN COMPUTED ALWAYS AS _SplitLink IS NOT NULL,
"_Notes" CLOB COLLATE "ANSI_CI",
CONSTRAINT "PK" PRIMARY KEY ("_ID"),
CONSTRAINT "MoneyEntered" CHECK ((_Income IS NOT NULL) OR (_Expenditure IS NOT NULL))

)Roy Lambert
Wed, Oct 30 2019 8:03 PMPermanent Link

Richard Harding

Wise Nutrition Coaching

Roy

Valid SQL requires that the rows of a grouped table must only contain:
   * Grouping columns
   * Aggregate columns relating to the group
   * a function or constant
   * Expression consisting of the above items

So - both versions of the SQL are invalid. The rows of the grouped table must relate only to the GROUP.

The GROUP BY clause will partition the original table into groups that have the same value in the <grouping column list>. NULLs are in their own group which may be placed at the beginning or end of the group table.

The new grouped table is produced with each group being reduced to a single row.

To test what is happening, create a view consisting of a small subset of the Analysis table without the GROUP BY clause and then check that the group table is really a partition of the original table.

The SQL that is currently working could easily fail with a different set of data.

Does that make sense?

Also, the relationship between the tables Analysis and Names ON _fkProjects does not seem to be taken care of.


SELECT Analysis._fkProjects, Names._Name, Sum(_Income)
 FROM
   Analysis
     INNER JOIN Names ON Analysis._fkProjects = Names._fkProjects
     INNER JOIN Transactions ON Analysis._fkTransaction = Transactions._ID
 GROUP BY Analysis._fkProjects, Names._Name

"_fkProjects","_Name","SUM of _Income"
"One","Ophelia",45
"One","Portia",45
"Two","Ophelia",21
"Two","Portia",21
"Two","Puck",21

What do you think?

Richard
Thu, Oct 31 2019 5:43 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Richard

>Valid SQL requires that the rows of a grouped table must only contain:


I'm not sure that your definition of valid sql is 100% correct since you can have other items in there with it being non-deterministic which value you get

>To test what is happening, create a view consisting of a small subset of the Analysis table without the GROUP BY clause and then check that the group table is really a partition of the original table.

That is how I know one result is correct and the other isn't Smiley

I had thought that I can accomplish what I want with a join rather than a sub-select and that's one of the things on my list for today - may need LEFT OUTER rather than INNER though because of nulls

Roy Lambert
Thu, Oct 31 2019 3:45 PMPermanent Link

Richard Harding

Wise Nutrition Coaching

Roy

<<I'm not sure that your definition of valid sql is 100% correct since you can have other items in there with it being non-deterministic which value you get>>

They are not my rules for valid SQL. The wise people who gave us SQL devised these rules to ensure that GROUP BY clause would give consistent and valid results. Break them and it is highly likely that it will end in tears.

Richard
Image