Best approach for "walking" a 1:m relationship?

Bobby_Hall · November 23, 2010, 4:57am

Hi, folks. Suppose I need to create invoices for a set of customers, each invoice consisting of a block of customer information followed by a variable number of detail lines. Standard stuff, but I’m used to reading records instead of executing queries, so I’m a little confused here.

Do I use one query to gather the desired set of customers, and then loop through the set of customers, issuing a query for each customer’s detail lines? Or do I use one query that gathers the desired set of customers AND their detail lines all at once?

In the first case, I see that many more queries are involved, while in the second case, the customer information is needlessly repeated in each row. However, the amount of data involved is not large enough to make a performance difference either way, so I’m asking more if there’s a standard way.

system · November 23, 2010, 5:07am

normally, you could get all the info (invoices etc etc) for a customer(s) using a single query.

the info would normally be stored in multiple tables and you join the tables using common columns to get the required output from each table.

we would really need to see the table structure of your tables in order to provide you more detail unless you want someone to post a generic plain vanilla basic example.

oddz · November 23, 2010, 5:49am

A single query to gather all the necessary data is preferred over multiple queries when not utilizing some type of ORM automation, caching layer. When using an ORM, with cache support it would probably be more efficient, in the long run to issue single queries for each separate entity. To gain the advantage of an ORM, it is many times necessary to break up what can be done in a single queries into multiple requests. However, that doesn’t seem to be the case here, so stick to a single query.

Bobby_Hall · November 23, 2010, 9:08pm

Thanks Kalon and oddz, a single query it is.

And yes, the data is already stored in two tables. Briefly,

Table: Customers
RecID int
Name string
Address string
City string
yada-yada

Table: Detail
CustID int
RecID int
Date date
ProductID int
Qty int
Price
yada-yada

I just wasn’t clear on the preferred approach to get the data for reporting purposes. Although using a single query gives you a table with lots of redundant data, I guess that’s better than hammering the server with queries.

r937 · November 23, 2010, 9:26pm

here’s an alternate starategy – Minimize Bandwith in One-to-Many Joins

you might not think this technique would apply in this instance since the columns between the customers and the order details are going to be quite disiimilar… still, there’s no problem doing it, really, if you can live with multiple NULL columns in the two different row types from the UNION

system · November 23, 2010, 9:28pm

there doesn’t have to be redundant data returned by the query.

you can get just the columns you want for a particular customer

example

 
select Name, Address, CustID, Date, ProductID
from Customers c, Detail d
where c.RecID = d.RecID
and Name = 'fred flinstone';

r937 · November 23, 2010, 9:33pm

two things

first, in your example, the customer data ~is~ repeated on each row, once per order detail

second, post #1 clearly says “a set of customers”

please try to keep up

system · November 23, 2010, 9:43pm

ooops

multi-tasking has never been my strong point

working on too many things atm.

Mrs K. has the :whip: going atm

Bobby_Hall · November 24, 2010, 8:41pm

r937 – Thanks for the UNION explanation. I may have to use that since I’m not getting a single query to work.

The sticking point are the aggregate functions. The balance for each invoice is not stored but rather calculated from the detail records when needed. For a single customer, the SUM() function works just fine, but if the query includes the invoice detail of several customers (as it would during a batch print of invoices), the SUM() function gives me the sum of all transactions. If I add a GROUP BY clause, the SUM() then correctly gives me the balance for each customer, but it also returns only a one row summary per customer.

Suppose…


Table: Transactions
CustomerID: str
TransID: int
TransDesc: str
TransAmt: int

Data:
[B]Cust T Desc  Amt[/B]
SMIT 1 Shoes 100
SMIT 2 Purse  75
FORD 3 Shoes 125
SMIT 4 Pants  50
FORD 5 Dress  60
JONE 6 Scarf  10

I want to issue a query that returns this:
[B]Cust T Desc  Amt [I]Bal[/I][/B]
FORD 3 Shoes 125 185
FORD 5 Dress  60 185
JONE 6 Scarf  10  10
SMIT 1 Shoes 100 225
SMIT 2 Purse  75 225
SMIT 4 Pants  50 225

But
SELECT CustomerID, TransID, TransDesc, TransAmt, SUM(TransAmt)
FROM Transactions
GROUP BY CustomerID
returns just one row per CustomerID, like this.

[B]Cust T Desc  Amt [I]Bal[/I][/B]
FORD 5 Dress  60 185
JONE 6 Scarf  10  10
SMIT 4 Pants  50 225

So I’m hacking away. Meanwhile, any clues are very appreciated!

r937 · November 24, 2010, 8:56pm

are we still talking about the same problem as post #1?

SELECT t.CustomerID
     , t.TransID
     , t.TransDesc
     , t.TransAmt
     , s.Balance
  FROM Transactions AS t
INNER
  JOIN ( SELECT CustomerID
              , SUM(TransAmt) AS Balance
           FROM Transactions
         GROUP
             BY CustomerID ) AS s
    ON s.CustomerID = t.CustomerID

Bobby_Hall · November 27, 2010, 1:53am

Ha, you make it look so easy!

I always knew that SQL wasn’t a trivial thing to learn, but I didn’t realize I’d hit the brick wall this soon. Rather than imposing on you to retype your book, I guess I’ll just buy it.

I was away the last couple of days for Thanksgiving holiday in the US. Thanks for your patient answers. I’m grateful for them as well.