From time to time I’m asked to aggregate strings from multiple records into a single column using SQL. Here’s an example, showing a comma-separated list of ordered employee names per department based on the famous EMP and DEPT tables.
DEPTNO | DNAME | ENAME_LIST |
---|---|---|
10 | ACCOUNTING | CLARK, KING, MILLER |
20 | RESEARCH | ADAMS, FORD, JONES, SCOTT, SMITH |
30 | SALES | ALLEN, BLAKE, JAMES, MARTIN, TURNER, WARD |
40 | OPERATIONS |
Oracle introduced the aggregate function LISTAGG for that purpose in 11.2. If you may use LISTAGG go for it, but if you have to work with an older version of the Oracle Database Server you might be interested in some other options which I discuss per Oracle Database version.
I cover just some options in this post, if you are interested in more then please visit Tim Hall’s String Aggregation Techniques on oracle-base.com.
Oracle7
More than twenty years ago PL/SQL was introduced as part of the Oracle Database Server version 7.0 allowing to write functions to be used in SQL statements. Back then something like the following was necessary to build comma-separated values:
CREATE OR REPLACE FUNCTION deptno_to_ename_list(in_deptno IN VARCHAR2)
RETURN VARCHAR2 IS
CURSOR l_cur IS
SELECT ename
FROM emp
WHERE deptno = in_deptno
ORDER BY ename;
l_ret VARCHAR2(2000);
BEGIN
FOR l_rec IN l_cur
LOOP
IF l_cur%ROWCOUNT > 1 THEN
l_ret := l_ret || ', ';
END IF;
l_ret := l_ret || l_rec.ename;
END LOOP;
RETURN l_ret;
END;
/
SELECT deptno, dname, deptno_to_ename_list(deptno) AS ename_list
FROM dept
ORDER BY deptno;
Oracle8
Version 8.0 came with the object option allowing to solve the problem in more generic ways.
CREATE OR REPLACE TYPE string_tabtype IS TABLE OF VARCHAR2(2000);
CREATE OR REPLACE FUNCTION collection_to_comma_list(
in_strings IN string_tabtype
) RETURN VARCHAR2 IS
l_ret VARCHAR2(2000);
BEGIN
IF in_strings.COUNT > 0 THEN
FOR i IN 1 .. in_strings.COUNT
LOOP
IF i > 1 THEN
l_ret := l_ret || ', ';
END IF;
l_ret := l_ret || in_strings(i);
END LOOP;
END IF;
RETURN l_ret;
END;
/
SELECT d.deptno,
d.dname,
collection_to_comma_list(
CAST(
MULTISET(
SELECT ename
FROM emp e
WHERE e.deptno = d.deptno
ORDER BY ename
) AS string_tabtype
)
) AS ename_list
FROM dept d
ORDER BY d.deptno;
Another option was to use a REF CURSOR instead of a collection type. The PL/SQL part was executable in Oracle7 too, but the CURSOR expression was not available back then. BTW: SYS_REFCURSOR was introduced in 9.0, so this specific PL/SQL type is really necessary with version 8.0.
CREATE OR REPLACE PACKAGE mytypes_pkg IS
TYPE refcursor_type IS REF CURSOR;
END mytypes_pkg;
/
CREATE OR REPLACE FUNCTION cursor_to_comma_list(
in_refcursor IN mytypes_pkg.refcursor_type
) RETURN VARCHAR2 IS
l_string VARCHAR2(2000);
l_ret VARCHAR2(2000);
BEGIN
LOOP
FETCH in_refcursor INTO l_string;
EXIT WHEN in_refcursor%NOTFOUND;
IF in_refcursor%ROWCOUNT > 1 THEN
l_ret := l_ret || ', ';
END IF;
l_ret := l_ret || l_string;
END LOOP;
CLOSE in_refcursor;
RETURN l_ret;
END;
/
SELECT d.deptno,
d.dname,
cursor_to_comma_list(
CURSOR(
SELECT ename
FROM emp e
WHERE e.deptno = d.deptno
ORDER BY ename
)
) AS ename_list
FROM dept d
ORDER BY d.deptno;
Oracle9i Release 1
Version 9.0 came with basic XML support which allowed to aggregate strings without the need for a helper function.
SELECT d.deptno,
d.dname,
RTRIM(
SYS_XMLAGG(
SYS_XMLGEN(
e.ename||', '
)
).EXTRACT(
'/ROWSET/ROW/text()'
).getStringVal(),
', '
) AS ename_list
FROM dept d
LEFT JOIN emp e ON e.deptno = d.deptno
GROUP BY d.deptno, d.dname
ORDER BY d.deptno, d.dname;
In this solution, the sort order of the aggregated strings is not definable. Version 9i introduced also user-defined aggregate functions. To use them you need to implement the ODCIAggregate interface which allows you to sort the result.
CREATE OR REPLACE TYPE string_tabtype AS TABLE OF VARCHAR2(2000);
CREATE OR REPLACE TYPE mylistagg_type AS OBJECT (
strings string_tabtype,
STATIC FUNCTION ODCIAggregateInitialize(
sctx IN OUT mylistagg_type
) RETURN NUMBER,
MEMBER FUNCTION ODCIAggregateIterate(
SELF IN OUT mylistagg_type,
value IN VARCHAR2
) RETURN NUMBER,
MEMBER FUNCTION ODCIAggregateTerminate(
SELF IN mylistagg_type,
returnValue OUT VARCHAR2,
flags IN NUMBER
) RETURN NUMBER,
MEMBER FUNCTION ODCIAggregateMerge(
SELF IN OUT mylistagg_type,
ctx2 IN mylistagg_type)
RETURN NUMBER
);
CREATE OR REPLACE TYPE BODY mylistagg_type IS
STATIC FUNCTION ODCIAggregateInitialize(
sctx IN OUT mylistagg_type
) RETURN NUMBER IS
BEGIN
sctx := mylistagg_type(string_tabtype());
RETURN ODCIConst.Success;
END;
MEMBER FUNCTION ODCIAggregateIterate(
SELF IN OUT mylistagg_type,
value IN VARCHAR2
) RETURN NUMBER IS
BEGIN
SELF.strings.EXTEND;
SELF.strings(SELF.strings.COUNT) := VALUE;
RETURN ODCIConst.Success;
END ODCIAggregateIterate;
MEMBER FUNCTION ODCIAggregateTerminate(
SELF IN mylistagg_type,
returnValue OUT VARCHAR2,
flags IN NUMBER
) RETURN NUMBER IS
l_sorted_strings string_tabtype;
l_return_value VARCHAR2(2000);
BEGIN
SELECT COLUMN_VALUE
BULK COLLECT INTO l_sorted_strings
FROM TABLE(strings)
ORDER BY COLUMN_VALUE;
FOR i IN 1 .. l_sorted_strings.COUNT
LOOP
IF i > 1 THEN
l_return_value := l_return_value || ', ';
END IF;
l_return_value := l_return_value || l_sorted_strings(i);
END LOOP;
returnValue := l_return_value;
RETURN ODCIConst.Success;
END ODCIAggregateTerminate;
MEMBER FUNCTION ODCIAggregateMerge(
SELF IN OUT mylistagg_type,
ctx2 IN mylistagg_type
) RETURN NUMBER IS
BEGIN
FOR i IN 1 .. ctx2.strings.COUNT
LOOP
SELF.strings.EXTEND;
SELF.strings(SELF.strings.COUNT) := ctx2.strings(i);
END LOOP;
RETURN ODCIConst.Success;
END ODCIAggregateMerge;
END;
/
CREATE OR REPLACE FUNCTION mylistagg (
in_string IN VARCHAR2
) RETURN VARCHAR2 PARALLEL_ENABLE AGGREGATE USING mylistagg_type;
/
SELECT d.deptno, d.dname, mylistagg(e.ename) AS ename_list
FROM dept d
LEFT JOIN emp e ON e.deptno = d.deptno
GROUP BY d.deptno, d.dname
ORDER BY d.deptno, d.dname;
Oracle9i Release 2
Version 9i Release 2 came with SQL/XML support and the function XMLAGG which replaces SYS_XMLAGG and allows sorting the elements to be aggregated (see line 7).
SELECT d.deptno,
d.dname,
RTRIM(
XMLAGG(
XMLELEMENT(
"e", e.ename || ', '
) ORDER BY e.ename
).EXTRACT(
'/e/text()'
).getStringVal(), ', '
) AS ename_list
FROM dept d
LEFT JOIN emp e ON e.deptno = d.deptno
GROUP BY d.deptno, d.dname
ORDER BY d.deptno, d.dname;
One may argue that using RTRIM to get rid of the last comma is not the way to interact with XML, especially since SQL/XML supports XSLT. But probably no one can deny that writing the appropriate stylesheet is a bit more complex and time-consuming. Nonetheless, here’s an XSLT example:
SELECT d.deptno,
d.dname,
XMLTRANSFORM(
XMLAGG(
XMLELEMENT(
"e", e.ename
) ORDER BY e.ename
), '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:for-each select="*">
<xsl:if test="position() != 1">
<xsl:value-of select="'', ''"/>
</xsl:if>
<xsl:value-of select="."/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>'
).getStringVal() AS ename_list
FROM dept d
LEFT JOIN emp e ON e.deptno = d.deptno
GROUP BY d.deptno, d.dname
ORDER BY d.deptno, d.dname;
Oracle Database 11g Release 2
As mentioned at the beginning, in version 11g Release 2 Oracle finally introduced the aggregate function LISTAGG to conveniently aggregate strings.
SELECT d.deptno,
d.dname,
LISTAGG (
e.ename, ', '
) WITHIN GROUP (
ORDER BY e.ename
) AS ename_list
FROM dept d
LEFT JOIN emp e ON e.deptno = d.deptno
GROUP BY d.deptno, d.dname
ORDER BY d.deptno, d.dname;
Performance Comparison
To compare the runtime performance of the different solution approaches I created 1 million rows in the dept table and 5 million rows in the emp table using this script and measured the second serial execution of a CREATE TABLE AS SELECT statement for each of the 8 described approaches against my 11.2.0.3 instance. The following figure summarizes the results.
Conclusion
A lot of things have changed in the Oracle Database area since Oracle7, even in the niche of string aggregation. I recommend using LISTAGG (8) whenever possible and avoid the use of SYS_XMLAGG (4) or XSLT (7) for string aggregation. The Collection Type (2) approach is a good alternative if you do not mind creating helper objects otherwise use XMLAGG (6).
2 Comments
Thanks, this made my day, not only did it solve the request I had, but also I loved trying all the different methods of achieving the same result.
If you are looking to aggregate data more then 4000 characters then see below link.
ListAgg_CLOB Custom Function
http://sql-plsql-de.blogspot.com.au/2014/01/sql-listagg-mit-clob-ausgabe-kein.html