sql - How to apply an aggregate function only on contiguous rows?


On PostgreSQL 9.4, I'm trying to achieve what I'll call an "aggregate function" on contiguous rows. Example:

Input data:

recipe  prod1  prod2  timestamp
0       5      4      2015-07-02 08:10:34.357
0       2      7      2015-07-02 08:13:45.352
0       7      0      2015-07-02 08:16:22.098
1       3      2      2015-07-02 08:22:14.678
1       9      4      2015-07-02 08:22:56.123
2       2      6      2015-07-02 08:26:37.564
2       1      7      2015-07-02 08:27:33.109
2       0      8      2015-07-02 08:31:11.687
0       3      5      2015-07-02 08:40:01.345
1       4      2      2015-07-02 08:42:23.210

Desired output:

recipe  prod1_sum  prod2_avg  timestamp_first          timestamp_last
0       14         3.6666     2015-07-02 08:10:34.357  2015-07-02 08:16:22.098
1       12         3          2015-07-02 08:22:14.678  2015-07-02 08:22:56.123
2       3          7          2015-07-02 08:26:37.564  2015-07-02 08:31:11.687
0       3          5          2015-07-02 08:40:01.345  2015-07-02 08:40:01.345
1       4          2          2015-07-02 08:42:23.210  2015-07-02 08:42:23.210

Basically, I want one output line for each "group" of contiguous rows (when the table is sorted on the timestamp column) with the same "recipe" value. In the output, prod1_sum is the sum of prod1 in the "group", prod2_avg is the average of prod2 in the same "group", and the last two columns are respectively the first and last timestamps in the group. There may be several distinct groups with the same "recipe" value, and I want an output row for each of them.

At the moment, I have an ugly way of obtaining this, based on several requests and a lot of data processing outside of the DB. I want to avoid that, and it is not worth showing.

My problem is the "grouping" of the rows. I know how to create an aggregate function that does what I want, if I can apply it to each group individually. I have looked at window functions, but it seems they would group all values by recipe, which does not conform to the "contiguous rows" principle I need to respect.

You can use the following query:

select recipe,
       sum(prod1) as prod1_sum,
       avg(prod2) as prod2_avg,
       min(timestamp) as timestamp_first,
       max(timestamp) as timestamp_last
from (
    select recipe, prod1, prod2, timestamp,
           -- overall position minus position within the same recipe
           row_number() over (order by timestamp)
           -
           row_number() over (partition by recipe
                              order by timestamp) as grp
    from mytable
) t
group by recipe, grp
order by timestamp_first

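The trick here is the usage of the row_number window function to identify islands of contiguous recipe values: the calculated field grp does exactly this. Within a run of consecutive rows with the same recipe, both row numbers increase by one per row, so their difference stays constant; as soon as another recipe intervenes, the overall row number moves ahead and the difference changes.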
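To make the grp values concrete, here is a minimal sketch in plain Python that traces the same row-number difference by hand on the sample data from the question. It only illustrates the idea and is not how the database executes the query; the names `rows`, `island_keys` and `aggregate` are made up for this example.

```python
from collections import defaultdict

# (recipe, prod1, prod2, timestamp): the sample rows from the question
rows = [
    (0, 5, 4, "2015-07-02 08:10:34.357"),
    (0, 2, 7, "2015-07-02 08:13:45.352"),
    (0, 7, 0, "2015-07-02 08:16:22.098"),
    (1, 3, 2, "2015-07-02 08:22:14.678"),
    (1, 9, 4, "2015-07-02 08:22:56.123"),
    (2, 2, 6, "2015-07-02 08:26:37.564"),
    (2, 1, 7, "2015-07-02 08:27:33.109"),
    (2, 0, 8, "2015-07-02 08:31:11.687"),
    (0, 3, 5, "2015-07-02 08:40:01.345"),
    (1, 4, 2, "2015-07-02 08:42:23.210"),
]


def island_keys(sorted_rows):
    """Return (recipe, grp) per row; rows must already be sorted by timestamp."""
    seen = defaultdict(int)  # rows seen so far per recipe
    keys = []
    for overall, row in enumerate(sorted_rows, start=1):
        recipe = row[0]
        seen[recipe] += 1  # row number within this recipe
        keys.append((recipe, overall - seen[recipe]))  # same role as grp
    return keys


def aggregate(data):
    """Sum prod1, average prod2, first/last timestamp per contiguous island."""
    # Timestamps share one fixed-width format, so string order is time order.
    ordered = sorted(data, key=lambda r: r[3])
    islands = defaultdict(list)
    for key, row in zip(island_keys(ordered), ordered):
        islands[key].append(row)
    result = []
    for (recipe, _grp), members in islands.items():
        prod1_sum = sum(m[1] for m in members)
        prod2_avg = sum(m[2] for m in members) / len(members)
        stamps = [m[3] for m in members]
        result.append((recipe, prod1_sum, prod2_avg, min(stamps), max(stamps)))
    return sorted(result, key=lambda r: r[3])  # order by timestamp_first


for line in aggregate(rows):
    print(line)
```

On this data the difference comes out as 0, 0, 0, 3, 3, 5, 5, 5, 5, 7. Note that the three rows of recipe 2 and the later single row of recipe 0 both get 5, which is why the query groups by recipe and grp together rather than by grp alone.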


