Friday, February 22, 2008

Exploring the Aggregate - avg function

avg


result = avg(set)



Let's say you have a text file named avg.txt which contains:
1
2
3
4
5

Add that file as a text component, selecting 'use simple processing for standard CSV'.
Add an avg function.
Connect Field1 to the nodes/rows input of the avg function.
Then create an output text component, also with 'simple processing', and connect the avg result to it.





Now if you click the Output tab you should get 3, the average of 1 through 5.
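
As a quick sanity check outside the mapping tool, here is a minimal Python sketch of the same calculation. The `file_contents` string stands in for avg.txt and the variable names are made up for illustration; this is not what the tool does internally, just the arithmetic.

```python
from statistics import mean

# Stand-in for the contents of avg.txt: one number per line.
file_contents = "1\n2\n3\n4\n5\n"

# Treat each non-empty line as the Field1 value of one row.
field1_values = [int(line) for line in file_contents.splitlines() if line.strip()]

# avg over the whole set of rows.
result = mean(field1_values)
print(result)
```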

Now, I don't know why you don't connect Rows to the in-context input. It sort of seems like you should, but if you do, it gives you 5 (the last item in your list of numbers). Maybe someone out there can explain this to me?

Let's try some decimals:
1.1234
2.3
4.6
5.678
7.0000008

This gets us 4.14028016.
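
To double-check that figure by hand, here is a small Python sketch using exact decimal arithmetic. The choice of `Decimal` is mine, made so binary floating-point noise does not muddy the comparison; it says nothing about how the avg function is implemented.

```python
from decimal import Decimal

# The five values from the second test file, kept as exact decimals.
values = [Decimal(s) for s in ("1.1234", "2.3", "4.6", "5.678", "7.0000008")]

total = sum(values)           # exact sum of the five values
average = total / len(values)
print(total, average)
```

The sum is 20.7014008, and dividing by 5 gives 4.14028016, which matches the output above.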

Well, I didn't figure out where the rounding happens, but I think it's around the 26th decimal place. I'll leave that to someone else to pursue.

1 comment:

Travis Waldorf said...

With regard to why connecting Rows to the parent-context of avg results in 5 (the last item), I think we can deduce that it is still performing the average function, but only on the last item. I think that connection is saying "for each row" (instead of "for all rows") perform the average function. The average block can only produce one result, so it settles on the result for the last item. This is confirmed by connecting the entire file group to the parent-context in the example. In this scenario the average is correct, as it is essentially saying: for each of its children (for each group of rows; in our case we only have one group, all rows) calculate the average of the Field1 values.

So it seems this is useful for escalating or de-escalating the assumed group being averaged. Testing with multiple hierarchy layers would probably reveal what the aggregate average assumes with no connection (the absolute parent or the relative parent).
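
The per-row versus per-group reading described in this comment can be imitated in a few lines of Python. This is only a sketch of that interpretation, with made-up variable names, and it is an assumption that the tool behaves this way internally.

```python
from statistics import mean

rows = [1, 2, 3, 4, 5]

# Context = the whole group of rows: one average over every value.
group_result = mean(rows)

# Context = each individual row: the average is recomputed once per row,
# each time over a single value. With only one output slot, the value
# that survives is the one computed for the last row.
per_row_results = [mean([value]) for value in rows]
last_result = per_row_results[-1]

print(group_result, last_result)
```

Under that reading the group context gives 3 and the row context gives 5, matching what the post observed.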