[[statistic()]]

`mixed` **statistic**(string *statistic*, array|string *variables*, mixed *option*, [boolean *alldata*])

The function `statistic()` calculates univariate statistics from the data set (across all questionnaires collected so far).

*statistic*

Which statistic should be calculated?

- `'count'` – counts the frequency of the value specified as *option*.
- `'percent'` – percentage of the value specified as *option*.
- `'crosscount'` – counts the frequency of the joint occurrence of two values in two variables. The two variables should be specified as an array (or separated with a comma), as should their values, which are specified as *option*.
- `'mode'` – most commonly occurring value.
- `'min'` – lowest value.
- `'max'` – highest value.
- `'mean'` – arithmetic mean of the values.
- `'groupmean'` – arithmetic mean of the values of a subgroup defined by *option*, specified as a string consisting of the variable name and the code for the cases to be counted, e.g. `'AB01=2'`.
- `'filter'` – determines which cases are used for subsequent calls to the `statistic()` function (for details, see below).

*variables*

Determines which variable(s) the statistic is calculated for. The IDs of the individual variables can be found in the **Variables Overview**. If the statistic requires multiple variables, these can be given as a comma-separated string or as an array.

*option*

Some statistics require or allow a third argument, which is set with this parameter (see below).

*alldata*

Optional. If `true`, all questionnaires are included in the statistics, not just those that have been completed.

**Note:** If `true` is not explicitly specified for the parameter *alldata*, only completed questionnaires are included when calculating the statistical values.

**Note:** Test data collected while developing and pretesting the questionnaire are only included if the current questionnaire is itself part of a test. If the questionnaire is being completed as part of the regular data collection, `statistic()` only counts data from the regular data collection.

**Note:** The data from the current interview are not considered by `statistic()`.

**Tip:** The function `statistic()` can be used to close the questionnaire after reaching a predefined quota (see Quota) and either display a message to further respondents or redirect them to the quota stop link of a panel provider.

**Tip:** If you do not want to count all completed interviews (e.g. if dropouts were redirected to another page using `redirect()`), it makes sense to copy the variable to be counted to an internal variable (see Internal Variables) later in the questionnaire.

When counting the frequency (`'count'`), a third argument can be specified: the value whose frequency should be determined. If no third argument is given, the number of valid responses is returned. Missing data are not counted.

For example, the questionnaire contains a question where respondents select their gender (1=female, 2=male, -9=no input). The number of women is determined by passing the value `1` as the third argument:

    $numberwomen = statistic('count', 'SD01', 1);  // frequency of women (1)
    $numbermen = statistic('count', 'SD01', 2);  // frequency of men (2)
    $numbercompleted = statistic('count', 'SD01');  // number of valid data
    $numberall = statistic('count', 'SD01', false, true);  // all data records
    html('
      <p>So far, '.$numberall.' people specified their gender in this survey,
      but the questionnaire was only completed in '.$numbercompleted.' cases.</p>
      <p>The questionnaires completed are made up of '.
      $numberwomen.' women and '.
      $numbermen.' men.</p>
    ');
    question('SD01');  // question about the respondent's gender

The `'crosscount'` statistic counts the cases in which a given combination of values occurs across multiple variables (as in a cross-tabulation).

Instead of a single variable, two or more variables are specified as an array or separated with a comma (`,`). The values to be counted for each variable are specified as the third parameter *option*. Only cases that have the first value for the first variable, the second value for the second variable, and so on are counted.

    $nYoungFemale = statistic('crosscount', 'SD01,SD02', '2,1');  // variables and values in a list with commas ...
    $nGrownFemale = statistic('crosscount', array('SD01','SD02'), array(2,2));  // ... or in arrays
    html('
      <p>So far, '.$nYoungFemale.' people have stated in this survey
      that they are female and in age group 1 (up to 18 years old).
      '.$nGrownFemale.' women stated they were older than 19 years old.</p>
    ');
    question('SD01');  // question about the respondent's gender
    question('SD02');  // question about the respondent's age

The `'percent'` statistic returns the percentage of a value within all valid data. The value to be counted must be given as the third argument.

    $numberwomen = statistic('percent', 'SD01', 1);  // percentage of women
    html('
      <p>The percentage of women in this survey so far is '.
      $numberwomen.'.</p>
    ');
    question('SD01');  // question about the respondent's gender

The `'mode'` statistic returns the value that has been selected most frequently so far. If multiple values have been selected equally often, they are returned separated by commas.

The third argument (a Boolean in this case) specifies whether invalid values (no answer etc.) should also be counted.

    $mode = statistic('mode', 'AB01_02', true);
    $modes = explode(',', $mode);  // separate multiple values
    if (count($modes) > 1) {
      // multiple values stated most frequently
      html('
        <p>Multiple answers were selected equally often.</p>
      ');
    } else {
      // answer options text (statistic() only provides the numeric code)
      $text = getValueText('AB01_02', $mode);
      html('
        <p>The most common answer for this question was: '.$text.'.</p>
      ');
    }

The statistics `'min'`, `'mean'` and `'max'` only calculate a correct value if numerical values exist for the question. Data from a text input are ignored if they are not numbers, unless the third parameter is set to `true` to specify that invalid values should also be included in the statistics.

If no valid values are available, `'mean'` returns 0, while `'min'` and `'max'` return `false`.

    $min = statistic('min', 'BB01_03');
    $max = statistic('max', 'BB01_03');
    $mean = statistic('mean', 'BB01_03');
    html('
      <p>The participants have given the programme an average rating of '.$mean.' so far.</p>
      <p>The ratings lie between '.$min.' and '.$max.'.</p>
    ');
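The `'groupmean'` statistic is presumably called in the same way as `'mean'`, with the subgroup passed as the third argument. The following is a minimal sketch, assuming a rating item `BB01_03` and a gender variable `SD01` coded 1=female and 2=male (both illustrative), and assuming the `'variable=code'` option format described in the list of statistics:

```php
// Mean rating per subgroup; variable IDs and codes are illustrative assumptions
$meanWomen = statistic('groupmean', 'BB01_03', 'SD01=1');
$meanMen = statistic('groupmean', 'BB01_03', 'SD01=2');
html('
  <p>Average rating so far: '.$meanWomen.' (women), '.$meanMen.' (men).</p>
');
```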

`statistic('filter', …)` sets a filter that is applied to all subsequent calls of `statistic()`. Optionally, the second parameter *variables* can list the variables needed in those subsequent calls, which speeds them up.

The number of cases matching the filter is returned. The fourth parameter *alldata* only affects this return value, not the subsequent calculations.

    // Statistics on female respondents only (SD02 = 1)
    // RT variables are loaded immediately to reduce latency
    $n = statistic('filter', array('RT02_01', 'RT02_02', 'RT02_03'), 'SD02==1');
    // Mean value of ratings (women only)
    $mean1 = statistic('mean', 'RT02_01');
    $mean2 = statistic('mean', 'RT02_02');
    $mean3 = statistic('mean', 'RT02_03');

The filter allows common comparison operators (`>`, `>=`, `<`, `<=`, `!=`, `==`), parentheses and Boolean operators (`AND`, `&&`, `OR`, `||`, `NOT`, `!`).

**Note:** Comparisons are only possible between one variable and a constant value (a number or string), e.g. `SD02==2`; comparisons between two variables (`SD03>SD04`) are not supported.

    // Statistics only on female respondents (SD02 = 1) aged 35 and over (SD03 >= 35)
    $n = statistic('filter', false, '(SD02==1) AND (SD03 >= 35)');

Besides the variable names, you can also use `QUESTNNR`, `CASE` and `LANGUAGE` in the filter.

    // Statistics only on female respondents (SD02 = 1) aged 35 and over (SD03 >= 35)
    // in the German language version
    $n = statistic('filter', false, '(SD02==1) AND (SD03 >= 35) AND (LANGUAGE == "ger")');
