# Statistics

## Create Statistics

Statistics can be created for files, compensated files, and gates. To create a statistic, select the Statistic button from the side panel and choose one of the statistics described below. If a statistic is added to a compensated file, it is calculated after compensation is applied.

### Mean

Calculated by adding the values of all events for a parameter and dividing by the total number of events.

### Geometric Mean

Calculated by raising e to the power of the arithmetic mean of the natural logarithm of all events for a parameter.

Geometric Mean(α) = exp(arithmetic mean(ln(α)))

#### Warning about Geometric Means

Even though this statistic is used extensively in many disciplines, in our opinion it's not well suited for flow cytometry. The biggest reason is that it does not work if your dataset includes any event with a non-positive value (i.e. if any event is 0 or negative), because the logarithm of such a value is undefined.

Some flow cytometry software tries to sidestep this issue with tricks that cause some unfortunate effects. One such trick is to use binned data. This means that every event is put into a bucket numbered from 1 to some maximum (255, or more) based on whatever axis transform is currently set on the parameter of interest. The geometric mean is calculated on these binned values and then scaled back using the transform.

We have decided against implementing anything like this in Floreada. The reason is that if you calculate the geometric mean with a certain axis transform set for a parameter, then change the transform and recalculate, *you will get a different result with the same underlying data*. Not only that, but the transform and the number of bins are not specified in the result, so it's not reproducible.
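To make the transform dependence concrete, here is a minimal sketch of one hypothetical binning scheme. It is not the algorithm of any particular software package; the function name `binned_geometric_mean`, the axis range of 1 to 10000, and the choice of 256 bins are all assumptions made for illustration. The same three events are processed once with a linear axis and once with a log axis.

```python
import math

def binned_geometric_mean(values, forward, inverse, lo, hi, bins=256):
    """Hypothetical binned geometric mean; depends on the axis transform."""
    # Map the axis range [lo, hi] through the transform and split it into equal-width bins.
    t_lo, t_hi = forward(lo), forward(hi)
    width = (t_hi - t_lo) / bins
    bin_numbers = []
    for v in values:
        b = int((forward(v) - t_lo) / width) + 1
        bin_numbers.append(min(max(b, 1), bins))  # clamp to 1..bins
    # Geometric mean of the bin numbers, which are always positive.
    gm_bin = math.exp(sum(math.log(b) for b in bin_numbers) / len(bin_numbers))
    # Scale back to data units using the centre of the (fractional) bin.
    return inverse(t_lo + (gm_bin - 0.5) * width)

events = [10.0, 100.0, 1000.0]

# Same events, two different axis transforms (both are assumptions for this sketch).
linear_result = binned_geometric_mean(events, lambda v: v, lambda t: t, 1.0, 10000.0)
log_result = binned_geometric_mean(events, math.log10, lambda t: 10.0 ** t, 1.0, 10000.0)

print(linear_result, log_result)  # two different numbers from the same three events
```

Neither result needs to match the geometric mean computed directly from the raw values, and nothing in the output records which transform or bin count produced it.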

Our implementation therefore only works for datasets in which every value is positive. For any other dataset, it will produce an error.
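The formula and the positive-values restriction can be sketched in a few lines of Python. This is an illustration of the definition above, not Floreada's source code; the function name `geometric_mean` and the use of `ValueError` are assumptions.

```python
import math

def geometric_mean(values):
    """exp(arithmetic mean(ln(v))) over all events; every value must be > 0."""
    if not values:
        raise ValueError("geometric mean needs at least one event")
    if any(v <= 0 for v in values):
        # ln(v) is undefined for zero or negative v, so refuse rather than guess.
        raise ValueError("geometric mean is undefined for zero or negative values")
    return math.exp(sum(math.log(v) for v in values) / len(values))

print(geometric_mean([1.0, 10.0, 100.0]))  # approximately 10
```

Calling `geometric_mean([5.0, 0.0])` raises an error instead of returning a number, which mirrors the behavior described above.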

### Median

Equivalent to taking the 50th percentile.

### Std Dev

Standard Deviation - Calculated by first finding the mean of all the events for a parameter. Then, for each event, we find the square of its distance to the mean and sum these squares. Finally, we divide by the total number of events and take the square root.

### CV

Coefficient of Variation - the ratio between the standard deviation and mean.
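The Std Dev and CV descriptions above can be written out as a short sketch. Because the description divides by the total number of events, this is the population form of the standard deviation (not the sample form that divides by n - 1). The function names are hypothetical, and the CV is returned as a plain ratio, as described above.

```python
import math

def mean(values):
    # Sum of all event values divided by the number of events.
    return sum(values) / len(values)

def std_dev(values):
    # Population standard deviation: divide the summed squared distances by n.
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def cv(values):
    # Coefficient of variation: standard deviation divided by the mean.
    return std_dev(values) / mean(values)

events = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(mean(events), std_dev(events), cv(events))  # 5.0 2.0 0.4
```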

### Correlation

Pearson Correlation Coefficient for two parameters. Calculated with the following:

r = Σ(xᵢ - x̄)(yᵢ - ȳ) / sqrt(Σ(xᵢ - x̄)² * Σ(yᵢ - ȳ)²)

where:

- xᵢ = values of first parameter
- x̄ = mean value of first parameter
- yᵢ = values of second parameter
- ȳ = mean value of second parameter
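As a sketch, the formula above translates directly into Python. The function name `pearson_r` is an assumption, and the explicit error for a constant parameter (where the denominator is zero and r is undefined) is a choice made for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equally long lists of values."""
    if len(xs) != len(ys):
        raise ValueError("both parameters need the same number of events")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = math.sqrt(
        sum((x - x_bar) ** 2 for x in xs) * sum((y - y_bar) ** 2 for y in ys)
    )
    if denominator == 0:
        raise ValueError("correlation is undefined when a parameter is constant")
    return numerator / denominator

# A perfectly linear relationship gives r = 1.
print(pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))  # 1.0
```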

### Count

The number of events that are contained within the population.

### Concentration

The concentration of the population in cells/milliliter. To use this statistic, your FCS file must contain the $VOL keyword, which records the volume of sample that was run. If your file doesn't have this keyword, the "Concentration" option will be greyed out. Note that although the FCS3.1 specification mandates that this value be in nanoliters, some cytometers ignore this and use other units, most often microliters. If your FCS files are not compliant with the specification, concentration statistics will be incorrect.
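The unit conversion can be sketched as follows, assuming the concentration is the event count divided by the $VOL volume after converting nanoliters to milliliters (1 mL = 1,000,000 nL). The function name is hypothetical. The second call shows what happens when a cytometer writes microliters into $VOL: the value is still read as nanoliters, so the result is 1000 times too large.

```python
NANOLITERS_PER_MILLILITER = 1_000_000

def concentration_cells_per_ml(event_count, vol_nanoliters):
    """Events per milliliter, treating the $VOL value as nanoliters."""
    if vol_nanoliters <= 0:
        raise ValueError("$VOL must be a positive volume")
    return event_count * NANOLITERS_PER_MILLILITER / vol_nanoliters

# Compliant file: 50,000 events in 100,000 nL (0.1 mL).
print(concentration_cells_per_ml(50_000, 100_000.0))  # 500000.0

# Non-compliant file: the cytometer wrote 100 (meaning 100 uL), read here as 100 nL.
print(concentration_cells_per_ml(50_000, 100.0))  # 500000000.0
```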

### Min

The minimum value within the population.

### Max

The maximum value within the population.

### Percent

Calculated by taking the number of events in a population divided by the number of events in the file.

### Parent Percent

Calculated by taking the number of events in a population divided by the number of events in its parent. If this stat is applied to a population without a parent (a file for instance), the percent will be 100%.
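The difference between Percent and Parent Percent is easiest to see with numbers. In this sketch, the function names and the example counts are made up for illustration; a population with no parent is represented by `parent_count=None`.

```python
def percent_of_file(population_count, file_count):
    # Events in the population relative to all events in the file.
    return 100.0 * population_count / file_count

def percent_of_parent(population_count, parent_count=None):
    # A population without a parent (a file, for instance) is 100% of itself.
    if parent_count is None:
        return 100.0
    return 100.0 * population_count / parent_count

# Hypothetical tree: file (10,000 events) > outer gate (4,000) > inner gate (1,000).
print(percent_of_file(1_000, 10_000))   # 10.0
print(percent_of_parent(1_000, 4_000))  # 25.0
print(percent_of_parent(10_000))        # 100.0
```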

### Percentile

The value below which a certain percentage of events (cells) in a dataset would be found.
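There are several conventions for computing a percentile, and this page does not say which one Floreada uses. The sketch below uses the simple nearest-rank method purely as an illustration; other methods interpolate between neighboring values and can give slightly different answers. The function name is hypothetical.

```python
import math

def percentile_nearest_rank(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of events at or below it."""
    if not values:
        raise ValueError("percentile needs at least one event")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))  # 1-based rank
    return ordered[rank - 1]

events = [50.0, 15.0, 40.0, 20.0, 35.0]
print(percentile_nearest_rank(events, 50))  # 35.0 (the median of these five events)
print(percentile_nearest_rank(events, 90))  # 50.0
```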

### Keyword

Every FCS file includes a set of keywords. These are key:value pairs that give information about the name of the file, the parameters, the cytometer, and other details about the data. Using the Keyword statistic, you can show the value of a particular keyword.

### Label

Allows you to attach a label with a custom value of your choosing to a population.

## Delete Statistic

Statistics can be deleted by right-clicking them in the file tree and selecting "Delete".

## Copy Statistic

Statistics can be copied by dragging and dropping them onto other files, compensated files, or gates.

## Renaming Statistics

When you create a new stat, you can optionally enter a custom display name. You can also rename the stat at any time by right-clicking its entry in the file tree and selecting "Rename". A custom display name is useful when the fully qualified name of the stat is very long because of a deeply nested gating tree, which can be difficult to work with when exported as a CSV file (see the next section).

## Export Statistics

Statistics can be exported to a CSV file, which can be opened in any spreadsheet application (such as Excel) for further analysis. To export stats, click "File" then "Save CSV Statistic File". When exported, all statistics from all open files in the current project will be tabulated.

Statistics will be shown in the CSV file using a fully qualified path (i.e. [File]/[Gate]/[Gate]/[Stat]). If you have a deeply nested gating hierarchy, this can be difficult to work with. If you have assigned a custom name to a statistic, both the custom name and the full name will be exported, which can make further analysis a little easier.