| Summarize {NCStats} | R Documentation |
Computes summary statistics for a single numeric or factor variable, possibly separated by the levels of a factor variable. Very similar to summary for numeric variables and table for factor variables.
Summarize(object,...)
## Default S3 method:
Summarize(object,numdigs=NULL,addtotal=TRUE,percent=TRUE,percdigs=2,
na.rm=TRUE,exclude=NULL,...)
## S3 method for class 'formula':
Summarize(object,numdigs=NULL,addtotal=TRUE,percent=TRUE,percdigs=2,
na.rm=TRUE,exclude=NULL,...)
| object | A numeric vector, a factor, or a formula of the form y~x. |
| numdigs | A numeric indicating the number of decimals to round the numeric summaries to. If left at NULL (default) then the number of digits will be obtained from getOption('digits'). |
| addtotal | A logical indicating whether totals should be added to tables (=TRUE, default) or not. |
| percent | A logical indicating whether frequency tables should include percentages (=TRUE, default) or not. |
| percdigs | A numeric indicating the number of decimals to round the percentage summaries to. |
| na.rm | A logical indicating whether numeric missing values (NA) should be removed (=TRUE, default) or not. |
| exclude | A string containing the code that should be excluded from the levels of the factor variable. |
| ... | Other arguments to the generic summary, sd, or table functions. |
For numeric data this is the same as summary except that Summarize includes the sample size, valid sample size (sample size minus number of NAs) and standard deviation (i.e., sd). Also the output is ordered slightly differently.
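The extra values described above can be approximated with base R alone. The following is a minimal sketch, not the NCStats implementation: the helper name numeric_summary, the output labels, and their order are assumptions made for illustration, and missing values are always dropped here (as with na.rm=TRUE).

```r
# Hypothetical helper (not part of NCStats): base R approximation of the
# numeric summary described above. Labels and ordering are assumptions.
numeric_summary <- function(v, numdigs = getOption("digits")) {
  n <- length(v)                    # sample size, NAs included
  valid_n <- sum(!is.na(v))         # sample size minus number of NAs
  five <- quantile(v, probs = c(0, 0.25, 0.5, 0.75, 1),
                   na.rm = TRUE, names = FALSE)
  res <- c(n = n, valid.n = valid_n,
           mean = mean(v, na.rm = TRUE), sd = sd(v, na.rm = TRUE),
           min = five[1], Q1 = five[2], median = five[3],
           Q3 = five[4], max = five[5])
  round(res, numdigs)               # round to numdigs decimals
}

v <- c(2, 4, 4, 5, NA, 9)
numeric_summary(v, numdigs = 3)
```

The result is a named numeric vector, so individual values can be extracted by name, e.g. numeric_summary(v)[["sd"]].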
For a factor variable this function computes a frequency table and, if percent=TRUE, a percentage table and a valid percentage table (percentages computed with the "NA"s excluded). The tables will contain a total row if addtotal=TRUE.
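The three factor tables described above can be sketched with base R. This is an illustration under stated assumptions, not the NCStats code: the helper name factor_summary and the column labels are hypothetical, and missing values are taken to be true NA entries rather than a level coded as the string "NA".

```r
# Hypothetical helper (not part of NCStats): frequency, percentage, and
# valid percentage columns for a factor, using only base R.
factor_summary <- function(f, percent = TRUE, addtotal = TRUE, percdigs = 2) {
  freq <- table(f, useNA = "ifany")     # counts, with an NA entry if needed
  counts <- as.vector(freq)
  is_na <- is.na(names(freq))           # marks the entry for missing values
  res <- cbind(Freq = counts)
  if (percent) {
    valid <- 100 * counts / sum(counts[!is_na])  # percentages without NAs
    valid[is_na] <- NA
    res <- cbind(res, Perc = 100 * counts / sum(counts), Valid.Perc = valid)
  }
  rownames(res) <- ifelse(is_na, "NA", names(freq))
  if (addtotal) res <- rbind(res, Total = colSums(res, na.rm = TRUE))
  if (percent) res[, -1] <- round(res[, -1], percdigs)
  res
}

f <- factor(c("A", "A", "B", NA))
factor_summary(f)
```

Totals are computed before rounding so that the percentage columns sum to 100 in the total row.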
The object argument can be a formula of the form y~x, where y can be either a numeric or a factor variable and x must be a factor variable. More complicated formulas are not supported. When y is numeric, the summary statistics of y are computed for each level of x. When y is a factor, a two-way table is computed. If addtotal=TRUE then only row totals are added. If percent=TRUE then a table of row percentages is computed, such that the percentages represent the percent in the levels of x for each level of y.
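Both formula cases can be illustrated with base R. The sketch below does not call Summarize and is not its implementation; the variable names (g, v, h) and the choice of statistics are assumptions for illustration only.

```r
# Base R sketch of the two y~x cases described above (not the NCStats code).
g <- factor(c("a", "a", "b", "b", "b"))   # plays the role of x
v <- c(1, 3, 10, 20, 30)                  # a numeric y
h <- factor(c("u", "w", "u", "u", "w"))   # a factor y

# Numeric y: summary statistics of y for each level of x
stats <- function(vals) c(n = length(vals), mean = mean(vals), sd = sd(vals))
num_by_x <- t(sapply(split(v, g), stats)) # one row per level of x

# Factor y: two-way table with levels of y in rows and levels of x in columns
two_way <- table(h, g)
with_totals <- cbind(two_way, Total = rowSums(two_way))      # row totals only
row_perc <- round(100 * prop.table(two_way, margin = 1), 2)  # rows sum to 100

num_by_x
with_totals
row_perc
```

Using margin = 1 in prop.table divides each count by its row total, which matches the row-percentage reading given above.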
Returns a named vector of summary statistics for numeric data, or a matrix of frequencies and, possibly, percentages for factor variables.
Derek H. Ogle, dogle@northland.edu
See also summary, table, and tapply. Also look at summaryBy in doBy, describe and describe.by in psych, describe in prettyR, and basicStats in fBasics.
## Numeric vector
y <- runif(100)
summary(y) # typical output
Summarize(y) # this function
Summarize(y,numdigs=3) # this function, controlling the number of digits
## Factor vector
z <- factor(sample(c("A","B","C"),90,replace=TRUE))
Summarize(z)
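## Factor vector with NAs (coded as the string "NA")
x <- factor(c(as.character(z),rep("NA",10))) # convert to character first so the original levels, not integer codes, are kept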
Summarize(x)
## Factor vector with NAs excluded
Summarize(x,exclude="NA")
## Numeric vector by levels of a factor variable
Summarize(y~x,numdigs=3)
## Numeric vector by factor levels with NAs excluded
Summarize(y~x,numdigs=3,exclude="NA")
## Summarizing all variables in a data frame
df <- data.frame(y,x)
lapply(df,Summarize,numdigs=4) # a data frame is already a list of its columns