\name{estimateVarianceFunctions}
\Rdversion{1.1}
\alias{estimateVarianceFunctions}
\title{
    Estimate the variance functions for a CountDataSet.
}
\description{
    This function calls, for each condition that has replicates, the 
    lower-level function \code{\link{estimateVarianceFunctionForMatrix}} 
    to estimate the raw variance function for this condition.     
}
\usage{
estimateVarianceFunctions(cds, method = c( "normal", "blind", "pooled" ),
   pool = NULL, locfit_extra_args = list(), lp_extra_args = list(),
   modelFrame = NULL )
}
\arguments{
  \item{cds}{
      a CountDataSet with size factors
}
  \item{method}{
     The variance functions can be estimated in one of three ways:
     
     \describe{
     \item{normal}{For each condition with replicates, estimate a variance function
        from the samples of that condition. Then, construct a variance function
        "_max" that takes the maximum over all these variance functions, and
        assign it to all samples of unreplicated conditions.}
     
     \item{blind}{Ignore the sample labels and pretend that all samples are
        replicates of a single condition. This makes it possible to obtain a
        variance estimate even without any biological replicates. However, this can
        lead to a drastic loss of power; see the vignette for details. The single
        estimated variance function is then called "_blind" and assigned to all
        samples.}
     
     \item{pooled}{Use the samples from all conditions with replicates to estimate
        a single pooled variance function, called "_pool", and assign it to all
        samples.} }
}  
  \item{pool}{This argument is deprecated; do not use it. It is (for now)
        retained for compatibility with code written for DESeq versions 1.1.6 or
        earlier. Setting 'pool=FALSE' is the same as 'method="normal"' and setting
        'pool=TRUE' is the same as 'method="blind"' (not 'method="pooled"'). Note that
        the deprecation also resolves the issue that calling the 'blind' estimation
        'pooled' was an incorrect use of terminology.}
      
  \item{locfit_extra_args, lp_extra_args}{
      Options to be passed to the \code{locfit} and \code{lp} functions of the
      locfit package. Use these to adjust the local fitting. For example, if the fit
      seems too smooth or too rough, you may pass a value for \code{nn} different
      from the default (0.7) by setting \code{lp_extra_args=list(nn=0.9)}, or you can
      set \code{locfit_extra_args=list(maxk=200)} if you get the error that locfit ran
      out of nodes. See the documentation of the locfit package for details. Usually,
      you will not need to adjust the fitting parameters, as the defaults tend to
      work well. }
      
   \item{modelFrame}{
      By default, the information in conditions(cds) or pData(cds) is used to
      determine which samples are replicates (see \code{\link{newCountDataSet}}).
      Alternatively (and only for method "pooled"), a data frame can be passed
      here, and all samples whose rows in this data frame are identical are
      considered replicates.
   }
      
}
\details{
   Behaviour for method="normal" (equivalent to the deprecated pool=FALSE): The
   estimated raw variance functions are placed in the environment rawVarFuncs,
   which is a slot in CountDataSet, using the condition labels as names. A further
   function, named "_max", is placed there as well; it always returns the maximum
   of all the other functions.
   
   Then, the \code{\link{rawVarFuncTable}} (q.v.) is filled to assign to each replicated
   condition the raw variance function estimated for it, and to each condition without
   replicates, the "_max" function. 
   
   Behaviour for the deprecated pool=TRUE: A single raw variance function is estimated from all the
   count data, ignoring the condition labels. It is stored in the rawVarFuncs slot
   under the name "_pooled". In the rawVarFuncTable, "_pooled" is assigned to all
   conditions.
   
   In either case, all the variance adjustment factors (see 
   \code{\link{varAdjFactors}}) are set to 1.
   
   It is advisable to always call \code{\link{residualsEcdfPlot}} afterwards to verify
   the fit.
}
\value{
   The CountDataSet cds, with the slots rawVarFuncs and rawVarFuncTable updated.
}
\seealso{
   \code{\link{scvPlot}} to visualize the result and \code{\link{varianceFitDiagnostics}} and 
   \code{\link{residualsEcdfPlot}} to check the fit.
}
\author{
   Simon Anders, sanders@fs.tum.de
}
\examples{
cds <- makeExampleCountDataSet()
cds <- estimateSizeFactors( cds )
cds <- estimateVarianceFunctions( cds )
vf <- rawVarFunc( cds, "A" )
vf( head( counts(cds)[,1] / sizeFactors(cds)[1] ) )
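
# The calls below are a hedged sketch of the other options described under
# 'Arguments', reusing the example data from above. They assume that the
# accessor rawVarFuncTable() can be called on a CountDataSet to show which
# variance function is assigned to each condition.

# "blind": ignore the condition labels and treat all samples as replicates
cdsBlind <- estimateVarianceFunctions( cds, method="blind" )
rawVarFuncTable( cdsBlind )

# "pooled": one variance function from all conditions that have replicates
cdsPooled <- estimateVarianceFunctions( cds, method="pooled" )
rawVarFuncTable( cdsPooled )

# Adjust the local fit, here with a larger nearest-neighbour fraction 'nn'
cdsSmooth <- estimateVarianceFunctions( cds, lp_extra_args=list( nn=0.9 ) )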
}