A generic continuous random variable class meant for subclassing.
`rv_continuous` is a base class to construct specific distribution classes and instances for continuous random variables. It cannot be used directly as a distribution.
Parameters ---------- momtype : int, optional The type of generic moment calculation to use: 0 for pdf, 1 (default) for ppf. a : float, optional Lower bound of the support of the distribution, default is minus infinity. b : float, optional Upper bound of the support of the distribution, default is plus infinity. xtol : float, optional The tolerance for fixed point calculation for generic ppf. badvalue : float, optional The value in a result arrays that indicates a value that for which some argument restriction is violated, default is np.nan. name : str, optional The name of the instance. This string is used to construct the default example for distributions. longname : str, optional This string is used as part of the first line of the docstring returned when a subclass has no docstring of its own. Note: `longname` exists for backwards compatibility, do not use for new subclasses. shapes : str, optional The shape of the distribution. For example ``'m, n'`` for a distribution that takes two integers as the two shape arguments for all its methods. If not provided, shape parameters will be inferred from the signature of the private methods, ``_pdf`` and ``_cdf`` of the instance. extradoc : str, optional, deprecated This string is used as the last part of the docstring returned when a subclass has no docstring of its own. Note: `extradoc` exists for backwards compatibility, do not use for new subclasses. seed : None, int, `~np.random.RandomState`, `~np.random.Generator`
, optional This parameter defines the object to use for drawing random variates. If `seed` is `None` the `~np.random.RandomState` singleton is used. If `seed` is an int, a new ``RandomState`` instance is used, seeded with seed. If `seed` is already a ``RandomState`` or ``Generator`` instance, then that object is used. Default is None.
Methods ------- rvs pdf logpdf cdf logcdf sf logsf ppf isf moment stats entropy expect median mean std var interval __call__ fit fit_loc_scale nnlf support
Notes ----- Public methods of an instance of a distribution class (e.g., ``pdf``, ``cdf``) check their arguments and pass valid arguments to private, computational methods (``_pdf``, ``_cdf``). For ``pdf(x)``, ``x`` is valid if it is within the support of the distribution. Whether a shape parameter is valid is decided by an ``_argcheck`` method (which defaults to checking that its arguments are strictly positive.)
**Subclassing**
New random variables can be defined by subclassing the `rv_continuous` class and re-defining at least the ``_pdf`` or the ``_cdf`` method (normalized to location 0 and scale 1).
If positive argument checking is not correct for your RV then you will also need to re-define the ``_argcheck`` method.
For most of the scipy.stats distributions, the support interval doesn't depend on the shape parameters. ``x`` being in the support interval is equivalent to ``self.a <= x <= self.b``. If either of the endpoints of the support do depend on the shape parameters, then i) the distribution must implement the ``_get_support`` method; and ii) those dependent endpoints must be omitted from the distribution's call to the ``rv_continuous`` initializer.
Correct, but potentially slow defaults exist for the remaining methods but for speed and/or accuracy you can over-ride::
_logpdf, _cdf, _logcdf, _ppf, _rvs, _isf, _sf, _logsf
The default method ``_rvs`` relies on the inverse of the cdf, ``_ppf``, applied to a uniform random variate. In order to generate random variates efficiently, either the default ``_ppf`` needs to be overwritten (e.g. if the inverse cdf can expressed in an explicit form) or a sampling method needs to be implemented in a custom ``_rvs`` method.
If possible, you should override ``_isf``, ``_sf`` or ``_logsf``. The main reason would be to improve numerical accuracy: for example, the survival function ``_sf`` is computed as ``1 - _cdf`` which can result in loss of precision if ``_cdf(x)`` is close to one.
**Methods that can be overwritten by subclasses** ::
_rvs _pdf _cdf _sf _ppf _isf _stats _munp _entropy _argcheck _get_support
There are additional (internal and private) generic methods that can be useful for cross-checking and for debugging, but might work in all cases when directly called.
A note on ``shapes``: subclasses need not specify them explicitly. In this case, `shapes` will be automatically deduced from the signatures of the overridden methods (`pdf`, `cdf` etc). If, for some reason, you prefer to avoid relying on introspection, you can specify ``shapes`` explicitly as an argument to the instance constructor.
**Frozen Distributions**
Normally, you must provide shape parameters (and, optionally, location and scale parameters to each call of a method of a distribution.
Alternatively, the object may be called (as a function) to fix the shape, location, and scale parameters returning a 'frozen' continuous RV object:
rv = generic(<shape(s)>, loc=0, scale=1) `rv_frozen` object with the same methods but holding the given shape, location, and scale fixed
**Statistics**
Statistics are computed using numerical integration by default. For speed you can redefine this using ``_stats``:
- take shape parameters and return mu, mu2, g1, g2
- If you can't compute one of these, return it as None
- Can also be defined with a keyword argument ``moments``, which is a string composed of 'm', 'v', 's', and/or 'k'. Only the components appearing in string should be computed and returned in the order 'm', 'v', 's', or 'k' with missing values returned as None.
Alternatively, you can override ``_munp``, which takes ``n`` and shape parameters and returns the n-th non-central moment of the distribution.
Examples -------- To create a new Gaussian distribution, we would do the following:
>>> from scipy.stats import rv_continuous >>> class gaussian_gen(rv_continuous): ... 'Gaussian distribution' ... def _pdf(self, x): ... return np.exp(-x**2 / 2.) / np.sqrt(2.0 * np.pi) >>> gaussian = gaussian_gen(name='gaussian')
``scipy.stats`` distributions are *instances*, so here we subclass `rv_continuous` and create an instance. With this, we now have a fully functional distribution with all relevant methods automagically generated by the framework.
Note that above we defined a standard normal distribution, with zero mean and unit variance. Shifting and scaling of the distribution can be done by using ``loc`` and ``scale`` parameters: ``gaussian.pdf(x, loc, scale)`` essentially computes ``y = (x - loc) / scale`` and ``gaussian._pdf(y) / scale``.