Drew Resnick edited this page Jul 20, 2022 · 3 revisions
calc_classifiability(P, Q)
    Implement the Michelangeli et al. (1995) classifiability index.
    
    The variable names are not Pythonic; they follow the notation of the 1995
    paper, which makes the computation easier to check against it. You should
    not need to call this function directly; it is called in cluster_xr_eof.
    
    Parameters
    ----------
        P : array
            A cluster centroid
        Q : array
            Another cluster centroid
    Returns
    -------
        ci : float
            Classifiability index value.
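    As an illustration only, the sketch below computes one common form of the
    similarity from the 1995 paper. It assumes that P and Q each hold the
    centroids of one partition as rows and that similarity is measured by the
    correlation between centroids; the name `classifiability_sketch` is
    hypothetical and this is not necessarily how calc_classifiability is written.

```python
import numpy as np


def classifiability_sketch(P, Q):
    """Similarity of two partitions: min over i of max over j of corr(P[i], Q[j])."""
    # Assumption: rows are centroids, columns are dimensions.
    P = np.atleast_2d(np.asarray(P, dtype=float))
    Q = np.atleast_2d(np.asarray(Q, dtype=float))
    k = P.shape[0]
    # np.corrcoef stacks the rows of P and Q; the upper-right block holds
    # the correlation A[i, j] between centroid i of P and centroid j of Q.
    A = np.corrcoef(P, Q)[:k, k:]
    # Best match for each centroid of P, then the worst of those matches.
    return float(A.max(axis=1).min())
```

    Two partitions that contain the same centroids in a different order give
    a value close to 1 under this definition.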

cos(x, /)
    Return the cosine of x (measured in radians).

download_data(url, authkey, outfile, force_download=False)
    A smart function to download data from IRI Data Library.
    
    If the data can be read from outfile and force_download is False, reads from file.
    Otherwise downloads from IRIDL and then reads from file.
    
    Parameters
    ----------
        url : str
            The url pointing to the data.nc file.
        authkey : str
            The authentication key for IRI DL (see above).
        outfile : str
            The data filename.
        force_download : Bool, optional
            False if it's OK to read from an existing file,
            True if data *must* be re-downloaded.
    Returns
    -------
        data : dataFrame
            Dataframe of dataset specified in url or file.
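    The read-or-download logic described above can be sketched as follows.
    This is a minimal illustration, not the function itself: the name
    `cached_download` and the `fetch` callable (which stands in for the
    authenticated request to IRIDL) are hypothetical, and the sketch returns
    raw bytes rather than a dataFrame.

```python
from pathlib import Path


def cached_download(url, outfile, fetch, force_download=False):
    """Return the contents of outfile, downloading first only when needed."""
    path = Path(outfile)
    # Download when forced, or when there is no local copy to read.
    if force_download or not path.exists():
        path.write_bytes(fetch(url))  # fetch is a hypothetical downloader
    # In both cases the data is ultimately read back from the file.
    return path.read_bytes()
```

    Passing the downloader in as an argument keeps the caching behaviour
    testable without network access.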

get_classifiability_index(centroids: numpy.ndarray) -> Tuple[float, int]
    Get the classifiability of a set of centroids.
    
    This function will compute the classifiability index for a set of centroids and
    indicate which is the best one.
    
    Parameters
    ----------
    centroids: array
        Input array of centroids, indexed [simulation, dimension]
    Returns
    -------
    classifiability : float
        Classifiability index value.
    best_part : int
        Index of the best centroid.

get_number_eof(X: numpy.ndarray, var_to_explain: float, plot=False) -> int
    Get the number of EOFs of X that explain a given variance proportion.
    
    Parameters
    ----------
    X : ndarray
    var_to_explain : float
        Proportion (0 to 1) of variance to be explained.
    plot : Bool, optional
        Default plot=False will not generate a plot.
    Returns
    -------
    n_eof : int
        Number of EOFs retained for the chosen proportion of variance.
    Notes
    -----
    Plot generated by the function is 'number of EOFs' versus 
    'Cumulative proportion of variance explained'.
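    One way to obtain such a count is sketched below. It assumes X is indexed
    [time, dimension], that EOFs come from an SVD of the column-centered
    data, and that the smallest count reaching the threshold is wanted; the
    name `n_eof_sketch` is hypothetical and the actual function may differ.

```python
import numpy as np


def n_eof_sketch(X, var_to_explain):
    """Smallest number of EOFs whose cumulative variance fraction reaches the target."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                     # remove the time mean of each column
    s = np.linalg.svd(Xc, compute_uv=False)     # singular values, largest first
    var_frac = s**2 / np.sum(s**2)              # variance fraction per EOF
    cum = np.cumsum(var_frac)                   # cumulative proportion explained
    n = int(np.searchsorted(cum, var_to_explain)) + 1
    return min(n, len(cum))                     # guard against round-off at 1.0
```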

loop_kmeans(X: numpy.ndarray, n_cluster: int, n_sim: int) -> Tuple[numpy.ndarray, numpy.ndarray]
    Compute weather types using k means clustering.
    
    Should have more information on what this does.
    
    Parameters
    ----------
    X: array 
        PCA reanalysis data.
    n_cluster: int 
        How many clusters to compute.
    n_sim: int 
        How many times to initialize the clusters
        (note: computation scales as order n_sim**2).
    Returns
    -------
    centroids : array
        Centroids.
    w_types : array
        Weather types.
    Notes
    -----
    X should be in reduced dimension space already; indexed [time, dimension].

plot_procrustesAnalysis(Procrustes, savefig=False)
    Plot results of procrustes analysis.
    
    Contour plots which show the different elements of the 
    procrustes analysis used to correct the weather types.
    
    Parameters
    ----------
    Procrustes : dataFrame
        Output of procrustesAnalysis(). 
    savefig : Bool, optional
        Determines if plot will be saved.
    Returns
    -------
    plt.show() : matplotlib plot   
        Contour plots showing model weather types without correction, 
        and the scaled, rotated, and translated data.
    Notes
    -----
    If savefig=True, figure will be saved to current directory as 
    'ProcrustesAnalysis_ + {model} + .pdf' 
    
    `Procrustes` includes the original model weather type data, as well as the
    scale, rotation, and translation data used to correct the model
    data in the procrustes analysis.

plot_procrustesCorrection(WTf, savefig=False)
    Plot the WTf output of the procrustes analysis.
    
    Plot the corrected weather type model outputs against the 
    original WT model and reanalysis datasets.
    
    Parameters
    ----------
    WTf : dataFrame
        Output of procrustesAnalysis() which includes 
        reanalysis and model WTs, and the corrected WTs.
    savefig : Bool, optional
        Determines if plot will be saved.
    Returns
    -------
    plt.show() : matplotlib plot   
        Contour plots showing comparison between model, reanalysis,
        and corrected weather types.
    Notes
    -----
    If savefig=True, figure will be saved to current directory as 
    'ProcrustesCorrection_ + {model} + .pdf'

plot_reaVSmod(WTmod, WTrea, model, reanalysis='MERRA', savefig=False)
    Plot reanalysis and model datasets.
    
    Plots smoothed reanalysis data and 
    smoothed, interpolated model datasets as contour maps.
    
    Parameters
    ----------
    WTmod : dataFrame
        Model dataset.
    WTrea : dataFrame
        Reanalysis dataset.
    model : str
        Name of model dataset.
    reanalysis : str, optional
        Name of reanalysis dataset. 
        Default reanalysis='MERRA' (data used in example calculations).
    savefig : Bool, optional
        Determines if plot will be saved.
    Returns
    -------
    plt.show() : matplotlib plot
        Contour plot showing reanalysis and model WTs.
    Notes
    -----
    If savefig=True, figure will be saved to current directory as 'plot_reaVSmod.png'

prepareDS_procrustes(WTmod, WTrea)
    Prepare model and reanalysis datasets for analysis / plotting.
    
    Takes raw model and reanalysis datasets and returns smoothed versions,
    with the reanalysis also interpolated.
    
    Parameters
    ----------
    WTmod : dataFrame
        Model dataset
    WTrea : dataFrame
        Reanalysis dataset
    Returns
    -------
    WTmod : dataFrame
        Smoothed version of the original model dataset
    WTrea : dataFrame
        Smoothed, interpolated version of the reanalysis dataset 
    Notes
    -----
    `WTrea` domain needs to be slightly larger than `WTmod` domain 
    in order to avoid NaNs after interpolation.

procrustes2(data1, data2)
    Perform procrustes analysis.
    
    Needs more info on what exactly this is doing.
    
    Parameters
    ----------
    data1 : dataFrame
        Reanalysis dataFrame.
    data2 : dataFrame
        Model dataFrame.
    Returns
    -------
    mtx1 : array
        Matrix to be mapped.
    mtx2 : array
        Target matrix.
    disparity : float
        Dissimilarity between the two datasets.
    R : ndarray
        The matrix solution of the orthogonal Procrustes problem.
        Returned from scipy's orthogonal_procrustes() function.
    s : float
        Scale; Sum of the singular values of mtx1.T @ mtx2

procrustes2d(X, Y, scaling=True, reflection='best')
    Procrustes analysis
    
    A port of MATLAB's `procrustes` function to Numpy. 
    -- Modified by Á.G. Muñoz (agmunoz@iri.columbia.edu)
    
    Procrustes analysis determines a linear transformation (translation,
    reflection, orthogonal rotation and scaling) of the points in Y to best
    conform them to the points in matrix X, using the sum of squared errors
    as the goodness of fit criterion.
    
        d, Z, [tform] = procrustes(X, Y)
    
    Parameters
    ----------
    X : array   
        The reference or target field.
    Y : array
        The field to be transformed.
    scaling : Bool, optional
        If False, the scaling component of the transformation is forced
        to 1.
    reflection : str or Bool, optional
        If 'best' (default), the transformation solution may or may not
        include a reflection component, depending on which fits the data
        best. Setting reflection to True or False forces a solution with
        reflection or no reflection respectively.
    Returns
    -------
    d : float      
        The residual sum of squared errors, normalized according to a
        measure of the scale of X, ((X - X.mean(axis=0))**2).sum()
    Z : array
        The matrix of transformed Y-values.
    tform : dict
        Specifying the rotation, translation and scaling that
        maps Y --> X.
    Notes
    -----
    X and Y must have equal numbers of points (rows),
    but Y may have fewer dimensions (columns) than X.
    
    tform holds three components:
    c: The translation component
    T: The orthogonal rotation and reflection component
    b: The scale component
    That is, Z = b * Y * T + c.
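    The relation between d, Z, and the three transform components can be
    illustrated with a short NumPy sketch. It covers only the default case
    (scaling on, reflection='best', X and Y with the same number of columns);
    the name `procrustes_sketch` and the dictionary keys are assumptions made
    for this example, not the keys used by procrustes2d.

```python
import numpy as np


def procrustes_sketch(X, Y):
    """Fit Z = b * Y @ T + c to X by least squares (default options only)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    muX, muY = X.mean(axis=0), Y.mean(axis=0)
    X0, Y0 = X - muX, Y - muY                    # center both point sets
    normX = np.sqrt((X0**2).sum())
    normY = np.sqrt((Y0**2).sum())
    X0, Y0 = X0 / normX, Y0 / normY              # scale to unit norm
    # The optimal orthogonal matrix comes from the SVD of X0.T @ Y0.
    U, s, Vt = np.linalg.svd(X0.T @ Y0, full_matrices=False)
    T = Vt.T @ U.T                               # rotation (and possibly reflection)
    trace = s.sum()
    b = trace * normX / normY                    # scale component
    c = muX - b * (muY @ T)                      # translation component
    d = 1.0 - trace**2                           # normalized residual sum of squares
    Z = b * (Y @ T) + c                          # transformed Y-values
    return d, Z, {"rotation": T, "scale": b, "translation": c}
```

    When Y is an exactly rotated, scaled, and shifted copy of X, Z reproduces
    X and d is zero up to round-off.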

procrustesAnalysis(WTmod, WTrea, model, reanalysis='MERRA', smooth='SingleDay', printDisparity=False)
    Run procrustes analysis.
    
    Takes model weather type data and reanalysis weather type data and performs 
    procrustes analysis to correct and improve reanalysis WT dataset.
    
    Parameters
    ----------
    WTmod : dataFrame
        Model WT dataset.
    WTrea : dataFrame
        Reanalysis WT dataset.
    model : str
        Name of the model data used.
    reanalysis : str
        Name of the reanalysis data used.
    smooth : str, optional
        Determines if data will be smoothed, 
        can be either 'SingleDay' (no smoothing) or '5DayAVG' (smoothing).
    printDisparity : Bool, optional
        If True, disparity values for each weather type will be printed. 
        Default printDisparity=False where no output is printed.
    Returns
    -------
    WTf : dataFrame
        Includes model, reanalysis, and adjusted reanalysis WT data.
    Procrustes : dataFrame
        Includes model data and procrustes analysis with
        components of scale, rotation, translation.

resort_labels(old_labels)
    Re-sort cluster labels.
    
    Re-orders labels so that the lowest number is the most common, 
    and the highest number is the least common.
    
    Parameters
    ----------
    old_labels: vector
        The previous labels of the clusters.
    Returns
    -------
    new_labels : vector
        The new cluster labels, ranked by frequency of occurrence.
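    The re-ordering can be sketched with the standard library. This is an
    illustration under assumptions: the name `resort_labels_sketch` is
    hypothetical, new labels are taken to start at 0, and ties keep the
    order in which labels first appear; the actual function may differ
    on all three points.

```python
from collections import Counter


def resort_labels_sketch(old_labels):
    """Relabel clusters so that 0 is the most frequent label, 1 the next, and so on."""
    counts = Counter(old_labels)
    # most_common() lists labels by descending count; its position is the new label.
    ranking = {label: rank for rank, (label, _) in enumerate(counts.most_common())}
    return [ranking[label] for label in old_labels]
```

    For example, the labels [2, 2, 2, 0, 0, 1] become [0, 0, 0, 1, 1, 2].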

shiftedColorMap(cmap, start=0, midpoint=0.5, stop=1.0, name='shiftedcmap')
    Offset the center of a colormap.
    
    Useful when the data have a negative minimum and a positive maximum and
    you want the middle of the colormap's dynamic range to be at zero.
    
    Parameters
    ----------
        cmap : matplotlib colormap
            The matplotlib colormap to be altered.
        start : float, optional
            Offset from lowest point in the colormap's range.
            Defaults to 0.0 (no lower offset). 
            Should be between 0.0 and `midpoint`.
        midpoint : float, optional 
            The new center of the colormap. 
            Defaults to 0.5 (no shift). 
            Should be between 0.0 and 1.0. 
        stop : float, optional 
            Offset from highest point in the colormap's range.
            Defaults to 1.0 (no upper offset). 
            Should be between `midpoint` and 1.0.
    Returns
    -------
    newcmap : matplotlib colormap
        New colormap that can be used for plotting.  
    Notes
    -----
    In general, midpoint should be 1 - vmax / (vmax + abs(vmin)).
    For example, if your data range from -15.0 to +5.0 and you want
    the center of the colormap at 0.0, `midpoint` should be set to
    1 - 5/(5 + 15), or 0.75.
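    The midpoint formula from the note can be wrapped in a small helper. The
    name `colormap_midpoint` is hypothetical and is not part of this module;
    it assumes vmin is negative and vmax is positive.

```python
def colormap_midpoint(vmin, vmax):
    """Fraction of the colormap range at which the value 0.0 falls, per the note above."""
    return 1 - vmax / (vmax + abs(vmin))


# Data ranging from -15.0 to +5.0 gives a midpoint of 0.75.
midpoint = colormap_midpoint(-15.0, 5.0)
```

    The result can then be passed as the `midpoint` argument of shiftedColorMap.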

sin(x, /)
    Return the sine of x (measured in radians).
