PyWR
Drew Resnick edited this page Jul 20, 2022
·
3 revisions
calc_classifiability(P, Q)
Implement the Michelangeli et al. (1995) Classifiability Index.
The variable naming here is not Pythonic, but it follows the notation of the 1995 paper,
which makes it easier to follow what is going on. You shouldn't need to call
this function directly; it is called in cluster_xr_eof.
Parameters
----------
P : array
A cluster centroid
Q : array
Another cluster centroid
Returns
-------
ci : float
Classifiability index value.
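As a rough illustration of the idea behind this index, the sketch below assumes that P and Q each hold one centroid per row, correlates every centroid in P with every centroid in Q, keeps the best match for each centroid of P, and returns the weakest of those best matches. The helper names `pearson` and `classifiability_sketch` are hypothetical and are not part of PyWR; the actual implementation may differ in detail.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def classifiability_sketch(P, Q):
    """Weakest 'best match' between the centroids of P and those of Q.

    P and Q are assumed to be lists of centroids (one centroid per row).
    """
    # For each centroid in P, find its highest correlation with any centroid in Q.
    best_matches = [max(pearson(p, q) for q in Q) for p in P]
    # The index is limited by the centroid of P that is matched worst.
    return min(best_matches)


P = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
Q = [[3.0, 2.0, 1.0], [1.0, 2.0, 3.0]]  # same centroids, different order
ci = classifiability_sketch(P, Q)
```

Because the index takes the best match per centroid, it does not depend on the order in which the clusters are labelled.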
cos(x, /)
Return the cosine of x (measured in radians).
download_data(url, authkey, outfile, force_download=False)
A smart function to download data from IRI Data Library.
If the data can be read from `outfile` and force_download is False, it is read from file.
Otherwise it is downloaded from IRIDL and then read from file.
Parameters
----------
url : str
The url pointing to the data.nc file.
authkey : str
The authentication key for IRI DL (see above).
outfile : str
The data filename.
force_download: Bool, optional
False if it's OK to read from existing file,
True if data *must* be re-downloaded.
Returns
-------
data : dataFrame
Dataframe of dataset specified in url or file.
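The read-or-download behaviour described above can be sketched as follows. This is only an illustration of the caching pattern: `load_or_fetch` and the `fetch` callable are hypothetical names, the sketch checks for file existence rather than attempting to parse the file, and it returns raw bytes instead of a dataset object.

```python
import os


def load_or_fetch(outfile, fetch, force_download=False):
    """Return the contents of `outfile`, downloading it first only if needed.

    `fetch` is a hypothetical callable that writes the data to `outfile`;
    in practice it would wrap the authenticated request to the data server.
    """
    # Download when forced, or when there is no cached copy on disk.
    if force_download or not os.path.exists(outfile):
        fetch(outfile)
    # In both cases the data are finally read back from the local file.
    with open(outfile, "rb") as handle:
        return handle.read()
```

Keeping the download step behind a callable makes the caching logic easy to exercise without network access.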
get_classifiability_index(centroids: numpy.ndarray) -> Tuple[float, int]
Get the classifiability of a set of centroids.
This function will compute the classifiability index for a set of centroids and
indicate which is the best one.
Parameters
----------
centroids: array
Input array of centroids, indexed [simulation, dimension]
Returns
-------
classifiability : float
Classifiability index value.
best_part : int
The best centroid.
get_number_eof(X: numpy.ndarray, var_to_explain: float, plot=False) -> int
Get the number of EOFs of X that explain a given variance proportion.
Parameters
----------
X : ndarray
var_to_explain : float
Proportion (0 to 1) of variance to be explained.
plot : Bool, optional
Default plot=False will not generate a plot.
Returns
-------
n_eof : int
Number of EOFs retained for the chosen proportion of variance.
Notes
-----
Plot generated by the function is 'number of EOFs' versus
'Cumulative proportion of variance explained'.
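The selection rule can be illustrated with a short sketch. It assumes the per-EOF explained-variance fractions are already available (for example from a PCA) and sorted in decreasing order; the function name `n_components_for_variance` is hypothetical and not part of PyWR.

```python
from itertools import accumulate


def n_components_for_variance(explained_ratio, var_to_explain):
    """Smallest number of leading components whose cumulative
    explained-variance fraction reaches `var_to_explain` (0 to 1)."""
    for n, cumulative in enumerate(accumulate(explained_ratio), start=1):
        if cumulative >= var_to_explain:
            return n
    # Target not reachable (e.g. rounding): keep every component.
    return len(explained_ratio)


ratios = [0.5, 0.3, 0.15, 0.05]  # illustrative values only
n_eof = n_components_for_variance(ratios, 0.9)
```

With these illustrative fractions, the first two components explain 80% of the variance, so a 90% target needs a third one.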
loop_kmeans(X: numpy.ndarray, n_cluster: int, n_sim: int) -> Tuple[numpy.ndarray, numpy.ndarray]
Compute weather types using k means clustering.
Should have more information on what this does.
Parameters
----------
X: array
PCA reanalysis data.
n_cluster: int
How many clusters to compute.
n_sim: int
How many times to initialize the clusters
(note: computational cost scales as O(n_sim**2)).
Returns
-------
centroids : array
Centroids.
w_types : array
Weather types.
Notes
-----
X should be in reduced dimension space already; indexed [time, dimension].
plot_procrustesAnalysis(Procrustes, savefig=False)
Plot results of procrustes analysis.
Contour plots which show the different elements of the
procrustes analysis used to correct the weather types.
Parameters
----------
Procrustes : dataFrame
Output of procrustesAnalysis().
savefig : Bool, optional
Determines if plot will be saved.
Returns
-------
plt.show() : matplotlib plot
Contour plots showing model weather types without correction,
and the scaled, rotated, and translated data.
Notes
-----
If savefig=True, figure will be saved to current directory as
'ProcrustesAnalysis_ + {model} + .pdf'
`Procrustes` includes the original model weather type data, as well as the
scale, rotation, and translation data used to correct the model
data in the procrustes analysis.
plot_procrustesCorrection(WTf, savefig=False)
Plot the WTf output of the procrustes analysis.
Plot the corrected weather type model outputs against the
original WT model and reanalysis datasets.
Parameters
----------
WTf : dataFrame
Output of procrustesAnalysis() which includes
reanalysis and model WTs, and the corrected WTs.
savefig : Bool, optional
Determines if plot will be saved.
Returns
-------
plt.show() : matplotlib plot
Contour plots showing comparison between model, reanalysis,
and corrected weather types.
Notes
-----
If savefig=True, figure will be saved to current directory as
'ProcrustesCorrection_ + {model} + .pdf'
plot_reaVSmod(WTmod, WTrea, model, reanalysis='MERRA', savefig=False)
Plot reanalysis and model datasets
Plots smoothed reanalysis data and
smoothed, interpolated model datasets as contour maps.
Parameters
----------
WTmod : dataFrame
Model dataset.
WTrea : dataFrame
Reanalysis dataset.
model : str
Name of model dataset.
reanalysis : str, optional
Name of reanalysis dataset.
Default reanalysis='MERRA' (data used in example calculations).
savefig : Bool, optional
Determines if plot will be saved.
Returns
-------
plt.show() : matplotlib plot
Contour plot showing reanalysis and model WTs.
Notes
-----
If savefig=True, figure will be saved to current directory as 'plot_reaVSmod.png'
prepareDS_procrustes(WTmod, WTrea)
Prepare model and reanalysis datasets for analysis / plotting.
Takes raw model and reanalysis datasets.
Parameters
----------
WTmod : dataFrame
Model dataset
WTrea : dataFrame
Reanalysis dataset
Returns
-------
WTmod : dataFrame
Smoothed version of the original model dataset
WTrea : dataFrame
Smoothed, interpolated version of the reanalysis dataset
Notes
-----
`WTrea` domain needs to be slightly larger than `WTmod` domain
in order to avoid NaNs after interpolation.
procrustes2(data1, data2)
Perform procrustes analysis.
Needs more info on what exactly this is doing.
Parameters
----------
data1 : dataFrame
Reanalysis dataFrame.
data2 : dataFrame
Model dataFrame.
Returns
-------
mtx1 : array
Matrix to be mapped.
mtx2 : array
Target matrix.
disparity : float
Dissimilarity between the two datasets.
R : ndarray
The matrix solution of the orthogonal Procrustes problem.
Returned from scipy's orthogonal_procrustes() function.
s : float
Scale; Sum of the singular values of mtx1.T @ mtx2
procrustes2d(X, Y, scaling=True, reflection='best')
Procrustes analysis
A port of MATLAB's `procrustes` function to Numpy.
-- Modified by Á.G. Muñoz (agmunoz@iri.columbia.edu)
Procrustes analysis determines a linear transformation (translation,
reflection, orthogonal rotation and scaling) of the points in Y to best
conform them to the points in matrix X, using the sum of squared errors
as the goodness of fit criterion.
d, Z, [tform] = procrustes(X, Y)
Parameters
----------
X : array
The reference or target field.
Y : array
The field to be transformed.
scaling : Bool, optional
If False, the scaling component of the transformation is forced
to 1.
reflection : str or Bool, optional
If 'best' (default), the transformation solution may or may not
include a reflection component, depending on which fits the data
best. Setting reflection to True or False forces a solution with
reflection or no reflection, respectively.
Returns
-------
d : float
The residual sum of squared errors, normalized according to a
measure of the scale of X, ((X - X.mean(axis=0))**2).sum()
Z : array
The matrix of transformed Y-values.
tform : dict
Specifies the rotation, translation and scaling that
map Y --> Z.
Notes
-----
X and Y must have equal numbers of points (rows),
but Y may have fewer dimensions (columns) than X.
c: The translation component
T: The orthogonal rotation and reflection component
b: The scale component
That is, Z = TRANSFORM.b * Y * TRANSFORM.T + TRANSFORM.c.
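To make the relation Z = b * Y * T + c concrete, the sketch below applies a known scale, rotation and translation to a few 2-D points and evaluates the normalized residual described for `d`. It uses plain Python lists so that it runs without NumPy; the helper names `apply_transform` and `normalized_sse` are hypothetical, and the sketch does not estimate the transformation itself.

```python
def apply_transform(Y, b, T, c):
    """Return Z = b * (Y @ T) + c for points stored one per row."""
    Z = []
    for row in Y:
        rotated = [sum(row[k] * T[k][j] for k in range(len(row)))
                   for j in range(len(c))]
        Z.append([b * value + shift for value, shift in zip(rotated, c)])
    return Z


def normalized_sse(X, Z):
    """Sum of squared errors between X and Z, divided by the
    scale of X, ((X - X.mean(axis=0))**2).sum()."""
    n, dims = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(dims)]
    sse = sum((x - z) ** 2 for rx, rz in zip(X, Z) for x, z in zip(rx, rz))
    scale = sum((row[j] - means[j]) ** 2 for row in X for j in range(dims))
    return sse / scale


Y = [[1, 0], [0, 1], [-1, 0]]
T = [[0, 1], [-1, 0]]  # 90-degree rotation for row vectors
Z = apply_transform(Y, b=2, T=T, c=[1, 1])
d = normalized_sse(Z, Z)  # a perfect fit leaves no residual
```

A value of `d` close to 0 indicates that the transformed points reproduce the target almost exactly, while values near 1 indicate a poor fit.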
procrustesAnalysis(WTmod, WTrea, model, reanalysis='MERRA', smooth='SingleDay', printDisparity=False)
Run procrustes analysis
Takes model weather type data and reanalysis weather type data and performs
procrustes analysis to correct and improve reanalysis WT dataset.
Parameters
----------
WTmod : dataFrame
Model WT dataset.
WTrea : dataFrame
Reanalysis WT dataset.
model : str
Name of the model data used.
reanalysis : str, optional
Name of the reanalysis data used. Default reanalysis='MERRA'.
smooth : str, optional
Determines if data will be smoothed,
can be either 'SingleDay' (no smoothing) or '5DayAVG' (smoothing).
printDisparity : Bool, optional
If True, disparity values for each weather type will be printed.
Default printDisparity=False where no output is printed.
Returns
-------
WTf : dataFrame
Includes model, reanalysis, and adjusted reanalysis WT data.
Procrustes : dataFrame
Includes model data and procrustes analysis with
components of scale, rotation, translation.
resort_labels(old_labels)
Re-sort cluster labels.
Re-orders labels so that the lowest number is the most common,
and the highest number is the least common.
Parameters
----------
old_labels: vector
The previous labels of the clusters.
Returns
-------
new_labels : vector
The new cluster labels, ranked by frequency of occurrence.
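The relabelling rule can be sketched with the standard library. This illustration assumes the new labels start at 1 and that ties in frequency are broken by the old label value; both choices, and the name `resort_labels_sketch`, are assumptions rather than documented PyWR behaviour.

```python
from collections import Counter


def resort_labels_sketch(old_labels):
    """Relabel clusters so that label 1 is the most frequent,
    label 2 the next most frequent, and so on."""
    counts = Counter(old_labels)
    # Sort by decreasing frequency; break ties by the old label value.
    ranked = sorted(counts, key=lambda label: (-counts[label], label))
    mapping = {old: new for new, old in enumerate(ranked, start=1)}
    return [mapping[label] for label in old_labels]


new_labels = resort_labels_sketch([2, 2, 2, 0, 0, 1])
```

Ranking labels by frequency makes weather types comparable across runs, since k-means assigns cluster numbers arbitrarily.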
shiftedColorMap(cmap, start=0, midpoint=0.5, stop=1.0, name='shiftedcmap')
Offset the center of a colormap.
Useful for data with a negative min and a positive max when you want the
middle of the colormap's dynamic range to be at zero.
Parameters
----------
cmap : matplotlib colormap
The matplotlib colormap to be altered.
start : float, optional
Offset from lowest point in the colormap's range.
Defaults to 0.0 (no lower offset).
Should be between 0.0 and `midpoint`.
midpoint : float, optional
The new center of the colormap.
Defaults to 0.5 (no shift).
Should be between 0.0 and 1.0.
stop : float, optional
Offset from highest point in the colormap's range.
Defaults to 1.0 (no upper offset).
Should be between `midpoint` and 1.0.
Returns
-------
newcmap : matplotlib colormap
New colormap that can be used for plotting.
Notes
-----
In general, `midpoint` should be 1 - vmax / (vmax + abs(vmin)).
For example, if your data range from -15.0 to +5.0 and you want
the center of the colormap at 0.0, `midpoint` should be set to
1 - 5/(5 + 15), or 0.75.
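The midpoint rule from the notes can be written as a small helper. The function name `colormap_midpoint` is hypothetical; it only evaluates the formula and does not build a colormap.

```python
def colormap_midpoint(vmin, vmax):
    """Midpoint that places zero at the center of the colormap,
    following 1 - vmax / (vmax + abs(vmin))."""
    return 1 - vmax / (vmax + abs(vmin))


midpoint = colormap_midpoint(-15.0, 5.0)
```

The result can then be passed as the `midpoint` argument of shiftedColorMap.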
sin(x, /)
Return the sine of x (measured in radians).