Add NumPy-style docstrings across all DashAI components #521
Merged
cristian-tamblay merged 59 commits into develop on Apr 6, 2026
…e class docstrings
…ry class docstrings
…tyle Add detailed multi-paragraph class descriptions and schema docstrings across all 29 scikit-learn converter wrappers, following the DistilBERT pattern established for the codebase.
Add detailed class descriptions with References sections for SMOTE (Chawla et al. 2002), SMOTEENN (Batista et al. 2004), and RandomUnderSampler, plus expanded schema docstrings.
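As an illustration of the References layout this commit describes, here is a minimal sketch of a class docstring with a NumPy-style References section. The class name `ExampleSMOTEConverter` and the description wording are hypothetical and are not taken from the DashAI source; only the cited paper is real.

```python
class ExampleSMOTEConverter:
    """Oversample the minority class with synthetic examples.

    Hypothetical class, shown only to illustrate the docstring layout.
    SMOTE creates new minority-class samples by interpolating between a
    sample and one of its nearest minority-class neighbours [1]_.

    References
    ----------
    .. [1] Chawla, N. V., Bowyer, K. W., Hall, L. O., and Kegelmeyer, W. P.
       (2002). "SMOTE: Synthetic Minority Over-sampling Technique."
       Journal of Artificial Intelligence Research, 16, 321-357.
    """


# The References section is part of the class docstring itself.
doc = ExampleSMOTEConverter.__doc__
```

The `[1]_` marker in the description and the `.. [1]` entry in the References section are the numpydoc convention for linking a claim to its source.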
Add detailed multi-paragraph descriptions to all 14 explorer classes covering chart types, use cases, and parameter guidance; add schema docstrings describing what each explorer configures.
Add detailed descriptions to base explainer classes and all three explainer implementations (KernelSHAP, PartialDependence, PermutationFeatureImportance) with literature References sections.
Add detailed descriptions to all 8 task classes covering input/output types, compatible metrics, and multi-paragraph context for each ML task type (classification, regression, translation, image generation, etc.).
Add detailed class descriptions to CSVDataLoader, ExcelDataLoader, and JSONDataLoader covering parsing behaviour, multi-file handling, and split logic; expand schema docstrings with parameter configuration notes.
…tyle Add detailed multi-paragraph class descriptions with References sections for MistralModel (Jiang et al. 2023), MixtralModel (Jiang et al. 2024), QwenModel (Qwen Team 2024), and SmolLMModel (Allal et al. 2024); expand schema docstrings with quantization and variant configuration notes.
Add detailed multi-paragraph descriptions with References sections to all 13 scikit-learn model wrappers covering algorithm theory, strengths, and limitations (LogisticRegression, SVC, RandomForest, GradientBoosting, MLP, KNeighbors, Ridge, Linear/SVR, DecisionTree, DummyClassifier, etc.).
Add full NumPy-style docstrings to the two private llama_utils GPU-check helpers; expand weak docstrings on get_model_params_from_task, DistilBERT/OpusMT SavedModel.__init__, and save/load methods in BagOfWordsTextClassificationModel and SklearnLikeModel.
Add Parameters/Returns/Raises sections to fit, transform, get_output_type, and helper methods across base_converter, hugging_face_wrapper, imbalanced_learn_wrapper, bag_of_words, label_encoder, polynomial_features, tf_idf, the three imbalanced_learn converters, and character_replacer/nan_remover.
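The Parameters/Returns/Raises layout named in this commit can be sketched as follows. `ExampleScaler` is a hypothetical stand-in, not one of the DashAI converters; it only shows how the three sections sit on `fit` and `transform`.

```python
class ExampleScaler:
    """Hypothetical converter used only to illustrate the docstring layout."""

    def __init__(self) -> None:
        # Set by fit(); None means the converter is not fitted yet.
        self.max_abs_ = None

    def fit(self, values):
        """Learn the largest absolute value in ``values``.

        Parameters
        ----------
        values : list of float
            Training values. Must contain at least one element.

        Returns
        -------
        ExampleScaler
            The fitted instance, to allow method chaining.

        Raises
        ------
        ValueError
            If ``values`` is empty.
        """
        if not values:
            raise ValueError("values must not be empty")
        self.max_abs_ = max(abs(v) for v in values)
        return self

    def transform(self, values):
        """Divide each value by the maximum learned in ``fit``.

        Parameters
        ----------
        values : list of float
            Values to scale.

        Returns
        -------
        list of float
            Scaled values. If the fitted maximum is zero, every output
            is ``0.0``.

        Raises
        ------
        RuntimeError
            If ``fit`` has not been called.
        """
        if self.max_abs_ is None:
            raise RuntimeError("call fit before transform")
        if self.max_abs_ == 0:
            return [0.0 for _ in values]
        return [v / self.max_abs_ for v in values]
```

For example, `ExampleScaler().fit([2.0, -4.0]).transform([2.0, -4.0])` scales both values by 4.0.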
Add NumPy Parameters/Returns sections to DummyTextClassificationModel fit/predict, EmbeddingConverter get_output_type/_process_batch, and TokenizerConverter _process_batch.
Add multi-paragraph descriptions to BaseMetric, ClassificationMetric, RegressionMetric, and TranslationMetric covering MAXIMIZE semantics, compatible tasks, and helper function roles.
Add multi-paragraph descriptions with formulas, value ranges, use-case guidance, and References sections to Accuracy, CohenKappa, F1, HammingDistance, LogLoss, Precision, Recall, and ROCAUC.
Add multi-paragraph descriptions with formulas, value ranges, outlier sensitivity notes, and References sections to ExplainedVariance, MAE, MedianAbsoluteError, MSE, R2, and RMSE.
Expand class, schema, and __init__ docstrings for StableDiffusionV2Model, PixArtSigmaModel, and SDXLTurboModel following the established pattern.
Expand SD15DepthControlNetSchema/Model, SD15HEDControlNetSchema/Model, SD15OpenPoseControlNetSchema/Model, and SDXLCannyControlNetSchema/Model class docstrings. All __init__ methods were already documented.
Remove stale NumericalWrapperForText reference from schema docstring. Change kwargs : dict to **kwargs : dict in __init__ per NumPy standard.
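A minimal sketch of the `**kwargs : dict` entry after this change, assuming a hypothetical `ExampleModel` class (the real class touched by the commit is not named here):

```python
class ExampleModel:
    """Hypothetical model used only to illustrate the ``**kwargs`` entry."""

    def __init__(self, **kwargs):
        """Store the configuration passed by the caller.

        Parameters
        ----------
        **kwargs : dict
            Arbitrary keyword arguments, kept unchanged in ``self.params``.
        """
        # Copy so later changes to the caller's dict do not leak in.
        self.params = dict(kwargs)
```

Writing the entry with the leading `**` makes the docstring match the signature, which is how the numpydoc guide documents variadic keyword arguments.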
Expand one-liner class docstrings for LlamaSchema/Model, StableDiffusionV3 Schema/Model, StableDiffusionXL Schema/Model, TongyiZImage Schema/Model, StableDiffusionXLV1ControlNetSchema, SklearnLikeModel/Classifier/Regressor, and the nested MLP helper class. Audit now reports 0 issues.
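The commit does not say which audit tool produced the "0 issues" result. As a rough sketch of what such a check could look like, the hypothetical helper below uses the standard-library `ast` module to list public classes and functions that have no docstring; it is not the audit used in this PR.

```python
import ast


def find_missing_docstrings(source: str) -> list[str]:
    """Return sorted names of public definitions that lack a docstring.

    Hypothetical helper: private names (leading underscore) are skipped,
    except ``__init__``, which is checked like a public method.
    """
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(
            node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
        ):
            continue
        if node.name.startswith("_") and node.name != "__init__":
            continue
        if ast.get_docstring(node) is None:
            missing.append(node.name)
    return sorted(missing)
```

An empty list from this helper corresponds to a clean report for the given source string.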
…locks .. math:: directives emit curly braces that the MDX acorn parser treats as JSX expressions, breaking doc generation. Replace all .. math:: blocks across 13 metric files with indented :: code blocks containing Unicode plain-text formulas, following the pattern used in partial_dependence.py.
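The replacement described in this commit can be sketched as below: the formula sits in an indented `::` literal block written in plain Unicode, so the docstring contains no curly braces. The function and the exact formula wording are illustrative assumptions, not copied from the DashAI metric files or from partial_dependence.py.

```python
def mean_squared_error(y_true, y_pred):
    """Compute the mean squared error between two sequences.

    The formula is given as a literal block instead of a ``.. math::``
    directive::

        MSE = (1 / n) · Σᵢ (yᵢ − ŷᵢ)²

    Parameters
    ----------
    y_true : list of float
        Ground-truth values.
    y_pred : list of float
        Predicted values, same length as ``y_true``.

    Returns
    -------
    float
        Mean of the squared differences.

    Raises
    ------
    ValueError
        If the inputs are empty or differ in length.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

A LaTeX version such as `\frac{1}{n}` would reintroduce the braces that the MDX parser reads as JSX expressions, which is the failure the commit reports.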
…to use a consistent bullet point style.
Update type annotations in classification metric modules to explicitly use numpy arrays for predicted labels and other relevant parameters.
92365ad to 5011ea9
…odels adding item list dash
cristian-tamblay approved these changes Apr 6, 2026
Summary
Adds comprehensive docstrings across all major DashAI components, including converters, explorers, models, metrics, generative models, generative tasks, tasks, dataloaders, and explainers. The docstrings follow the NumPy docstring format and document each component's responsibilities, parameters, return values, and usage.
The docstrings also cite the original implementations (libraries) or the research papers that introduced each method or model, which gives better context and traceability.
Changes (by file)
converters/*: Added detailed NumPy-style docstrings and references to original implementations or papers.
explorers/*: Documented core classes and methods, including relevant references where applicable.
models/*: Added NumPy-style docstrings describing model interfaces, parameters, behavior, and source references.
metrics/*: Improved documentation for metric definitions, inputs, outputs, and references.
generative_models/*: Documented architecture, usage, and linked to original implementations/papers.
generative_tasks/*: Clarified task structure, inputs, outputs, and contextual references.
tasks/*: Added descriptions of task responsibilities and execution flow.
dataloaders/*: Documented data loading logic, expected formats, and outputs.
explainers/*: Added docstrings explaining interpretation methods along with relevant references.
Testing (optional)
No testing required; the changes are strictly documentation and do not affect runtime behavior.