Update language detection docs, political-content model docs, and sensitive-word code and docs #426

Merged
darkrush merged 2 commits into
ccprocessor:dev from
darkrush:dev
Jun 19, 2025

@darkrush

Collaborator

Thanks for your contribution; we appreciate it a lot. The following instructions will help keep your pull request healthy and make it easier to get feedback. If you do not understand some items, don't worry: just open the pull request and ask the maintainers for help.

Motivation

Please describe the motivation of this PR and the goal you want to achieve through this PR.

Modification

Please briefly describe what modification is made in this PR.

BC-breaking (Optional)

Does the modification introduce changes that break the backward compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here and update the documentation.

Checklist

Before PR:

  • Pre-commit or other linting tools are used to fix the potential lint issues.
  • Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests.
  • The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • The documentation has been modified accordingly, like docstring or example tutorials.

After PR:

  • If the modification has potential influence on downstream or other related projects, this PR should be tested with those projects.
  • CLA has been signed and all committers have signed the CLA in this PR.

2471023025 and others added 2 commits May 22, 2025 19:38
* revise

* revise cache_dir

* Add files via upload

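* Update CPU political-content model to 25m3

* Update language detection docs, political-content model docs, and sensitive-word code and docs

* Update language detection docs, political-content model docs, and sensitive-word code and docs

* Update language detection docs, political-content model docs, and sensitive-word code and docs

* Update language detection docs, political-content model docs, and sensitive-word code and docs

* Update language detection docs, political-content model docs, and sensitive-word code and docs

* Update language detection docs, political-content model docs, and sensitive-word code and docs

* Update test functions

* Update test functions

* Update language classification performance docs

* Update language classification performance docs

* Update language classification performance docs

* Update test functions

* Update test functions

* Update language detection docs; remove unsafe_words_detector.md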

---------

Co-authored-by: huyc <huyucheng1@pjlab.org.cn>
* backup

* update test_unsafe_words_detector

* update test_unsafe_words_detector

* update test_unsafe_words_detector

* add test_domain_safety_detector

* add test_domain_safety_detector

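* fix: fix the file locking mechanism so the lock file is reliably deleted when exceptions occur

* feat: add a model-based safety module supporting content safety detection and handling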

* test rule-based-safety-module

* test rule-based-safety-module

* test rule-based-safety-module

* test rule-based-safety-module

* test rule-based-safety-module

* test rule-based-safety-module

* model_based_safety_module

* model_based_safety_module

* part merge https://github.com/yogacc33/llm-webkit-mirror/blob/feature/model_api/

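* feat: add ModelRuntimeException handling and optimize model resource management

* refactor: optimize model loading and resource configuration; adjust class attributes for readability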

* add top readme of models

* backup tests

* backup tests

* lint code

* lint readme

* lint all code

* add error code

* bug fix

* roll back end of html

* roll back end of html

* roll back end of jsonl

* clean unused code

* clean unused code

* backup code and ready to test

* bug fix

* add tests

* bug fix

* update readme of rule_based_safety module

* Rule-based safety model

* bug fix of porn zh model define

* Dev yujing : fix bug of xlmr-cls (#11)

* backup

* update test_unsafe_words_detector

* update test_unsafe_words_detector

* update test_unsafe_words_detector

* add test_domain_safety_detector

* add test_domain_safety_detector

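* fix: fix the file locking mechanism so the lock file is reliably deleted when exceptions occur

* feat: add a model-based safety module supporting content safety detection and handling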

* test rule-based-safety-module

* test rule-based-safety-module

* test rule-based-safety-module

* test rule-based-safety-module

* test rule-based-safety-module

* test rule-based-safety-module

* model_based_safety_module

* model_based_safety_module

* part merge https://github.com/yogacc33/llm-webkit-mirror/blob/feature/model_api/

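* feat: add ModelRuntimeException handling and optimize model resource management

* refactor: optimize model loading and resource configuration; adjust class attributes for readability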

* add top readme of models

* backup tests

* backup tests

* lint code

* lint readme

* lint all code

* add error code

* bug fix

* roll back end of html

* roll back end of html

* roll back end of jsonl

* clean unused code

* clean unused code

* backup code and ready to test

* bug fix

* add tests

* bug fix

* update readme of rule_based_safety module

* bug fix of porn zh model define

---------

Co-authored-by: yujing <yujing@pjlab.org.cn>
Co-authored-by: qiujiantao <qiujiantao@hotmail.com>

* add tests for zh_porn_detector

* add gte political detector (#13)

Co-authored-by: ningwenchang <ningwenchang@pjlab.org.cn>

* add tests for xlmr porn model (#15)

* backup

* update test_unsafe_words_detector

* update test_unsafe_words_detector

* update test_unsafe_words_detector

* add test_domain_safety_detector

* add test_domain_safety_detector

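* fix: fix the file locking mechanism so the lock file is reliably deleted when exceptions occur

* feat: add a model-based safety module supporting content safety detection and handling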

* test rule-based-safety-module

* test rule-based-safety-module

* test rule-based-safety-module

* test rule-based-safety-module

* test rule-based-safety-module

* test rule-based-safety-module

* model_based_safety_module

* model_based_safety_module

* part merge https://github.com/yogacc33/llm-webkit-mirror/blob/feature/model_api/

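* feat: add ModelRuntimeException handling and optimize model resource management

* refactor: optimize model loading and resource configuration; adjust class attributes for readability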

* add top readme of models

* backup tests

* backup tests

* lint code

* lint readme

* lint all code

* add error code

* bug fix

* roll back end of html

* roll back end of html

* roll back end of jsonl

* clean unused code

* clean unused code

* backup code and ready to test

* bug fix

* add tests

* bug fix

* update readme of rule_based_safety module

* bug fix of porn zh model define

* add tests for zh_porn_detector

---------

Co-authored-by: yujing <yujing@pjlab.org.cn>
Co-authored-by: qiujiantao <qiujiantao@hotmail.com>

* fix bug of default max_tokens in code

---------

Co-authored-by: yujing <yujing@pjlab.org.cn>
Co-authored-by: qiujiantao <qiujiantao@hotmail.com>
Co-authored-by: idea_overflow <793884420@qq.com>
Co-authored-by: ningwenchang <ningwenchang@pjlab.org.cn>
@codecov

codecov Bot commented May 22, 2025


Codecov Report

Attention: Patch coverage is 90.00000% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| llm_web_kit/model/unsafe_words_detector.py | 50.00% | 1 Missing ⚠️ |

@@            Coverage Diff             @@
##              dev     #426      +/-   ##
==========================================
+ Coverage   89.96%   90.09%   +0.12%     
==========================================
  Files          82      102      +20     
  Lines        5581     8184    +2603     
==========================================
+ Hits         5021     7373    +2352     
- Misses        560      811     +251     

| Files with missing lines | Coverage Δ |
|---|---|
| llm_web_kit/model/model_impl.py | 92.97% <ø> (ø) |
| llm_web_kit/model/politics_detector.py | 87.07% <100.00%> (ø) |
| llm_web_kit/model/porn_detector.py | 81.60% <100.00%> (+7.48%) ⬆️ |
| llm_web_kit/model/unsafe_words_detector.py | 73.94% <50.00%> (-20.41%) ⬇️ |

... and 25 files with indirect coverage changes
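
The headline numbers in the coverage diff above can be reproduced from its Hits and Lines rows. The sketch below is only a sanity check, not Codecov's actual implementation: the helper name `coverage_percent` is hypothetical, and truncating (rather than rounding) to two decimals is an assumption inferred from the fact that 5021/5581 is about 89.966% while the report shows 89.96%.

```python
from fractions import Fraction


def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, truncated (assumed, not rounded) to two decimals."""
    if lines <= 0:
        raise ValueError("lines must be positive")
    # Integer arithmetic avoids floating-point surprises near the cut-off.
    return (hits * 10_000 // lines) / 100


# Figures copied from the "Coverage Diff" block of the report.
base_hits, base_lines = 5021, 5581   # dev before the merge
head_hits, head_lines = 7373, 8184   # after PR #426

base = coverage_percent(base_hits, base_lines)
head = coverage_percent(head_hits, head_lines)

# The delta appears to be taken from the exact ratios and truncated afterwards;
# subtracting the two already-truncated percentages would give 0.13 instead.
exact_delta = Fraction(head_hits, head_lines) - Fraction(base_hits, base_lines)
delta = int(exact_delta * 10_000) / 100

# Misses are simply the lines that were not hit.
base_misses = base_lines - base_hits
head_misses = head_lines - head_hits

print(f"base={base}% head={head}% delta=+{delta}%")
print(f"misses: {base_misses} -> {head_misses}")
```

Under these assumptions the script yields 89.96%, 90.09%, and +0.12%, with misses going from 560 to 811, which matches the report.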

@darkrush darkrush merged commit 9601592 into ccprocessor:dev Jun 19, 2025
4 checks passed

3 participants