chore(deps-dev): bump unstructured from 0.18.14 to 0.18.32 in /backend#2730

Open
dependabot[bot] wants to merge 2 commits into main from dependabot/pip/backend/unstructured-0.18.32

Conversation

dependabot Bot commented on behalf of github Mar 27, 2026

Bumps unstructured from 0.18.14 to 0.18.32.

Release notes

Sourced from unstructured's releases.

0.18.32

What's Changed

Full Changelog: Unstructured-IO/unstructured@0.18.31...0.18.32

0.18.31

What's Changed

New Contributors

Full Changelog: Unstructured-IO/unstructured@0.18.28...0.18.31

0.18.28

Enhancement

  • Optimize clean_extra_whitespace_with_index_run (codeflash)
  • Optimize recursive_xy_cut_swapped (codeflash)
  • Optimize _DocxPartitioner._parse_category_depth_by_style_name (codeflash)
  • Optimize VertexAIEmbeddingEncoder._add_embeddings_to_elements (codeflash)
  • Optimize ngrams (codeflash)
  • Optimize stage_for_datasaur (codeflash)

0.18.27

Fixes

  • Comment no-ops in zoom_image (codeflash)
  • Fix an issue where elements with partially filled extracted text are marked as extracted

... (truncated)

Changelog

Sourced from unstructured's changelog.

0.18.32

Enhancements

  • put pdfium calls behind a thread lock

0.18.31

Enhancements

  • Changed default DPI to 350
  • Add token-based chunking support: Added max_tokens, new_after_n_tokens, and tokenizer parameters to chunk_by_title() and chunk_elements() for chunking by token count instead of character count. Uses tiktoken for token counting. Install with pip install "unstructured[chunking-tokens]". (fixes #4127)
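
To make the idea of a token budget concrete, here is a minimal, self-contained sketch of chunking by token count with a hard limit and a soft limit. It is not the library's implementation: `chunk_by_tokens` and `count_whitespace_tokens` are hypothetical helpers, the whitespace tokenizer is a stand-in for tiktoken, and only the parameter names `max_tokens` and `new_after_n_tokens` are borrowed from the changelog entry.

```python
from typing import Callable, List, Optional


def count_whitespace_tokens(text: str) -> int:
    """Stand-in token counter (assumption); a real setup would use a tokenizer."""
    return len(text.split())


def chunk_by_tokens(
    texts: List[str],
    max_tokens: int,
    new_after_n_tokens: Optional[int] = None,
    count_tokens: Callable[[str], int] = count_whitespace_tokens,
) -> List[str]:
    """Group texts into chunks, measuring size in tokens instead of characters.

    max_tokens is treated as a hard limit and new_after_n_tokens as a soft
    limit: once a chunk reaches the soft limit, the next text starts a new
    chunk. A single text longer than max_tokens is kept whole in this sketch.
    """
    soft = max_tokens if new_after_n_tokens is None else min(new_after_n_tokens, max_tokens)
    chunks: List[str] = []
    current: List[str] = []
    current_tokens = 0
    for text in texts:
        n = count_tokens(text)
        would_overflow = current_tokens + n > max_tokens
        reached_soft_limit = current_tokens >= soft
        if current and (would_overflow or reached_soft_limit):
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(text)
        current_tokens += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Passing a different `count_tokens` callable swaps the tokenizer without touching the grouping logic, which is the main reason the counter is a parameter here.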

Fixes

0.18.30

Enhancements

  • Updated the Dockerfile to build from the chainguard base. The package updates and base-packages additions that were previously done in the base-images repo are now all done here instead.
  • is_text_embedded now considers rotated text as low fidelity, and elements with a non-trivial amount of it are considered not embedded
  • Replace pdf2image with PyPDFium2 for PDF rendering
  • Optimize _get_optimal_value_for_bbox (codeflash)
  • Optimize _DocxPartitioner._style_based_element_type (codeflash)

Fixes

  • Fix EN DASH not cleaned by clean_bullets: Added EN DASH (\u2013) to UNICODE_BULLETS pattern so clean_bullets properly removes EN DASH bullet points without requiring clean_dashes (fixes #4105)
  • Change languages parameter default from ["auto"] to None: Updated default value in detect_languages() and partition_epub() functions. Behavior unchanged as None is converted to ["auto"] internally. (fixes #2471)
  • Resolve GHSA-58pv-8j8x-9vj2
  • use render mode data to determine if a character extracted by pdfminer is invisible or not
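
The EN DASH fix listed above amounts to adding one more character to a bullet pattern. The sketch below shows the general idea with a short, hypothetical bullet list and a helper named `strip_leading_bullet`; the library's own UNICODE_BULLETS pattern is not reproduced here and is assumed to be longer.

```python
import re

# Abbreviated, hypothetical bullet set; EN DASH (\u2013) is the newly covered one.
BULLETS = ["\u2022", "\u2023", "\u25E6", "\u2043", "\u2013"]

# Match one bullet character at the start of the text, with optional whitespace
# on either side of it.
BULLET_RE = re.compile(
    r"^\s*(?:" + "|".join(re.escape(b) for b in BULLETS) + r")\s*"
)


def strip_leading_bullet(text: str) -> str:
    """Remove a single leading bullet; leave bullet-like characters elsewhere alone."""
    return BULLET_RE.sub("", text, count=1)
```

Because the pattern is anchored at the start of the string, an EN DASH used inside a range such as a span of years is left untouched.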

0.18.28

Enhancement

  • Optimize clean_extra_whitespace_with_index_run (codeflash)
  • Optimize recursive_xy_cut_swapped (codeflash)
  • Optimize _DocxPartitioner._parse_category_depth_by_style_name (codeflash)
  • Optimize VertexAIEmbeddingEncoder._add_embeddings_to_elements (codeflash)

... (truncated)

Commits
  • 4bbb1ff feat: put pdfium call behind a threadlock (#4211)
  • d1f1bdf chorse sep bump to resolve open CVEs (#4205)
  • d4caedf fix: Preserve Line Breaks in Code Blocks During Chunking (#4196)
  • 8f32550 fix(deps): Update semitechnologies/weaviate Docker tag to v1.35.3 (#4135)
  • dbe96e2 fix(deps): Update opensearchproject/opensearch Docker tag to v2.19.4 (#4134)
  • 7b366c5 fix(deps): Update docker.elastic.co/elasticsearch/elasticsearch Docker tag to...
  • f0b0e7c fix: filter coordinates kwargs to prevent TypeError in hi_res PDF processing ...
  • 01c3f7c Token-Based Chunking Support (#4203)
  • c0323a6 fix: remove sandbox=True from pypandoc to fix ODT conversion (#4193)
  • 95fea7e fix(deps): switch from pip-compile to uv pip compile (#4202)
  • Additional commits viewable in compare view

dependabot Bot commented on behalf of github Mar 27, 2026

Labels

The following labels could not be found: dependencies. Please create it before Dependabot can add it to a pull request.

Please fix the above issues or remove invalid values from dependabot.yml.

@dependabot dependabot Bot requested review from Phinease and WMC001 as code owners March 27, 2026 15:46
Release v2.1.1 with multiple bug fixes and feature enhancements
@dependabot dependabot Bot changed the base branch from main to feature/build-offline-package May 18, 2026 12:33
@dependabot dependabot Bot requested a review from Dallas98 as a code owner May 18, 2026 12:33
@dependabot dependabot Bot force-pushed the dependabot/pip/backend/unstructured-0.18.32 branch from 76421d6 to 6e7c363 Compare May 18, 2026 12:33
@dependabot dependabot Bot changed the title build(deps-dev): bump unstructured from 0.18.14 to 0.18.32 in /backend chore(deps-dev): bump unstructured from 0.18.14 to 0.18.32 in /backend May 18, 2026
Comment thread .github/workflows/build-offline-package.yml Fixed
Bumps [unstructured](https://github.com/Unstructured-IO/unstructured) from 0.18.14 to 0.18.32.
- [Release notes](https://github.com/Unstructured-IO/unstructured/releases)
- [Changelog](https://github.com/Unstructured-IO/unstructured/blob/main/CHANGELOG.md)
- [Commits](Unstructured-IO/unstructured@unstructured_0.18.14...0.18.32)

---
updated-dependencies:
- dependency-name: unstructured
  dependency-version: 0.18.32
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot Bot changed the base branch from feature/build-offline-package to main May 18, 2026 12:35
@dependabot dependabot Bot force-pushed the dependabot/pip/backend/unstructured-0.18.32 branch from 6e7c363 to a9b47c8 Compare May 18, 2026 12:35