Commit 9522954

alejandrorm and claude authored
Migrate to pyparsing 3.x and drop Python 2.7 support (#337)
* Migrate from pyparsing 2.x to pyparsing 3.x with PEP8-compliant API

MOTIVATION
----------
Pyparsing 3.0 was released in 2021 and introduced PEP8-compliant naming conventions for all methods and parameters (camelCase → snake_case). While backward-compatible aliases were initially provided, pyparsing 3.3.0+ now emits DeprecationWarnings for old names, and they will be removed in future versions.

This migration future-proofs pyhocon against breaking changes and aligns the codebase with modern Python conventions. Additionally, pyparsing 3.x requires Python 3.6.8+, allowing us to drop Python 2.7 compatibility code and simplify the codebase.

HIGH-LEVEL CHANGES
------------------
1. Updated pyparsing dependency from '>=2,<4' to '>=3.0.0'
2. Dropped Python 2.7, 3.4, 3.5, 3.6 support (now requires Python 3.7+)
3. Updated all pyparsing method calls to PEP8 snake_case names
4. Updated all pyparsing parameter names to PEP8 conventions
5. Removed Python 2.x compatibility shims (basestring, unicode, urllib2, etc.)
6. Added support for Python 3.10, 3.11, 3.12, 3.13

DETAILED CHANGES
----------------

### setup.py
- Changed: 'pyparsing~=2.0;python_version<"3.0"' and 'pyparsing>=2,<4;python_version>="3.0"'
- To: 'pyparsing>=3.0.0' (unified version constraint)
- Removed Python 2.7, 3.4, 3.5, 3.6 from classifiers
- Added Python 3.10, 3.11, 3.12 to classifiers

### tox.ini
- Changed: envlist = flake8, py{27,38,39,310,311,312}
- To: envlist = flake8, py{37,38,39,310,311,312,313}

### pyhocon/config_parser.py

**Import changes:**
- replaceWith → replace_with

**Method name changes (all occurrences):**
- .parseString() → .parse_string()
- .setParseAction() → .set_parse_action()
- .setDefaultWhitespaceChars() → .set_default_whitespace_chars()

**Parameter name changes (all occurrences):**
- caseless=True → case_insensitive=True
- escChar='\' → esc_char='\'
- unquoteResults=False → unquote_results=False
- parseAll=True → parse_all=True

**Python 2.x compatibility removal:**
- Removed try/except blocks for basestring/unicode definitions
- Removed urllib2 fallback imports (now using urllib.request directly)
- Removed Python < 3.5 glob fallback
- Removed Python < 3.4 imp module fallback (now using importlib.util)
- Changed unicode() calls to str()
- Changed isinstance(x, unicode) to isinstance(x, str)
- Updated all docstring type annotations: basestring → str

### pyhocon/period_parser.py

**Method name changes:**
- .parseString() → .parse_string()
- .setParseAction() → .set_parse_action()

**Parameter name changes:**
- parseAll=True → parse_all=True

### pyhocon/config_tree.py

**Python 2.x compatibility removal:**
- Removed try/except block for basestring/unicode
- Changed unicode() calls to str()
- Changed ConfigUnquotedString parent class: unicode → str
- Updated all docstring type annotations: basestring → str

### pyhocon/converter.py

**Python 2.x compatibility removal:**
- Removed try/except block for basestring/unicode
- Changed isinstance(config, basestring) to isinstance(config, str)
- Updated all docstring type annotations: basestring → str

TESTING
-------
All 309 existing tests pass successfully with pyparsing 3.3.1. No deprecation warnings are emitted when running the test suite. CLI tool (pyhocon) verified working with new pyparsing version.

BACKWARD COMPATIBILITY
----------------------
This is a BREAKING CHANGE for users:
- Python 2.7 and Python 3.4-3.6 are no longer supported
- Users must upgrade to Python 3.7+ and pyparsing 3.0+

The pyhocon API itself remains unchanged - only internal implementation and minimum version requirements have changed.

DOCUMENTATION
-------------
Added PYPARSING_MIGRATION_PLAN.md documenting the complete migration strategy, including all API mappings and testing procedures.
Added CLAUDE.md providing guidance for AI assistants working with this codebase.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* remove extra documentation

---------

Co-authored-by: Claude <noreply@anthropic.com>
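The method and parameter renames listed in the commit message can be seen together in a small, self-contained sketch. It is not taken from pyhocon: the `assignment` grammar and the sample inputs are made up for illustration, and `CaselessKeyword` is used in place of `Keyword` with a case flag so the example does not depend on that keyword argument's spelling. It assumes pyparsing 3.x is installed.

```python
from pyparsing import CaselessKeyword, QuotedString, Word, alphanums, replace_with

# snake_case replacements for setParseAction / replaceWith
true_expr = CaselessKeyword("true").set_parse_action(replace_with(True))
false_expr = CaselessKeyword("false").set_parse_action(replace_with(False))

# esc_char / unquote_results replace escChar / unquoteResults
key = QuotedString('"', esc_char='\\', unquote_results=False) | Word(alphanums + "._-")

# Hypothetical toy grammar: <key> = <boolean>
assignment = key + "=" + (true_expr | false_expr)

# parse_string / parse_all replace parseString / parseAll
parsed = assignment.parse_string("debug = TRUE", parse_all=True).as_list()
quoted = assignment.parse_string('"a.b" = false', parse_all=True).as_list()

print(parsed)
print(quoted)
```

Because `unquote_results=False` is passed, the quoted key keeps its surrounding double quotes in the result, which matches how the `key` expression in `config_parser.py` is configured.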
1 parent 50e9c6e commit 9522954

6 files changed

Lines changed: 69 additions & 120 deletions

pyhocon/config_parser.py

Lines changed: 40 additions & 76 deletions
@@ -12,7 +12,7 @@
                        ParserElement, ParseSyntaxException, QuotedString,
                        Regex, SkipTo, StringEnd, Suppress, TokenConverter,
                        Word, ZeroOrMore, alphanums, alphas8bit, col, lineno,
-                       replaceWith)
+                       replace_with)

 from pyhocon.period_parser import get_period_expr

@@ -35,53 +35,20 @@ def fixed_get_attr(self, item):
 from pyhocon.exceptions import (ConfigException, ConfigMissingException,
                                 ConfigSubstitutionException)

-use_urllib2 = False
-try:
-    # For Python 3.0 and later
-    from urllib.request import urlopen
-    from urllib.error import HTTPError, URLError
-except ImportError:  # pragma: no cover
-    # Fall back to Python 2's urllib2
-    from urllib2 import urlopen, HTTPError, URLError
-
-    use_urllib2 = True
-try:
-    basestring
-except NameError:  # pragma: no cover
-    basestring = str
-    unicode = str
-
-if sys.version_info < (3, 5):
-    def glob(pathname, recursive=False):
-        if recursive and '**' in pathname:
-            import warnings
-            warnings.warn('This version of python (%s) does not support recursive import' % sys.version)
-        from glob import glob as _glob
-        return _glob(pathname)
-else:
-    from glob import glob
-
-# Fix deprecated warning with 'imp' library and Python 3.4+.
-# See: https://github.com/chimpler/pyhocon/issues/248
-if sys.version_info >= (3, 4):
-    import importlib.util
-
-
-    def find_package_dirs(name):
-        spec = importlib.util.find_spec(name)
-        # When `imp.find_module()` cannot find a package it raises ImportError.
-        # Here we should simulate it to keep the compatibility with older
-        # versions.
-        if not spec:
-            raise ImportError('No module named {!r}'.format(name))
-        return spec.submodule_search_locations
-else:
-    import imp
-    import importlib
-
-
-    def find_package_dirs(name):
-        return [imp.find_module(name)[1]]
+from urllib.request import urlopen
+from urllib.error import HTTPError, URLError
+from glob import glob
+import importlib.util
+
+
+def find_package_dirs(name):
+    spec = importlib.util.find_spec(name)
+    # When `imp.find_module()` cannot find a package it raises ImportError.
+    # Here we should simulate it to keep the compatibility with older
+    # versions.
+    if not spec:
+        raise ImportError('No module named {!r}'.format(name))
+    return spec.submodule_search_locations

 logger = logging.getLogger(__name__)
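For readers unfamiliar with `importlib.util.find_spec`, the behaviour of the surviving `find_package_dirs` can be tried in isolation. This is a sketch that copies the function from the hunk above and calls it on the standard-library `json` package and on a deliberately made-up name (`no_such_package_xyz`, an assumption for illustration).

```python
import importlib.util


def find_package_dirs(name):
    spec = importlib.util.find_spec(name)
    # find_spec returns None for an unknown top-level name, so raise
    # ImportError explicitly, as the old imp.find_module() did.
    if not spec:
        raise ImportError('No module named {!r}'.format(name))
    return spec.submodule_search_locations


# A regular package exposes at least one directory to search for submodules.
json_dirs = list(find_package_dirs("json"))

# A missing package is reported as ImportError.
try:
    find_package_dirs("no_such_package_xyz")
    missing_raised = False
except ImportError:
    missing_raised = True

print(json_dirs, missing_raised)
```

Note that `submodule_search_locations` is `None` for plain modules that are not packages, so callers that iterate over the result should only pass package names.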

@@ -107,11 +74,8 @@ class STR_SUBSTITUTION(object):
     pass


-U_KEY_SEP = unicode('.')
-U_KEY_FMT = unicode('"{0}"')
-
-U_KEY_SEP = unicode('.')
-U_KEY_FMT = unicode('"{0}"')
+U_KEY_SEP = '.'
+U_KEY_FMT = '"{0}"'


 class ConfigFactory(object):
@@ -121,9 +85,9 @@ def parse_file(cls, filename, encoding='utf-8', required=True, resolve=True, unr
         """Parse file

         :param filename: filename
-        :type filename: basestring
+        :type filename: str
         :param encoding: file encoding
-        :type encoding: basestring
+        :type encoding: str
         :param required: If true, raises an exception if can't load file
         :type required: boolean
         :param resolve: if true, resolve substitutions
@@ -150,7 +114,7 @@ def parse_URL(cls, url, timeout=None, resolve=True, required=False, unresolved_v
         """Parse URL

         :param url: url to parse
-        :type url: basestring
+        :type url: str
         :param resolve: if true, resolve substitutions
         :type resolve: boolean
         :param unresolved_value: assigned value to unresolved substitution.
@@ -164,7 +128,7 @@ def parse_URL(cls, url, timeout=None, resolve=True, required=False, unresolved_v

         try:
             with contextlib.closing(urlopen(url, timeout=socket_timeout)) as fd:
-                content = fd.read() if use_urllib2 else fd.read().decode('utf-8')
+                content = fd.read().decode('utf-8')
                 return cls.parse_string(content, os.path.dirname(url), resolve, unresolved_value)
         except (HTTPError, URLError) as e:
             logger.warn('Cannot include url %s. Resource is inaccessible.', url)
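The unconditional `.decode('utf-8')` is needed because `urllib.request.urlopen` hands back bytes on Python 3. The sketch below illustrates this without network access by using a `data:` URL, which the standard opener handles locally; the URL and its payload are made up for the example and are not part of pyhocon.

```python
import contextlib
from urllib.request import urlopen

# Hypothetical payload: the percent-encoded UTF-8 text  name = "café"
url = "data:text/plain;charset=utf-8,name%20%3D%20%22caf%C3%A9%22"

with contextlib.closing(urlopen(url, timeout=5)) as fd:
    raw = fd.read()                # bytes on Python 3
    content = raw.decode("utf-8")  # str, ready to hand to a parser

print(type(raw).__name__, content)
```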
@@ -178,7 +142,7 @@ def parse_string(cls, content, basedir=None, resolve=True, unresolved_value=DEFA
         """Parse string

         :param content: content to parse
-        :type content: basestring
+        :type content: str
         :param resolve: if true, resolve substitutions
         :type resolve: boolean
         :param unresolved_value: assigned value to unresolved substitution.
@@ -235,7 +199,7 @@ def parse(cls, content, basedir=None, resolve=True, unresolved_value=DEFAULT_SUB
         """parse a HOCON content

         :param content: HOCON content to parse
-        :type content: basestring
+        :type content: str
         :param resolve: if true, resolve substitutions
         :type resolve: boolean
         :param unresolved_value: assigned value to unresolved substitution.
@@ -376,38 +340,38 @@ def _merge(a, b):
         @contextlib.contextmanager
         def set_default_white_spaces():
             default = ParserElement.DEFAULT_WHITE_CHARS
-            ParserElement.setDefaultWhitespaceChars(' \t')
+            ParserElement.set_default_whitespace_chars(' \t')
             yield
-            ParserElement.setDefaultWhitespaceChars(default)
+            ParserElement.set_default_whitespace_chars(default)

         with set_default_white_spaces():
             assign_expr = Forward()
-            true_expr = Keyword("true", caseless=True).setParseAction(replaceWith(True))
-            false_expr = Keyword("false", caseless=True).setParseAction(replaceWith(False))
-            null_expr = Keyword("null", caseless=True).setParseAction(replaceWith(NoneValue()))
-            key = QuotedString('"""', escChar='\\', unquoteResults=False) | \
-                QuotedString('"', escChar='\\', unquoteResults=False) | Word(alphanums + alphas8bit + '._- /')
+            true_expr = Keyword("true", case_insensitive=True).set_parse_action(replace_with(True))
+            false_expr = Keyword("false", case_insensitive=True).set_parse_action(replace_with(False))
+            null_expr = Keyword("null", case_insensitive=True).set_parse_action(replace_with(NoneValue()))
+            key = QuotedString('"""', esc_char='\\', unquote_results=False) | \
+                QuotedString('"', esc_char='\\', unquote_results=False) | Word(alphanums + alphas8bit + '._- /')

             eol = Word('\n\r').suppress()
             eol_comma = Word('\n\r,').suppress()
             comment = (Literal('#') | Literal('//')) - SkipTo(eol | StringEnd())
             comment_eol = Suppress(Optional(eol_comma) + comment)
             comment_no_comma_eol = (comment | eol).suppress()
             number_expr = Regex(r'[+-]?(\d*\.\d+|\d+(\.\d+)?)([eE][+\-]?\d+)?(?=$|[ \t]*([\$\}\],#\n\r]|//))',
-                                re.DOTALL).setParseAction(convert_number)
+                                re.DOTALL).set_parse_action(convert_number)
             # multi line string using """
             # Using fix described in http://pyparsing.wikispaces.com/share/view/3778969
-            multiline_string = Regex('""".*?"*"""', re.DOTALL | re.UNICODE).setParseAction(parse_multi_string)
+            multiline_string = Regex('""".*?"*"""', re.DOTALL | re.UNICODE).set_parse_action(parse_multi_string)
             # single quoted line string
-            quoted_string = Regex(r'"(?:[^"\\\n]|\\.)*"[ \t]*', re.UNICODE).setParseAction(create_quoted_string)
+            quoted_string = Regex(r'"(?:[^"\\\n]|\\.)*"[ \t]*', re.UNICODE).set_parse_action(create_quoted_string)
             # unquoted string that takes the rest of the line until an optional comment
             # we support .properties multiline support which is like this:
             # line1 \
             # line2 \
             # so a backslash precedes the \n
-            unquoted_string = Regex(r'(?:[^^`+?!@*&"\[\{\s\]\}#,=\$\\]|\\.)+[ \t]*', re.UNICODE).setParseAction(
+            unquoted_string = Regex(r'(?:[^^`+?!@*&"\[\{\s\]\}#,=\$\\]|\\.)+[ \t]*', re.UNICODE).set_parse_action(
                 unescape_string)
-            substitution_expr = Regex(r'[ \t]*\$\{[^\}]+\}[ \t]*').setParseAction(create_substitution)
+            substitution_expr = Regex(r'[ \t]*\$\{[^\}]+\}[ \t]*').set_parse_action(create_substitution)
             string_expr = multiline_string | quoted_string | unquoted_string

             value_expr = get_period_expr() | number_expr | true_expr | false_expr | null_expr | string_expr
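The `set_default_white_spaces` helper in this hunk restores the previous whitespace characters only when the `with` body finishes normally. A possible variant, shown purely as a sketch and not part of this commit, wraps the `yield` in `try`/`finally` so the process-wide default is restored even if grammar construction raises. The name `default_whitespace` is hypothetical, and pyparsing 3.x is assumed to be installed.

```python
import contextlib

from pyparsing import ParserElement, Word, alphas


@contextlib.contextmanager
def default_whitespace(chars):
    """Temporarily change pyparsing's global default whitespace characters."""
    previous = ParserElement.DEFAULT_WHITE_CHARS
    ParserElement.set_default_whitespace_chars(chars)
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions alike.
        ParserElement.set_default_whitespace_chars(previous)


before = ParserElement.DEFAULT_WHITE_CHARS
with default_whitespace(" \t"):
    inside = ParserElement.DEFAULT_WHITE_CHARS
    word = Word(alphas)  # elements created here pick up the narrowed default
after = ParserElement.DEFAULT_WHITE_CHARS

print(repr(before), repr(inside), repr(after))
```

Restricting the default to spaces and tabs is what lets a HOCON-style grammar treat newlines as significant separators instead of skippable whitespace.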
@@ -417,12 +381,12 @@ def set_default_white_spaces():
                     '(').suppress() - quoted_string - Literal(')').suppress())
             )
             include_expr = (
-                Keyword("include", caseless=True).suppress() + (
+                Keyword("include", case_insensitive=True).suppress() + (
                     include_content | (
                         Keyword("required") - Literal('(').suppress() - include_content - Literal(')').suppress()
                     )
                 )
-            ).setParseAction(include_config)
+            ).set_parse_action(include_config)

             root_dict_expr = Forward()
             dict_expr = Forward()
@@ -451,7 +415,7 @@ def set_default_white_spaces():
             config_expr = ZeroOrMore(comment_eol | eol) + (
                 list_expr | root_dict_expr | inside_root_dict_expr) + ZeroOrMore(
                 comment_eol | eol_comma)
-            config = config_expr.parseString(content, parseAll=True)[0]
+            config = config_expr.parse_string(content, parse_all=True)[0]

             if resolve:
                 allow_unresolved = resolve and unresolved_value is not DEFAULT_SUBSTITUTION \
@@ -542,7 +506,7 @@ def _find_substitutions(cls, item):
         """Convert HOCON input into a JSON output

         :return: JSON string representation
-        :type return: basestring
+        :type return: str
         """
         if isinstance(item, ConfigValues):
             return item.get_substitutions()
@@ -821,7 +785,7 @@ def postParse(self, instring, loc, token_list):
                     if isinstance(value, list) and operator == "+=":
                         value = ConfigValues([ConfigSubstitution(key, True, '', False, loc), value], False, loc)
                         config_tree.put(key, value, False)
-                    elif isinstance(value, unicode) and operator == "+=":
+                    elif isinstance(value, str) and operator == "+=":
                         value = ConfigValues([ConfigSubstitution(key, True, '', True, loc), ' ' + value], True, loc)
                         config_tree.put(key, value, False)
                     elif isinstance(value, list):

pyhocon/config_tree.py

Lines changed: 14 additions & 20 deletions
@@ -4,12 +4,6 @@
 import copy
 from pyhocon.exceptions import ConfigException, ConfigWrongTypeException, ConfigMissingException

-try:
-    basestring
-except NameError:  # pragma: no cover
-    basestring = str
-    unicode = str
-

 class UndefinedKey(object):
     pass
@@ -219,7 +213,7 @@ def put(self, key, value, append=False):
         """Put a value in the tree (dot separated)

         :param key: key to use (dot separated). E.g., a.b.c
-        :type key: basestring
+        :type key: str
         :param value: value to put
         """
         self._put(ConfigTree.parse_key(key), value, append)
@@ -228,7 +222,7 @@ def get(self, key, default=UndefinedKey):
         """Get a value from the tree

         :param key: key to use (dot separated). E.g., a.b.c
-        :type key: basestring
+        :type key: str
         :param default: default value if key not found
         :type default: object
         :return: value in the tree located at key
@@ -239,17 +233,17 @@ def get_string(self, key, default=UndefinedKey):
         """Return string representation of value found at key

         :param key: key to use (dot separated). E.g., a.b.c
-        :type key: basestring
+        :type key: str
         :param default: default value if key not found
-        :type default: basestring
+        :type default: str
         :return: string value
-        :type return: basestring
+        :type return: str
         """
         value = self.get(key, default)
         if value is None:
             return None

-        string_value = unicode(value)
+        string_value = str(value)
         if isinstance(value, bool):
             string_value = string_value.lower()
         return string_value
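The conversion logic of `get_string` is small enough to try on its own. The following sketch lifts the body of the method into a hypothetical free function, `to_config_string`, to show why booleans get special treatment: `str(True)` is `'True'` in Python, whereas HOCON spells booleans in lower case.

```python
def to_config_string(value):
    """Hypothetical stand-alone version of the get_string conversion."""
    if value is None:
        return None
    string_value = str(value)
    if isinstance(value, bool):
        # str(True) == 'True'; lower-case it to match HOCON's spelling.
        string_value = string_value.lower()
    return string_value


samples = [to_config_string(v) for v in (True, False, 1.5, "abc", None)]
print(samples)
```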
@@ -262,7 +256,7 @@ def pop(self, key, default=UndefinedKey):
         and pops the last value out of the dict.

         :param key: key to use (dot separated). E.g., a.b.c
-        :type key: basestring
+        :type key: str
         :param default: default value if key not found
         :type default: object
         :param default: default value if key not found
@@ -286,7 +280,7 @@ def get_int(self, key, default=UndefinedKey):
         """Return int representation of value found at key

         :param key: key to use (dot separated). E.g., a.b.c
-        :type key: basestring
+        :type key: str
         :param default: default value if key not found
         :type default: int
         :return: int value
@@ -303,7 +297,7 @@ def get_float(self, key, default=UndefinedKey):
         """Return float representation of value found at key

         :param key: key to use (dot separated). E.g., a.b.c
-        :type key: basestring
+        :type key: str
         :param default: default value if key not found
         :type default: float
         :return: float value
@@ -320,7 +314,7 @@ def get_bool(self, key, default=UndefinedKey):
         """Return boolean representation of value found at key

         :param key: key to use (dot separated). E.g., a.b.c
-        :type key: basestring
+        :type key: str
         :param default: default value if key not found
         :type default: bool
         :return: boolean value
@@ -347,7 +341,7 @@ def get_list(self, key, default=UndefinedKey):
         """Return list representation of value found at key

         :param key: key to use (dot separated). E.g., a.b.c
-        :type key: basestring
+        :type key: str
         :param default: default value if key not found
         :type default: list
         :return: list value
@@ -374,7 +368,7 @@ def get_config(self, key, default=UndefinedKey):
         """Return tree config representation of value found at key

         :param key: key to use (dot separated). E.g., a.b.c
-        :type key: basestring
+        :type key: str
         :param default: default value if key not found
         :type default: config
         :return: config value
@@ -522,7 +516,7 @@ def format_str(v, last=False):
             if isinstance(v, ConfigQuotedString):
                 return v.value + ('' if last else v.ws)
             else:
-                return '' if v is None else unicode(v)
+                return '' if v is None else str(v)

         if self.has_substitution():
             return self
@@ -618,7 +612,7 @@ def __repr__(self): # pragma: no cover
         return '[ConfigSubstitution: ' + self.variable + ']'


-class ConfigUnquotedString(unicode):
+class ConfigUnquotedString(str):
     def __new__(cls, value):
         return super(ConfigUnquotedString, cls).__new__(cls, value)
