Parsing NBT #40

@gilesknap

I'm opening this issue to discuss NBT parsing.

(I note that NBT is a Java Edition feature, and I'm not familiar with how similar information is handled in other editions.)

I have implemented a simple deserializer for Stringified Named Binary Tag (SNBT) data, which is the format returned by commands like data get:

import json
import re

# strip the preamble from string responses to commands (benign for raw SNBT)
preamble_re = re.compile(r"[^\[{]*(.*)")
# strip array type identifiers such as the "B;" in [B; 1b, 2b]
list_types_re = re.compile(r"[LBI];")
# match all unquoted items (an even number of quotes must follow the match)
unquoted_re = re.compile(r'([-.A-Za-z0-9]+)(?=([^"]*"[^"]*")*[^"]*$)')
# match numeric values (by this point they have been wrapped in quotes)
integers_re = re.compile(r'"(-?\d+)[bslL]?"')
no_decimal_floats_re = re.compile(r'"(-?[0-9]+)[fd]"')
floats_re = re.compile(r'"(-?\d+\.\d+)[fd]"')


def parse_nbt(snbt_text: str) -> object:
    """
    Naive deserialization of an SNBT string into an object graph of Python types.

    Note that this is one way only, since the following details are lost:
    - distinction between byte, short, int and long types (suffixes b, s, none, l)
    - distinction between float and double types (suffixes f, d)
    - distinction between SNBT and raw JSON (enclosed in single quotes)

    See https://minecraft.fandom.com/wiki/NBT_format
    """
    text = preamble_re.sub(r"\1", snbt_text)
    text = list_types_re.sub(r"", text)
    text = unquoted_re.sub(r'"\1"', text).replace("'", "")
    text = no_decimal_floats_re.sub(r"\1.0", text)
    text = floats_re.sub(r"\1", text)
    text = integers_re.sub(r"\1", text)
    text = text.replace('"true"', '"True"').replace('"false"', '"False"')

    return json.loads(text)
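
To illustrate how the quoting step behaves on its own, here is a small sketch that applies only the unquoted-item regex and then an integer regex to a made-up SNBT fragment. The sample string is an assumption for illustration rather than real command output, and the integer pattern here allows an optional minus sign.

```python
import json
import re

# Same pattern as above: the lookahead only succeeds when an even number
# of double quotes follows, i.e. when the token sits outside a quoted string.
unquoted_re = re.compile(r'([-.A-Za-z0-9]+)(?=([^"]*"[^"]*")*[^"]*$)')
# Integer pattern with an optional minus sign (an assumption of this sketch).
integers_re = re.compile(r'"(-?\d+)[bsl]?"')

sample = '{id: "minecraft:stone", Count: 3b}'  # made-up fragment

# Step 1: wrap every unquoted token in double quotes.
quoted = unquoted_re.sub(r'"\1"', sample)
print(quoted)  # {"id": "minecraft:stone", "Count": "3b"}

# Step 2: unwrap integers again and drop their type suffix.
as_json = integers_re.sub(r"\1", quoted)
print(json.loads(as_json))  # {'id': 'minecraft:stone', 'Count': 3}
```

The already-quoted value "minecraft:stone" is left alone because an odd number of quotes follows any token inside it, so the lookahead fails there.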

I'm not sure the above approach is worthy of the nicely typed mcipc library.

There is a lot more work to do to make a serializable NBT class in Python. A useful NBT class would need to:

  • represent all of the numeric types that are not native to Python
  • support arithmetic with Python floats/ints
  • represent pure JSON attributes (so they can be enclosed in single quotes on serialization)
  • support dot notation for accessing child nodes

This would mean you could do something like this:

# increase the number of items in slot 0 of the chest at 626, 73, -1654
# (sketch only; the nbt= keyword name is illustrative)
pos = Vec3(626, 73, -1654)
nbt = client.data.get(block=pos)
nbt.Items[0].Count += 10
client.data.merge(block=pos, nbt=nbt)
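
A minimal sketch of what such a class could look like, covering the four requirements above in their simplest form. All names here (`Byte`, `Compound`, `RawJson`, `to_snbt` and so on) are hypothetical and not part of mcipc; range checks, typed arrays and key quoting are left out.

```python
class _Typed:
    """Mixin: arithmetic keeps the NBT type of the left operand."""

    suffix = ""

    def _wrap(self, result):
        # int.__add__(3, 1.5) returns NotImplemented; pass that through so
        # Python can fall back to the float's reflected method.
        return result if result is NotImplemented else type(self)(result)

    def __add__(self, other):
        return self._wrap(super().__add__(other))

    def __sub__(self, other):
        return self._wrap(super().__sub__(other))

    def __mul__(self, other):
        return self._wrap(super().__mul__(other))


class Byte(_Typed, int): suffix = "b"
class Short(_Typed, int): suffix = "s"
class Int(_Typed, int): suffix = ""
class Long(_Typed, int): suffix = "L"
class Float(_Typed, float): suffix = "f"
class Double(_Typed, float): suffix = "d"


class RawJson(str):
    """A JSON text attribute, serialized in single quotes."""


class Compound(dict):
    """A dict with dot access; keys that clash with dict methods need []."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


def to_snbt(value: object) -> str:
    """Serialize a value graph to SNBT (assumes simple, unquoted keys)."""
    if isinstance(value, _Typed):
        plain = float(value) if isinstance(value, float) else int(value)
        return f"{plain}{value.suffix}"
    if isinstance(value, bool):  # must come before the plain int check
        return "1b" if value else "0b"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float):
        return f"{value}d"
    if isinstance(value, RawJson):  # assumes no single quotes inside
        return "'" + value + "'"
    if isinstance(value, str):
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if isinstance(value, dict):
        return "{" + ",".join(f"{k}:{to_snbt(v)}" for k, v in value.items()) + "}"
    if isinstance(value, list):
        return "[" + ",".join(to_snbt(v) for v in value) + "]"
    raise TypeError(f"cannot serialize {type(value).__name__}")


chest = Compound(Items=[Compound(Slot=Byte(0), id="minecraft:stone", Count=Byte(54))])
chest.Items[0].Count += 10
print(to_snbt(chest))  # {Items:[{Slot:0b,id:"minecraft:stone",Count:64b}]}
```

Subclassing `int` and `float` keeps the values usable anywhere a plain number is expected, while the suffix survives a round trip. Only the left operand's type is preserved here, so `10 + Byte(3)` would still give a plain `int`.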

So is this worth implementing? As far as I can think of, NBT serialization would only be useful for the following commands:

  • data merge
  • summon
  • the following commands, when used with a block entity:
    • setblock
    • give
    • fill

Whereas the dumb deserializer above is useful for querying information about players, mobs, entities, block entities, etc.
