Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic packet #140

Open
wants to merge 15 commits into
base: main
Choose a base branch
from
Open

Generic packet #140

wants to merge 15 commits into from

Conversation

greglucas
Copy link
Collaborator

This is an attempt to refactor our xtce parsing to be more generic and allow any RawDataObject, not just ccsds packets.

  • RawPacketData has the read_* methods, RawCCSDSPacketData subclasses that and adds on cached properties to access specific header items.
  • Moved the ccsds continuation packet logic into the ccsds_generator to try and keep all ccsds logic in one place.

Thoughts/suggestions on how to make this better for other packet types are welcome!

  • I was thinking it might be nice if a user could pass in raw bytes directly to definition.parse_packet(bytes) and not our wrapped classes that add the reading methods. Perhaps we could wrap the input bytes into our own container class within the function itself which would remove some of the user-facing need to create a RawPacketData-like object with read_* methods on it.

closes #130

Checklist

  • Changes are fully implemented without dangling issues or TODO items
  • Deprecated/superseded code is removed or marked with deprecation warning
  • Current dependencies have been properly specified and old dependencies removed
  • New code/functionality has accompanying tests and any old tests have been updated to match any new assumptions
  • The changelog.md has been updated

Sorry, something went wrong.

@greglucas greglucas requested a review from medley56 as a code owner February 13, 2025 04:57
Copy link

codecov bot commented Feb 17, 2025

Codecov Report

Attention: Patch coverage is 97.36264% with 12 lines in your changes missing coverage. Please review.

Project coverage is 92.11%. Comparing base (bc05878) to head (11dc83b).

Files with missing lines Patch % Lines
space_packet_parser/ccsds.py 93.29% 11 Missing ⚠️
space_packet_parser/xtce/definitions.py 92.85% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #140      +/-   ##
==========================================
+ Coverage   91.89%   92.11%   +0.22%     
==========================================
  Files          36       37       +1     
  Lines        2553     2614      +61     
==========================================
+ Hits         2346     2408      +62     
+ Misses        207      206       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@medley56
Copy link
Member

@greglucas Can you rebase this? Then I'll review.

@greglucas
Copy link
Collaborator Author

Rebased.

Some other thoughts here:

  • Do we want to move Packet to common, and have a separate ccsds module for the generator and CCSDS-specific things instead of packets.py?
  • I'm wondering if there is a way we can avoid wrapping/subclassing Bytes objects. It isn't that bad, but we do already have the binary data in memory, so I'm wondering if we could do something like a memoryview(binary_data[packetstart:packetend]) to pass around pointers to the direct memory locations instead of creating new objects. I'm not sure this really matters in practice yet, but something to keep in mind. I'm mentioning it here in case there are thoughts on how we structure this for a future layout too. But that is likely too much to get into the next release at this point and I haven't thought about it enough.

Copy link
Member

@medley56 medley56 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to be really thorough. I had a few suggestions on semantics and a significant change in inheritance structure. One big question about how to make the definition-level packet_generator() agnostic to CCSDS.

self.pos += nbits
return int_data

class CCSDSPacketBytes(bytes):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intuitively, I would expect CCSDSPacketBytes to subclass RawPacketData and to be called CCSDSRawPacketData to make it clear that they are related. Perhaps I misunderstand but both RawPacketData and CCSDSPacketBytes are just wrappers around bytes, but every method for RawPacketData should also be applicable to CCSDSPacketBytes, right?

Maybe more simply, RawPacketData(bytes) -> PacketData(bytes) (data meaning bytes), and CCSDSPacketBytes(bytes) -> CCSDSPacketData(PacketData)?

CCSDSPacketData would be strictly Liskov in that every CCSDSPacketData is a valid PacketData object because we're not adding limitations, only functionality (for parsing header fields as properties).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Said in the other comment, but I don't think that the CCSDSPacketBytes should be "readable". It is just a stream of bytes, and there are some extra methods that do some nice things on the header bits for ease of access. The reader stuff within RawPacketData is really only useful for XTCE-parsing things. This is external to that and someone can just iterate over the raw bytes to count packets etc.. without any reading (like we are doing in some of the cli stuff).

Suggestions for improving this are welcome, but I don't think we should require someone to bring a RawPacketData subclass to the definition parsing, we should help them out and add the readers ourselves since that is really only meant for us.

class CCSDSPacketBytes(bytes):
"""Binary representation of a CCSDS packet.

Methods to extract the header fields are added to the raw bytes object.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The __str__ dunder is still calling this RawPacketData. Probably meant to change that.

class and used to keep track of the current parsing position (accessible through the `pos` attribute).
"""
pos = 0 # Current position in bits

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you want to give RawPacketData a __str__ dunder like CCSDSPacketBytes? If we follow the inheritance suggestion below, we could just define it once in the (this) parent class.

if not all((sequence_counts[i + 1] - sequence_counts[i]) % 16384 == 1
for i in range(len(sequence_counts) - 1)):
warnings.warn(f"Continuation packets for apid {ccsds_packet.apid} "
f"are not in sequence {sequence_counts}, skipping these packets.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be safe to just reorder these according the sequence counter, but maybe leave the warning even if we can fix it because it's probably not expected that the packets are out of order.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This accounts for gaps though too, so it isn't just re-ordering.

raw_data = _segmented_packets[ccsds_packet.apid][0]
# Add the continuation packets to the first packet, skipping the headers
for p in _segmented_packets[ccsds_packet.apid][1:]:
raw_data += p[header_length_bytes + secondary_header_bytes:]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still bothers me that we're simply losing headers and secondary headers by doing this. Could we store the ignored headers and secondary headers as a list instance attribute on the CCSDSPacketBytes object that we yield? See my comment above about bookkeeping this information.

# Add the continuation packets to the first packet, skipping the headers
for p in _segmented_packets[ccsds_packet.apid][1:]:
raw_data += p[header_length_bytes + secondary_header_bytes:]
yield CCSDSPacketBytes(raw_data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have a minor memory leak in _segmented_packets. We should pop the APID key of the _segmented_packets dict for this packet in order to yield the combined CCSDSPacketBytes object.

Upon further inspection, this probably isn't a huge deal because the biggest this dictionary will ever be is the number of distinct muxed APIDs in the packet stream being parsed. It still feels cleaner to me to pop the continued packet from the dict rather than just leaving it there.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every time we get a new START packet, we create a new storage list. But, I think you're right that this could grow if we get something like LAST, CONTINUATION, CONTINUATION, LAST, CONTINUATION, because there would be packets in the queue still because we never got another START. I added the pop in because I think that is a good way to handle this, thanks for the suggestion!

packet: packets.CCSDSPacket,
*,
root_container_name: Optional[str] = None) -> packets.CCSDSPacket:
def parse_bytes(self,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not just parsing any bytes object though. We're parsing bytes of packet data. I'd suggest naming this parse_packet_data. I'm noticing that I the idiom where "packet data" is always a bytes object and the "Packet" is a dict.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is related to some of the other comments about inheritance, but this is just any bytes object actually! That is the point of this new function, we want to allow bytes as input, whatever a user wants to pass to us bytes-like.

Once we get the bytes, we then wrap that in our own class that has the reader methods on it. This is why I didn't do the full inheritance from RawPacketData elsewhere on CCSDSPacketBytes. I want a user to be able to make their own bytes generator, I don't want them to have to make a RawPacketData(bytes) generator if that makes sense. Maybe there is a better way to do this though? I don't love that we are wrapping a bytes object in another bytes-like object, but this was the quickest I came up with.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see how this is generic now (not CCSDS only) but why do we have the layer of abstraction of parse_bytes calling parse_packet? I don't see parse_packet being used anywhere else. Is there a use case for parse_packet that is not being called by parse_bytes?

@@ -434,7 +482,7 @@ def packet_generator(
show_progress: bool = False,
buffer_read_size_bytes: Optional[int] = None,
skip_header_bytes: int = 0
) -> Iterator[Union[packets.CCSDSPacket, UnrecognizedPacketTypeError]]:
) -> Iterator[Union[packets.Packet, UnrecognizedPacketTypeError]]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if we had a non-CCSDS packet, would we need a separate generator method (and associated generator function)? This method sounds agnostic to CCSDS packets (packet_generator()) but internally it only works for CCSDS packets.

I'm not sure how to handle that better. i.e. I'm not sure how to decide whether to call the CCSDS generator or the non-CCSDS generator. Do we want the user telling us whether they are using CCSDS packets or not when initializing the packet definition object? That would make it awkward to support mixed definitions where some are CCSDS and others are not.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is definitely CCSDS specific, because that is where we started from. We could do something like "bring your own generator" as an input argument. But, this sort of gets at a definition probably shouldn't have the generator on it, the generator should be separate and then each yielded item can be parsed by the definition.parse_bytes() method. Or we would grow a separate generator for each yielded possibility.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per your Slack message, I think you suggested we make this generator part of the definition but I'm not sure I fully understand how that would work.

@medley56 medley56 mentioned this pull request Mar 8, 2025
4 tasks
greglucas added 13 commits March 7, 2025 21:29

Verified

This commit was signed with the committer’s verified signature.
greglucas Greg Lucas

Verified

This commit was signed with the committer’s verified signature.
greglucas Greg Lucas
This moves towards a generic packet interface and removes the need
for passing in bytes with a read_* interface, we now wrap that in
our own reader class to simplify that interaction.

Verified

This commit was signed with the committer’s verified signature.
greglucas Greg Lucas

Verified

This commit was signed with the committer’s verified signature.
greglucas Greg Lucas
We can add these properties back onto the main "Packet" class and
add a user warning when being accessed to make it easier to
transition to the new style and provide a helpful message.

Verified

This commit was signed with the committer’s verified signature.
greglucas Greg Lucas
Deprecate parse_ccsds_packet() as the XTCE spec doesn't need to know
about the ccsds standard. We should handle parsing raw bytes objects
and create the underlying objects for the users ourselves.

Verified

This commit was signed with the committer’s verified signature.
greglucas Greg Lucas
…a stream

Verified

This commit was signed with the committer’s verified signature.
greglucas Greg Lucas

Verified

This commit was signed with the committer’s verified signature.
greglucas Greg Lucas

Verified

This commit was signed with the committer’s verified signature.
greglucas Greg Lucas

Verified

This commit was signed with the committer’s verified signature.
greglucas Greg Lucas

Verified

This commit was signed with the committer’s verified signature.
greglucas Greg Lucas

Verified

This commit was signed with the committer’s verified signature.
greglucas Greg Lucas

Verified

This commit was signed with the committer’s verified signature.
greglucas Greg Lucas

Verified

This commit was signed with the committer’s verified signature.
greglucas Greg Lucas
…ave changed

Verified

This commit was signed with the committer’s verified signature.
greglucas Greg Lucas
It is specific to ccsds packets and hasn't been released yet, so
just keep it over there instead.
Copy link
Member

@medley56 medley56 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still am not convinced the packet_generator method on the definition object is truly generic because it's calling ccsds_generator internally to create the CCSDSPacketBytes objects for parsing. The parsing appears to be fully generic but the generation of the packet bytes seems to still rely on CCSDS.

def header(self) -> dict:
"""The header content of the packet."""
warnings.warn("The header property is deprecated and will be removed in a future release. "
"To access the header fields of a CCSDS packet, use the CCSDSPacketBytes class.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the intention that users should be creating Packets where the binary_data is a CCSDSPacketBytes object rather than a plain bytes object? Then they would call my_packet.binary_data.header? I think we should be more explicit about how to do this and maybe even explain that this is to support generic, non CCSDS packets. Also this applies to the warning in the user_data property.

packet: packets.CCSDSPacket,
*,
root_container_name: Optional[str] = None) -> packets.CCSDSPacket:
def parse_bytes(self,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see how this is generic now (not CCSDS only) but why do we have the layer of abstraction of parse_bytes calling parse_packet? I don't see parse_packet being used anywhere else. Is there a use case for parse_packet that is not being called by parse_bytes?

@@ -434,7 +482,7 @@ def packet_generator(
show_progress: bool = False,
buffer_read_size_bytes: Optional[int] = None,
skip_header_bytes: int = 0
) -> Iterator[Union[packets.CCSDSPacket, UnrecognizedPacketTypeError]]:
) -> Iterator[Union[packets.Packet, UnrecognizedPacketTypeError]]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per your Slack message, I think you suggested we make this generator part of the definition but I'm not sure I fully understand how that would work.

@@ -455,17 +503,8 @@ def packet_generator(
point for parsing. Default is taken from the parent XtcePacketDefinition object.
ccsds_headers_only : bool
Default False. If True, only parses the packet headers (does not use the provided packet definition).
``space_packet_parser.packets.ccsds_packet_generator`` can be used directly to parse only the CCSDS headers
``space_packet_parser.ccsds.ccsds_packet_generator`` can be used directly to parse only the CCSDS headers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the ccsds_headers_only kwarg be deprecated since packet_generator is supposed to be generic?

buffer_read_size_bytes=buffer_read_size_bytes,
show_progress=show_progress,
skip_header_bytes=skip_header_bytes):
for raw_packet_data in ccsds.ccsds_generator(binary_data,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this generator be default kwarg to packet_generator so users can pass custom non-CCSDS generators? I know you have a vision for this but this feels like we're only halfway to being truly generic because the generic packet_generator is hard-coded to use the CCSDS-specific bytes generator here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add the ability to parse any binary packet from XTCE
2 participants