note: warnings due to eval_smt.ml
#77 contains the SMT generator for datatypes but has no evaluation representation or semantics, so it probably makes sense to make the records in this PR a specialisation of ADTs. For the purposes of generating proofs:
I did think about this, but thinking about it more, it is a lot of effort and at least in my head would require infinite passes (not realistically but theoretically).
True
I would have to double check, but I believe load and store operations are the only things that can make pointers, so yeah, they could.
The current key is based on the bit offset, so it can show gaps in the record, but only gaps between fields or at the start; it cannot show gaps at the end.
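To illustrate the point about gaps, here is a minimal sketch. It assumes each field is described only by a bit offset and a bit size; the `field` type and the `gaps` function are hypothetical and not taken from the PR.

```ocaml
(* Hypothetical field layout: bit offset and bit size only. *)
type field = { offset : int; size : int }

(* Return (start, length) for every hole before or between fields.
   A hole after the last field cannot be reported, because the
   record carries no total size to compare against. *)
let gaps (fields : field list) : (int * int) list =
  let sorted = List.sort (fun a b -> compare a.offset b.offset) fields in
  let step (cursor, acc) f =
    let acc =
      if f.offset > cursor then (cursor, f.offset - cursor) :: acc else acc
    in
    (max cursor (f.offset + f.size), acc)
  in
  let _, acc = List.fold_left step (0, []) sorted in
  List.rev acc
```

For fields at bit offsets 8 (size 8) and 32 (size 16), this reports a hole at the start and one between the fields, but nothing after bit 48.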
Could / should this be extracted out into its own shared generator? I.e. all type declarations are named by the same function, which uses the ID.gen setup?
Meaning I should add / make the record stuff intrinsic operations instead?
What do you mean by subtyping here? As in records cannot have a field that has a type? I don't think records really have subtypes but are composed of types instead.
The pointer values could store this information if it is needed. Although it may be possible for pointers to be joined when they are pointing at different regions.
It shouldn't be intractable; we could use a modified taint analysis and, similarly to any other analysis, push the precise pointer semantics through as far as possible. In the worst case, when the safety of pointer operations degrades to a too-hard termination proof, we can overapproximate/widen by inserting a trapping coercion. Like I said, we can separate concerns by always inserting a trapping coercion initially and later writing analyses that clean it up and improve the precision. That's not to say it's necessarily the best idea; currently all our pointers are represented by bitvectors anyway, so we get some benefit from just tagging variables/values as storing pointers.
If you allow integer values to be record fields, you are making it impossible to define a finite layout for the record. I am saying you have to choose one well-defined semantics for your records: either (1) a record is an algebraic product type with disjoint fields that are accessed by a denotational accessor that is purely a name matching the field name, and record values are incomparable with any other type. In this case numeric offsets are just misleading. I believe the binsub paper is probably going to infer something more similar to (2), and we can abstract that to a type like (1) by using nested records when multiple-field accesses have been observed. (1), however, puts more burden on the transform, as we need a new memory representation to allow storing record values to memory without converting them to a bitvector representation. My point is just that we make a deliberate choice and don't end up with something messy in between these two.
Yeah, the question is more whether we keep this attached to the program to allow generating distinct types later; it might be useful. It would probably make sense to add a field to program storing a typing context for non-local type information, but I want to ensure that as much type information as possible is stored locally (i.e. attached to expressions, not in some external data structure) so analyses don't have to pass through a type context object in order to reason about types.
@katrinafyi looks like we use different formatters for nix, which one do you use?
I actually do the formatting myself, because I don't like the formatter styles (they all prioritise smaller diff over anything else). But if you want to use a formatter, that's fine too. I can suck it up (or make my own formatter).
Okay, I shouldn't be touching those files often anyway, and probably only flake.nix, but you may have to suck it up.
Adds two new types, Records and Pointers, primarily to support #58.
The value for Pointer is just a Bitvec.t and Records are [offset : Bitvec.t]. I think record fields should probably be any const from AllOps.const, or something similar, but this adds lots of complexity and importing issues.
Types
Records
Records consist of a list of fields; each field is defined by an offset, a size and a type.
Syntax:
{(offset): type, ...}
Pointers
Pointers consist of a lower and upper type.
Syntax:
ptr(lower_type, upper_type)
Operations
Records
FSET, FACCESS - Field set and field access
Pointers
PADD - Pointer addition
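As a rough illustration of the types and operations listed above, here is a minimal sketch. It is not the PR's implementation: all constructor and function names are hypothetical, plain `int` stands in for `Bitvec.t`, and the exact semantics of FSET, FACCESS and PADD are assumed from their names.

```ocaml
(* Hypothetical types mirroring the description: a record is a list of
   fields (offset, size, type); a pointer has a lower and upper type. *)
type ty =
  | Bitvector of int            (* width in bits *)
  | Record of field list
  | Pointer of ty * ty          (* lower type, upper type *)
and field = { offset : int; size : int; ty : ty }

(* Simplified values: int stands in for Bitvec.t (an assumption). *)
type value =
  | VBits of int
  | VRecord of (int * int) list (* offset -> field bits *)
  | VPointer of int

(* FACCESS: read the field stored at [offset], if there is one. *)
let faccess record offset =
  match record with
  | VRecord fields -> List.assoc_opt offset fields
  | _ -> None

(* FSET: return a record with the field at [offset] replaced or added. *)
let fset record offset bits =
  match record with
  | VRecord fields ->
      Some (VRecord ((offset, bits) :: List.remove_assoc offset fields))
  | _ -> None

(* PADD: move a pointer by an integer amount (assumed semantics). *)
let padd ptr delta =
  match ptr with
  | VPointer addr -> Some (VPointer (addr + delta))
  | _ -> None
```

Returning `option` keeps the sketch total; a real evaluator might instead reject ill-typed operands during type checking.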
Changes made to other works
- lib/lang/expr_smt - Added Record to logic type, extremely unsure about this
- lib/analysis/defuse_bool - Added basic logic about records and pointers. Field information isn't taken into consideration and would require a significant rework, I believe.