Releases: zeek/spicy

v1.14.0

18 Aug 10:33

New Functionality

  • GH-2028: New interprocedural optimizations.

    We added infrastructure for performing interprocedural optimizations and, as a first user, added a pass that removes unused function parameters in GH-2030. While this works on any code, it is mainly intended to simplify generated parser code for better runtime performance.

  • GH-1697: Remove some dead statements based on control and data flow.

    We now collect control and data flow information. We use this to detect and remove "dead statements", i.e., statements which are not seen by any other needed computations. Currently we handle two classes of dead statements:

    • assignments which are overwritten before being used
    • unreachable code, e.g., due to preceding return, break or throw

    The implementation does not yet cover all possible Spicy language constructs, so it is behind a feature flag and not enabled by default. To enable it, set the environment variable HILTI_OPTIMIZER_ENABLE_CFG=1 when compiling Spicy code with, e.g., spicyc.

    We encourage users to test this compilation mode and if possible use the compiled parsers in production. If parsers compiled this way show the intended runtime behavior in tests they should also be fine to use in production.

Changed Functionality

  • GH-2050: Prefer stdout over stderr for --help messages.

    Spicy tools now emit --help output to stdout instead of stderr.

  • GH-2068: Allow disabling building of tests.

    We added a new CMake option SPICY_ENABLE_TESTS which, if toggled on, forces building of test and benchmark binaries; it is ON by default. Projects building Spicy can use this flag to disable building of tests if they are not interested in them. We also provide a configure flag --disable-tests which has the effect of turning it off.

  • GH-1663: Speed up checking of iterator compatibility.

    We were previously using a control block which held a weak_ptr to the protected data. This was pretty inefficient for a number of reasons:

    • access to the controlled data always required a weak_ptr::lock which created a temporary shared_ptr copy and immediately destroyed it after access
    • to check whether the control block was expired we used lock instead of expired which introduced the same overhead
    • to check compatibility of iterators we compared shared_ptrs to the control data which again required full locks instead of using owner_before

    This manifested, e.g., in loops often being slower than necessary. We have now changed how we hold data to make iterating over collections cheaper.
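
    The three costs listed above can be illustrated with plain standard-library calls. This is only a sketch of the general weak_ptr trade-offs, not Spicy's actual runtime code; the Control type and the helper names are hypothetical.

```cpp
#include <memory>

// Hypothetical stand-in for the data a container shares with its iterators.
struct Control {};

// Costly variant: lock() materializes a temporary shared_ptr, which bumps
// the reference count and drops it again right after the check.
inline bool alive_via_lock(const std::weak_ptr<Control>& w) {
    return w.lock() != nullptr;
}

// Cheaper variant: expired() only inspects the use count.
inline bool alive_via_expired(const std::weak_ptr<Control>& w) {
    return ! w.expired();
}

// Two weak_ptrs share a control block exactly if neither is ordered before
// the other. No temporary shared_ptr is created for this comparison.
inline bool same_owner(const std::weak_ptr<Control>& a, const std::weak_ptr<Control>& b) {
    return ! a.owner_before(b) && ! b.owner_before(a);
}
```

    Both liveness checks give the same answer; the owner_before comparison also keeps working after the controlled data has been released.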

  • GH-2086: Fix scope resolution of local variables.

    If usage of a local comes before its declaration, we now no longer resolve that usage to this local. It'll either be resolved to an upper layer ID (if there is one of the same name), or rejected if it's otherwise unknown.

  • GH-2066: When C++ compilation fails, ask user for help.

    We do expect C++ code generated by Spicy to be valid, so C++ compiler errors in generated code are likely bugs. We now record the output of the C++ compiler in a dedicated file hilti-jit-error.log and ask users to file a ticket in case C++ compilation failed.

  • GH-1660: When printing anonymous bitfields inside a struct, lift up the fields.

    This now prints, e.g., [$fin=1, $rsv=0, $opcode=2, $remaining=255] instead of [$<anon>=(1, 0, 2, 255)].

    In addition, we also prettify non-anonymous bitfields. They now print as, e.g., [$y=(a: 4, b: 8)] instead of [$y=(4, 8)].

  • GH-1085: Allow registering a module twice.

    So far, if one compiled the same HILTI module twice, each into its own HLTO, then when loading the two HLTOs, the runtime system would skip the second instance. However, that's not really what we want: a module could intentionally be part of multiple HLTOs, in which case each should get its own copy of that module state (i.e., its globals).

    This change allows the same module to be registered multiple times, with the HLTO linker scope distinguishing between the instances at runtime, as usual. To make that work, we move computation of the scope from compile time to runtime, using the library's absolute path as the scope.

  • GH-1905: Fix operator precedence in Spicy grammar.

    We fixed the precedence of a number of operators to be closer to what users would expect from other languages like C++ or Python.

    • we reduced the precedence of the in operator
    • pre- and postfix operators ++ and -- now have the same precedence and are right associative
    • unary negate was changed to match the precedence of other unary operators.
  • Switch compilation to C++20.

    Like Zeek, Spicy now requires a C++20 compiler. As part of this change we cleaned up the implementation to take advantage of C++20 functionality in a number of places. We also moved from the external libraries linb::any to std::any, and ghc::filesystem to std::filesystem.

  • Update supported platforms.

    We dropped support for the following platforms:

    • debian-11
    • fedora-40

    We added support for

    • debian-13
    • fedora-42
  • GH-1660: Render all bitfield instances with included field names.

  • GH-2099: Fully implement iterator interface for set::Iterator.

  • GH-2052: Move calling convention from function to function type.

Bug fixes

  • GH-2057: Fix bytes iterator dereference operation.
  • GH-2065: Error for redefined locals from statement inits.
  • GH-2061: Fix cyclic usage of unit types inside other types.
  • GH-2074: Fix fiber abortion.
  • GH-2063: Fix C++ compilation issue with weak->strong refs.
  • GH-2064: Ensure generated typeinfos are declared before used.
  • GH-2044: Catch if methods are implemented multiple times.
  • GH-2078: Fix C++ output for constants of constant type.
  • GH-1988: Enforce that block-local declarations must be variables.
  • GH-1996: Catch exceptions in processInput gracefully.
  • GH-2091: Fix strong->value reference coercion in calls.
  • GH-2100: Add missing deref operations for struct try-member/has-member operators.
  • GH-2119: Fix missing inline functions in enum prototypes.
  • GH-2142, GH-2134: Complete information exposed for reflection in typeinfo.
  • GH-2135: Add &cxx-any-as-ptr attribute.

Documentation

  • GH-1905: Document operator precedence.

v1.11.6

17 Jul 14:49

Bug fixes

  • GH-2074: Fix fiber abortion.

    When aborting a fiber, we need to activate it once more, to then leave it for good by raising an AbortException. The problem was that this exception ended up being caught by user code because it was derived from std::exception. This change removes the base class so that the exception is guaranteed to go back to the managing fiber code, where we just ignore it.
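
    The effect of dropping the base class can be reproduced in isolation. The following is a minimal sketch rather than the runtime's fiber code; OldAbort, NewAbort, user_code and reaches_manager are hypothetical names chosen for illustration.

```cpp
#include <exception>

// Hypothetical "before" type: derived from std::exception.
struct OldAbort : std::exception {};

// Hypothetical "after" type: no standard base class.
struct NewAbort {};

// Stand-in for user code with a generic handler for standard exceptions.
template<typename Abort>
void user_code() {
    try {
        throw Abort{};
    } catch ( const std::exception& ) {
        // Swallows anything derived from std::exception.
    }
}

// Stand-in for the managing code: reports whether the abort got back to it.
template<typename Abort>
bool reaches_manager() {
    try {
        user_code<Abort>();
    } catch ( const Abort& ) {
        return true;
    }

    return false;
}
```

    Here reaches_manager<OldAbort>() is false because the user-level handler intercepts the exception, whereas reaches_manager<NewAbort>() is true.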

  • GH-2073: Prevent throwing naked exception when yielding from aborted fiber.

v1.13.2

16 Jul 14:19

Bug fixes

  • GH-2119: Fix missing inline functions in enum prototypes.

    Our prototype generation could miss function bodies for inline functions.

  • GH-2074: Fix fiber abortion.

    When aborting a fiber, we need to activate it once more, to then leave it for good by raising an AbortException. The problem was that this exception ended up being caught by user code because it was derived from std::exception. This change removes the base class so that the exception is guaranteed to go back to the managing fiber code, where we just ignore it.

v1.13.1

19 May 12:48

Bug fixes

  • GH-2057: Fix bytes iterator dereference operation.

v1.11.5

19 May 12:50

Bug fixes

  • GH-2057: Fix bytes iterator dereference operation.

v1.13.0

08 May 19:44

New Functionality

  • GH-1788: We now support decoding and encoding to UTF16, in particular
    the new UTF16LE and UTF16BE charsets for little and big endian
    encoding, respectively.

  • GH-1961: We now support creating type values in Spicy code. The
    primary use case for this is to pass type information to host
    applications, and debugging.

    A type value is typically created from either typeinfo(TYPE) or
    typeinfo(value), or coercion from an existing ID of a custom type
    like global T: type = MyStruct;. The resulting value can be printed,
    or stored in a variable of type type, e.g.,
    global bool_t: type = typeinfo(bool);.

  • GH-1971: Extend unit switch based on look-ahead to support blocks of
    items.

    In 1.12.0 we added support for grouping related unit fields in blocks;
    there the primary use case was if blocks to group fields with
    identical dependencies. We now also support such blocks inside unit
    switch constructs with lookahead so one can write the following
    code:

    # Parses either `a` followed by another `a`, or `b`.
    type X = unit {
        switch {
            -> {
                : b"a";
                : b"a";
            }
            -> : b"b";
        };
    };
  • GH-1538: Implement compound statements ({...}). This allows
    introducing local scopes, e.g., to group related code.

  • GH-1946: string's encode method gained an optional errors
    argument to influence error handling. The parameter defaults to
    DecodeErrorStrategy::REPLACE reproducing the previous implicit
    behavior.

  • GH-2010: bytes and string gained ends_with methods

  • GH-1965: Add support for case-insensitive matching to regular
    expressions.

    By adding an i flag to a regular expression pattern, it will now be
    matched case-insensitively (e.g. /foobar/i).

  • GH-1962: Add spicy-dump option to enable profiling.

Changed Functionality

  • GH-1981, GH-1982, GH-1991: We now catch more user errors in defining
    function overloads. Previously these would likely (hopefully) have
    failed in C++ compilation down the line, but are now cleanly rejected.

  • GH-1977: We now reject function overloads which only differ in their
    return type.

  • GH-1991: We now reject function prototypes without &cxxname.

    Since in Spicy global declarations can be in any order there is no
    need to introduce a function with a prototype if it is declared later.
    The only valid use case for function prototypes was if the function
    was implemented in C++ and bound to the Spicy name with &cxxname.

  • We have cleaned up our implementation for runtime type information,
    primarily intended for custom host applications.

    • type_info::Value instances obtained through runtime type
      introspection can now be rendered to a user-facing representation
      with a new to_string method.
    • The runtime representation was changed to correctly encode that
      tuple elements can remain unset. A Spicy-side tuple
      tuple<T1, T2, T3> now gets turned into
      std::tuple<std::optional<T1>, std::optional<T2>, std::optional<T3>>
      which captures the full semantics.
    • We added type information for types previously not exposed, namely
      Null, Nothing and List. We also fixed the exposed type
      information for result<void>.
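
    On the C++ side, the tuple mapping described above means each element has to be checked before use. A minimal sketch, assuming a Spicy tuple<uint64, string, bool>; the Example alias and the count_set helper are hypothetical and not part of the runtime API:

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <tuple>

// Hypothetical C++ view of a Spicy tuple<uint64, string, bool>: every
// element is wrapped in std::optional because it may remain unset.
using Example = std::tuple<std::optional<std::uint64_t>, std::optional<std::string>, std::optional<bool>>;

// Counts how many elements of such a tuple are set.
template<typename... Ts>
std::size_t count_set(const std::tuple<std::optional<Ts>...>& t) {
    return std::apply(
        [](const auto&... e) { return (std::size_t{0} + ... + (e.has_value() ? 1 : 0)); }, t);
}
```

    A value such as Example{42, std::nullopt, true} then has two set elements, and std::get<1> yields an empty optional rather than a default-constructed string.
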
  • GH-2011: We have optimized allocations for unit fields extracting
    vectors which should speed up extracting especially small and
    medium-size vectors.

  • GH-2035: We have dropped support for Ubuntu 20.04 (Focal Fossa) since
    it has reached end of standard support upstream.

  • GH-2026: Speed up matching of character classes in regexps

Bug fixes

  • GH-1580: Catch when functions aren't called.
  • GH-1961: Fix generated C++ prototype header.
  • GH-1966: Reject anonymous units in variables and fields.
  • GH-1967: Fix inactive stack size check during module initialization.
  • GH-1968: Fix coercion of function call arguments.
  • GH-1976: Fix unit &max-size not returning to proper loc.
  • GH-2007: Fix using &try with &max-size, and potentially other
    cases.
  • GH-2016: Fix &size expressions evaluating multiple times.
  • GH-2038: Prevent escape of non-HILTI exception in lower-level driver
    functions.
  • GH-2047: Make sure bytes::to[U]Int returns runtime integers.
  • GH-2049: Add #include <cstdint> for fixed-width integers

Documentation

  • GH-1155: Document iteration over maps/set/vectors.
  • GH-1963: Document assert-exception.
  • GH-1964: Document use of $$ inside &{while,until,until-including}.
  • GH-1973: Remove documentation of unsupported &nosub.
  • GH-1974: Add documentation on how to interpret stack traces involving
    fibers.
  • GH-1975: Fix possibly-incorrect custom host compile command
  • GH-2039: Touchup docs style section.
  • GH-1970, GH-2003: Fix minor typos in documentation.

v1.11.4

08 May 19:44

Bug fixes

  • GH-2047: Make sure bytes::to[U]Int returns runtime integers.
  • GH-2049: Fix building with GCC15.
  • GH-1999, GH-2004: Adjust build setup for cmake-4.
  • GH-2038: Prevent escape of non-HILTI exception in lower-level driver functions.
  • GH-1918: Fix potential segfault with stream iterators.
  • GH-1871: Fix &max-size on unit containing a switch.

v1.12.0

06 Jan 14:58

New Functionality

  • We now support if around a block of unit items:

    type X = unit {
         x: uint8;
    
         if ( self.x == 1 ) {
             a1: bytes &size=2;
             a2: bytes &size=2;
         };
    };
    

    One can also add an else-block:

    type X = unit {
         x: uint8;
    
         if ( self.x == 1 ) {
             a1: bytes &size=2;
             a2: bytes &size=2;
         }
         else {
             b1: bytes &size=2;
             b2: bytes &size=2;
         };
    };
    
  • We now support attaching an %error handler to an individual field:

    type Test = unit {
        a: b"A";
        b: b"B" %error { print "field B %error", self; }
        c: b"C";
    };
    

    With input AxC, that handler will trigger, whereas with ABx it won't. If the unit had a unit-wide %error handler as well, that one would trigger in both cases (i.e., for b, in addition to its field local handler).

    The handler can also be provided separately from the field:

    on b %error { ... }
    

    In that separate version, one can receive the error message as well by declaring a corresponding string parameter:

    on b(msg: string) %error { ... }
    

    This works externally, from outside the unit, as well:

    on Test::b(msg: string) %error { ... }
    
  • GH-1856: We added support for specifying a dedicated error message for &requires failures.

    This now allows creating custom error messages when a &requires condition fails. Example:

    type Foo = unit {
        x: uint8 &requires=($$ == 1 : error"Deep trouble!'");
    
        # or, shorter:
        y: uint8 &requires=($$ == 1 : "Deep trouble!'");
    };
    

    This is powered by a new condition test expression COND : ERROR.

  • We reworked C++ code generation so now many parsers should compile faster. This is accomplished by both improved dependency tracking when emitting C++ code for a module as well as by a couple of new peephole optimization passes which additionally reduced the emitted code.

Changed Functionality

  • Add CMAKE_CXX_FLAGS to HILTI_CONFIG_RUNTIME_LD_FLAGS.
  • Speed up compilation of many parsers by streamlining generated C++ code.
  • Add starts_with, split, split1, lower and upper methods to string.
  • GH-1874: Add new library function spicy::bytes_to_mac.
  • Optimize spicy::bytes_to_hexstring and spicy::bytes_to_mac.
  • Improve validation of attributes so incompatible or invalid attributes should be rejected more reliably.
  • Optimize parsing for bytes of fixed size as well as literals.
  • Add a couple of peephole optimizations to reduce emitted C++ code.
  • GH-1790: Provide proper error message when trying to access an unknown unit field.
  • GH-1792: Prioritize error message reporting unknown field.
  • GH-1803: Fix namespacing of hilti IDs in Spicy-side diagnostic output.
  • GH-1895: No longer escape backslashes when printing strings or bytes.
  • GH-1857: Support &requires for individual vector items.
  • GH-1859: Improve error message when a unit parameter is used as a field.
  • GH-1898: Disallow attributes on "type aliases".
  • GH-1938: Deprecate &count attribute.

Bug fixes

  • GH-1815: Disallow expanding limited View's again with limit.
  • Fix to_uint(ByteOrder) for empty byte ranges.
  • Fix undefined shifts of 32bit integer in toInt().
  • GH-1817: Prevent null ptr dereference when looking on nodes without Scope.
  • Fix use of move'd from variable.
  • GH-1823: Don't qualify magic linker symbols with C++ namespace.
  • Fix diagnostics seen when compiling with GCC.
  • GH-1852: Fix skip with units.
  • GH-1832: Fail for vectors with bytes but no stop.
  • GH-1860: Fix parsing for vectors of literals.
  • GH-1847: Fix resynchronization issue with trimmed input.
  • GH-1844: Fix nested look-ahead parsing.
  • GH-1842: Fix when input redirection becomes visible.
  • GH-1846: Fix bug with capture groups.
  • GH-1875: Fix potential nullptr dereference when comparing streams.
  • GH-1867: Fix infinite loops with recursive types.
  • GH-1868: Associate source code locations with current fiber instead of current thread.
  • GH-1871: Fix &max-size on unit containing a switch.
  • GH-1791: Fix usage of &convert with units requiring parameters.
  • GH-1858: Fix the literals parsers not following coercions.
  • GH-1893: Encompass child node's location in parent.
  • GH-1919: Validate that sets are sortable.
  • GH-1918: Fix potential segfault with stream iterators.
  • GH-1856: Disallow dereferencing a result<void> value.
  • Fix issue with type inference for result constructor.

Documentation

  • Redo error handling docs
  • Document continue statements.
  • GH-1063: Document arguments to new operator.
  • Updates <bytes>.to_int()/<bytes>.to_uint() documentation.
  • GH-1914: Make $$ documentation more precise.
  • Fix doc code snippet that won't compile.

v1.11.3

02 Oct 10:27

Bug fixes

  • GH-1846: Fix bug with capture groups.

    When extracting the data matching capture groups we'd take it from the beginning of the stream, not the beginning of the current view, even though the latter is what we are matching against.
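
    The kind of offset mix-up described here can be sketched with plain string views. This is an illustration under simplified assumptions, not the runtime's stream or view types; Capture and extract are hypothetical names.

```cpp
#include <cstddef>
#include <string_view>

// Offsets of a capture group, relative to the start of the matched input.
struct Capture {
    std::size_t begin;
    std::size_t end;
};

// Extracts the captured data relative to `base`. The result is only correct
// if `base` is the same input the offsets were computed against.
inline std::string_view extract(std::string_view base, Capture c) {
    return base.substr(c.begin, c.end - c.begin);
}
```

    With a stream "HEADERkey=value", a view starting at offset 6 and a capture at offsets 4 to 9 of that view, extracting against the view gives "value", while extracting against the stream gives "ERkey".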

  • Add missing trim after matching a regular expression.

  • GH-1875: Fix potential nullptr dereference when comparing streams.

    Because we are operating on unsafe iterators, we need to catch when one goes out of bounds.

  • GH-1842: Fix when input redirection becomes visible.

    With &parse-at/from we were updating the internal state on our current position immediately, meaning the update was already visible when evaluating other attributes on the same field afterwards, which is unexpected.

  • GH-1844: Fix nested look-ahead parsing.

    When parsing nested vectors all using look-ahead, we need to return control to the upper level when an inner look-ahead isn't found.

    This may change the error message for "normal" look-ahead parsing (see test baseline), but the new one seems fine and potentially even better.

v1.11.2

19 Sep 11:14

Bug fixes

  • GH-1860: Fix parsing for vectors of literals.

    This was broken in two ways:

    1. with the (LITERAL)[] syntax, the parser would not recognize literals using type constructors
    2. with the syntax LITERAL[], we'd try to store the parsed value into a vector
  • GH-1847: Fix resynchronization issue with trimmed input.

    When input had been trimmed, View::advanceToNextData could end up returning a view starting ahead of the valid area.

  • GH-1852: Fix skip with units.

    For unit parsing with skip, we would create a temporary instance but wouldn't properly initialize it, meaning for example that parameters weren't available. We now generally fully initialize any destination, even if temporary.