Commit 016ec10

Merge pull request #248 from Logofile/sync
Documentation change
2 parents c14731f + 96c7edd commit 016ec10

14 files changed: +192 -157 lines changed

content/best-practices/no-cargo-cults.md

+3-3
Original file line number | Diff line number | Diff line change
@@ -7,9 +7,9 @@ type = "docs"
77

88
Do not
99
[cargo cult](https://en.wikipedia.org/wiki/Cargo_cult_programming)
10-
settings in proto files. If \
11-
you are creating a new proto file based on existing schema definitions, don't
12-
apply option settings except for those that you understand the need for.
10+
settings in proto files. If you are creating a new proto file based on existing
11+
schema definitions, don't apply option settings except for those that you
12+
understand the need for.
1313

1414
## Best Practices Specific to Editions {#editions}
1515

content/getting-started/pythontutorial.md

+9-37
@@ -76,14 +76,16 @@ each field in the message. Here is the `.proto` file that defines your messages,
7676
`addressbook.proto`.
7777

7878
```proto
79-
syntax = "proto2";
79+
edition = "2023";
8080
8181
package tutorial;
8282
83+
option features.field_presence = EXPLICIT;
84+
8385
message Person {
84-
optional string name = 1;
85-
optional int32 id = 2;
86-
optional string email = 3;
86+
string name = 1;
87+
int32 id = 2;
88+
string email = 3;
8789
8890
enum PhoneType {
8991
PHONE_TYPE_UNSPECIFIED = 0;
@@ -93,8 +95,8 @@ message Person {
9395
}
9496
9597
message PhoneNumber {
96-
optional string number = 1;
97-
optional PhoneType type = 2 [default = PHONE_TYPE_HOME];
98+
string number = 1;
99+
PhoneType type = 2 [default = PHONE_TYPE_HOME];
98100
}
99101
100102
repeated PhoneNumber phones = 4;
@@ -135,39 +137,9 @@ less-commonly used optional elements. Each element in a repeated field requires
135137
re-encoding the tag number, so repeated fields are particularly good candidates
136138
for this optimization.
137139

138-
Each field must be annotated with one of the following modifiers:
139-
140-
- `optional`: the field may or may not be set. If an optional field value
141-
isn't set, a default value is used. For simple types, you can specify your
142-
own default value, as we've done for the phone number `type` in the example.
143-
Otherwise, a system default is used: zero for numeric types, the empty
144-
string for strings, false for bools. For embedded messages, the default
145-
value is always the "default instance" or "prototype" of the message, which
146-
has none of its fields set. Calling the accessor to get the value of an
147-
optional (or required) field which has not been explicitly set always
148-
returns that field's default value.
149-
- `repeated`: the field may be repeated any number of times (including zero).
150-
The order of the repeated values will be preserved in the protocol buffer.
151-
Think of repeated fields as dynamically sized arrays.
152-
- `required`: a value for the field must be provided, otherwise the message
153-
will be considered "uninitialized". Serializing an uninitialized message
154-
will raise an exception. Parsing an uninitialized message will fail. Other
155-
than this, a required field behaves exactly like an optional field.
156-
157-
{{% alert title="Important" color="warning" %}} **Required Is Forever**
158-
You should be very careful about marking fields as `required`. If at some point
159-
you wish to stop writing or sending a required field, it will be problematic to
160-
change the field to an optional field -- old readers will consider messages
161-
without this field to be incomplete and may reject or drop them unintentionally.
162-
You should consider writing application-specific custom validation routines for
163-
your buffers instead. Within Google, `required` fields are strongly disfavored;
164-
most messages defined in proto2 syntax use `optional` and `repeated` only.
165-
(Proto3 does not support `required` fields at all.)
166-
{{% /alert %}}
167-
168140
You'll find a complete guide to writing `.proto` files -- including all the
169141
possible field types -- in the
170-
[Protocol Buffer Language Guide](/programming-guides/proto2).
142+
[Protocol Buffer Language Guide](/programming-guides/editions).
171143
Don't go looking for facilities similar to class inheritance, though -- protocol
172144
buffers don't do that.
173145

content/installation.md

+3-2
Original file line numberDiff line numberDiff line change
@@ -23,14 +23,15 @@ binaries, follow these instructions:
2323

2424
```sh
2525
PB_REL="https://github.com/protocolbuffers/protobuf/releases"
26-
curl -LO $PB_REL/download/v< param protoc-version >/protoc-< param protoc-version >-linux-x86_64.zip
26+
curl -LO $PB_REL/download/v30.2/protoc-30.2-linux-x86_64.zip
27+
2728
```
2829

2930
2. Unzip the file under `$HOME/.local` or a directory of your choice. For
3031
example:
3132

3233
```sh
33-
unzip protoc-< param protoc-version >-linux-x86_64.zip -d $HOME/.local
34+
unzip protoc-30.2-linux-x86_64.zip -d $HOME/.local
3435
```
3536

3637
3. Update your environment's path variable to include the path to the `protoc`

content/news/2025-03-18.md

+14
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
+++
2+
title = "Changes Announced on March 18, 2025"
3+
linkTitle = "March 18, 2025"
4+
toc_hide = "true"
5+
description = "Changes announced for Protocol Buffers on March 18, 2025."
6+
type = "docs"
7+
+++
8+
9+
## Dropping Ruby 3.0 Support
10+
11+
As per our official
12+
[Ruby support policy](https://cloud.google.com/ruby/getting-started/supported-ruby-versions),
13+
we will be dropping support for Ruby 3.0 and lower in Protobuf version 31, due
14+
to release in April, 2025. The minimum supported Ruby version will be 3.1.

content/news/v31.md

+23
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
+++
2+
title = "News Announcements for Version 31.x"
3+
linkTitle = "Version 31.x"
4+
toc_hide = "true"
5+
description = "Changes announced for Protocol Buffers version 31.x."
6+
type = "docs"
7+
+++
8+
9+
The following announcements are specific to Version 31.x. For information
10+
presented chronologically, see [News](/news).
11+
12+
The following sections cover planned breaking changes in the v31 release,
13+
expected in 2025 Q2. Also included are some changes that aren't breaking but may
14+
require action on your part. These describe changes as we anticipate them being
15+
implemented, but due to the flexible nature of software some of these changes
16+
may not land or may vary from how they are described in this topic.
17+
18+
### Dropping Ruby 3.0 Support
19+
20+
As per our official
21+
[Ruby support policy](https://cloud.google.com/ruby/getting-started/supported-ruby-versions),
22+
we will be dropping support for Ruby 3.0. The minimum supported Ruby version
23+
will be 3.1.

content/programming-guides/editions.md

+5-1
@@ -1003,7 +1003,11 @@ following rules:
10031003
effect as if you had cast the number to that type in C++ (for example, if a
10041004
64-bit number is read as an int32, it will be truncated to 32 bits).
10051005
* `sint32` and `sint64` are compatible with each other but are *not*
1006-
compatible with the other integer types.
1006+
compatible with the other integer types. If the value written was between
1007+
INT_MIN and INT_MAX inclusive it will parse as the same value with either
1008+
type. If an sint64 value was written outside of that range and parsed as an
1009+
sint32, the varint is truncated to 32 bits and then zigzag decoding occurs
1010+
(which will cause a different value to be observed).
10071011
* `string` and `bytes` are compatible as long as the bytes are valid UTF-8.
10081012
* Embedded messages are compatible with `bytes` if the bytes contain an
10091013
encoded instance of the message.
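
The `sint64`-to-`sint32` rule added in this hunk can be checked with a small model. The following is a minimal Python sketch, not the parser's actual code: the helper names are hypothetical, and it assumes that "truncated to 32 bits" means keeping the low 32 bits of the zigzag-encoded value before decoding.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def zigzag_encode64(n):
    """Zigzag-encode a signed 64-bit integer into an unsigned 64-bit value."""
    return ((n << 1) ^ (n >> 63)) & MASK64


def zigzag_decode32(z):
    """Decode an unsigned 32-bit zigzag value into a signed 32-bit integer."""
    return (z >> 1) ^ -(z & 1)


def read_sint64_as_sint32(n):
    """Model writing n as sint64 and parsing it as sint32.

    Assumption: the parser keeps the low 32 bits of the varint,
    then applies zigzag decoding.
    """
    truncated = zigzag_encode64(n) & MASK32
    return zigzag_decode32(truncated)


# Values inside [INT_MIN, INT_MAX] survive the round trip unchanged.
print(read_sint64_as_sint32(-5))          # -5
print(read_sint64_as_sint32(2**31 - 1))   # 2147483647
# Values outside that range come back as a different number.
print(read_sint64_as_sint32(2**31))       # 0
print(read_sint64_as_sint32(-2**31 - 1))  # -1
```

Under this model, INT_MAX + 1 zigzag-encodes to 2^32, so its low 32 bits are all zero and the reader observes 0.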

content/programming-guides/encoding.md

+52-66
@@ -29,13 +29,15 @@ discuss aspects of the wire format.
2929
The Protoscope tool can also dump encoded protocol buffers as text. See
3030
https://github.com/protocolbuffers/protoscope/tree/main/testdata for examples.
3131

32+
All examples in this topic assume that you are using Edition 2023 or later.
33+
3234
## A Simple Message {#simple}
3335

3436
Let's say you have the following very simple message definition:
3537

3638
```proto
3739
message Test1 {
38-
optional int32 a = 1;
40+
int32 a = 1;
3941
}
4042
```
4143

@@ -241,7 +243,7 @@ Consider this message schema:
241243

242244
```proto
243245
message Test2 {
244-
optional string b = 2;
246+
string b = 2;
245247
}
246248
```
247249

@@ -275,7 +277,7 @@ an embedded message of our original example message, `Test1`:
275277

276278
```proto
277279
message Test3 {
278-
optional Test1 c = 3;
280+
Test1 c = 3;
279281
}
280282
```
281283

@@ -293,36 +295,49 @@ and a length of 3, exactly the same way as strings are encoded.
293295
In Protoscope, submessages are quite succinct. ` ``1a03089601`` ` can be written
294296
as `3: {1: 150}`.
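
To make the byte string `1a03089601` concrete, here is a minimal Python sketch that builds it by hand. It is an illustration only: the helper names (`encode_varint`, `tag`, `encode_test1`, `encode_test3`) are hypothetical and not part of any protobuf library, and the varint helper handles non-negative values only.

```python
VARINT = 0  # wire type for int32 and other varint-encoded scalars
LEN = 2     # wire type for strings, bytes, and embedded messages


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def tag(field_number, wire_type):
    """Build the record tag: (field_number << 3) | wire_type."""
    return encode_varint((field_number << 3) | wire_type)


def encode_test1(a):
    """Test1 { int32 a = 1; } as a single VARINT record."""
    return tag(1, VARINT) + encode_varint(a)


def encode_test3(test1_payload):
    """Test3 { Test1 c = 3; } as a LEN record wrapping the submessage bytes."""
    return tag(3, LEN) + encode_varint(len(test1_payload)) + test1_payload


print(encode_test3(encode_test1(150)).hex())  # 1a03089601
```

The outer record is the tag `1a` (field 3, `LEN`), the length `03`, and then the three bytes `089601` that encode `Test1` on its own.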
295297

296-
## Optional and Repeated Elements {#optional}
298+
## Missing Elements {#optional}
297299

298-
Missing `optional` fields are easy to encode: we just leave out the record if
300+
Missing fields are easy to encode: we just leave out the record if
299301
it's not present. This means that "huge" protos with only a few fields set are
300302
quite sparse.
301303

302-
`repeated` fields are a bit more complicated. Ordinary (not [packed](#packed))
303-
repeated fields emit one record for every element of the field. Thus, if we have
304+
<span id="packed"></span>
305+
306+
## Repeated Elements {#repeated}
307+
308+
Starting in Edition 2023, `repeated` fields of a primitive type
309+
(any [scalar type](/programming-guides/proto2#scalar)
310+
that is not `string` or `bytes`) are ["packed"](/editions/features#repeated_field_encoding) by default.
311+
312+
Packed `repeated` fields, instead of being encoded as one
313+
record per entry, are encoded as a single `LEN` record that contains each
314+
element concatenated. To decode, elements are decoded from the `LEN` record one
315+
by one until the payload is exhausted. The start of the next element is
316+
determined by the length of the previous, which itself depends on the type of
317+
the field. Thus, if we have:
304318

305319
```proto
306320
message Test4 {
307-
optional string d = 4;
308-
repeated int32 e = 5;
321+
string d = 4;
322+
repeated int32 e = 6;
309323
}
310324
```
311325

312326
and we construct a `Test4` message with `d` set to `"hello"`, and `e` set to
313-
`1`, `2`, and `3`, this *could* be encoded as `` `220568656c6c6f280128022803`
314-
``, or written out as Protoscope,
327+
`1`, `2`, and `3`, this *could* be encoded as `` `3206038e029ea705` ``, or
328+
written out as Protoscope,
315329

316330
```proto
317331
4: {"hello"}
318-
5: 1
319-
5: 2
320-
5: 3
332+
6: {3 270 86942}
321333
```
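
The packed record for field 6 can be reproduced by hand. The following Python sketch is a rough illustration, with hypothetical helper names, that handles non-negative values only. It encodes the values 3, 270, and 86942 shown in the Protoscope text as one `LEN` record.

```python
LEN = 2  # wire type used for packed repeated fields


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_packed_varints(field_number, values):
    """Concatenate the element varints and wrap them in a single LEN record."""
    payload = b"".join(encode_varint(v) for v in values)
    key = encode_varint((field_number << 3) | LEN)
    return key + encode_varint(len(payload)) + payload


print(encode_packed_varints(6, [3, 270, 86942]).hex())  # 3206038e029ea705
```

The elements carry no per-element tag. The tag `32` and the length `06` appear once, followed by `03`, `8e02`, and `9ea705`.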
322334

323-
However, records for `e` do not need to appear consecutively, and can be
324-
interleaved with other fields; only the order of records for the same field with
325-
respect to each other is preserved. Thus, this could also have been encoded as
335+
However, if the repeated field is set to expanded (overriding the default packed
336+
state) or is not packable (strings and messages) then an entry for each
337+
individual value is encoded. Also, records for `e` do not need to appear
338+
consecutively, and can be interleaved with other fields; only the order of
339+
records for the same field with respect to each other is preserved. Thus, this
340+
could look like the following:
326341

327342
```proto
328343
5: 1
@@ -331,6 +346,24 @@ respect to each other is preserved. Thus, this could also have been encoded as
331346
5: 3
332347
```
333348

349+
Only repeated fields of primitive numeric types can be declared "packed". These
350+
are types that would normally use the `VARINT`, `I32`, or `I64` wire types.
351+
352+
Note that although there's usually no reason to encode more than one key-value
353+
pair for a packed repeated field, parsers must be prepared to accept multiple
354+
key-value pairs. In this case, the payloads should be concatenated. Each pair
355+
must contain a whole number of elements. The following is a valid encoding of
356+
the same message above that parsers must accept:
357+
358+
```proto
359+
6: {3 270}
360+
6: {86942}
361+
```
362+
363+
Protocol buffer parsers must be able to parse repeated fields that were compiled
364+
as `packed` as if they were not packed, and vice versa. This permits adding
365+
`[packed=true]` to existing fields in a forward- and backward-compatible way.
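
A reader that follows both rules above might look like the Python sketch below. It is a simplified model, not the protobuf runtime: the function names are hypothetical, only the `VARINT` and `LEN` wire types are handled, and values are returned as raw unsigned varints with no sign conversion.

```python
def decode_varint(buf, pos):
    """Decode one varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_repeated_varints(buf, field_number):
    """Collect a repeated varint field from packed and expanded records alike."""
    values = []
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:  # VARINT: one expanded element
            value, pos = decode_varint(buf, pos)
            if number == field_number:
                values.append(value)
        elif wire_type == 2:  # LEN: a packed run, or some other field to skip
            length, pos = decode_varint(buf, pos)
            end = pos + length
            if number == field_number:
                while pos < end:
                    value, pos = decode_varint(buf, pos)
                    values.append(value)
            pos = end
        else:
            raise ValueError("wire type not handled in this sketch")
    return values


one_record = bytes.fromhex("3206038e029ea705")     # 6: {3 270 86942}
two_records = bytes.fromhex("3203038e0232039ea705")  # 6: {3 270} 6: {86942}
expanded = bytes.fromhex("3003308e02309ea705")     # 6: 3  6: 270  6: 86942

print(decode_repeated_varints(one_record, 6))   # [3, 270, 86942]
print(decode_repeated_varints(two_records, 6))  # [3, 270, 86942]
print(decode_repeated_varints(expanded, 6))     # [3, 270, 86942]
```

All three inputs yield the same list, which is why a field can switch between packed and expanded encoding without breaking existing readers.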
366+
334367
### Oneofs {#oneofs}
335368

336369
[`Oneof` fields](/programming-guides/proto2#oneof) are
@@ -368,53 +401,6 @@ message.MergeFrom(message2);
368401
This property is occasionally useful, as it allows you to merge two messages (by
369402
concatenation) even if you do not know their types.
370403

371-
### Packed Repeated Fields {#packed}
372-
373-
Starting in v2.1.0, `repeated` fields of a primitive type
374-
(any [scalar type](/programming-guides/proto2#scalar)
375-
that is not `string` or `bytes`) can be declared as "packed". In proto2 this is
376-
done using the field option `[packed=true]`. In proto3 it is the default.
377-
378-
Instead of being encoded as one record per entry, they are encoded as a single
379-
`LEN` record that contains each element concatenated. To decode, elements are
380-
decoded from the `LEN` record one by one until the payload is exhausted. The
381-
start of the next element is determined by the length of the previous, which
382-
itself depends on the type of the field.
383-
384-
For example, imagine you have the message type:
385-
386-
```proto
387-
message Test5 {
388-
repeated int32 f = 6 [packed=true];
389-
}
390-
```
391-
392-
Now let's say you construct a `Test5`, providing the values 3, 270, and 86942
393-
for the repeated field `f`. Encoded, this gives us `` `3206038e029ea705` ``, or
394-
as Protoscope text,
395-
396-
```proto
397-
6: {3 270 86942}
398-
```
399-
400-
Only repeated fields of primitive numeric types can be declared "packed". These
401-
are types that would normally use the `VARINT`, `I32`, or `I64` wire types.
402-
403-
Note that although there's usually no reason to encode more than one key-value
404-
pair for a packed repeated field, parsers must be prepared to accept multiple
405-
key-value pairs. In this case, the payloads should be concatenated. Each pair
406-
must contain a whole number of elements. The following is a valid encoding of
407-
the same message above that parsers must accept:
408-
409-
```proto
410-
6: {3 270}
411-
6: {86942}
412-
```
413-
414-
Protocol buffer parsers must be able to parse repeated fields that were compiled
415-
as `packed` as if they were not packed, and vice versa. This permits adding
416-
`[packed=true]` to existing fields in a forward- and backward-compatible way.
417-
418404
### Maps {#maps}
419405

420406
Map fields are just a shorthand for a special kind of repeated field. If we have
@@ -430,8 +416,8 @@ this is actually the same as
430416
```proto
431417
message Test6 {
432418
message g_Entry {
433-
optional string key = 1;
434-
optional int32 value = 2;
419+
string key = 1;
420+
int32 value = 2;
435421
}
436422
repeated g_Entry g = 7;
437423
}

content/programming-guides/field_presence.md

+24-10
@@ -13,12 +13,6 @@ are two different manifestations of presence for protobufs: *implicit presence*,
1313
where the generated message API stores field values (only), and *explicit
1414
presence*, where the API also stores whether or not a field has been set.
1515

16-
Historically, proto2 has mostly followed *explicit presence*, while proto3
17-
exposes only *implicit presence* semantics. Singular proto3 fields of basic
18-
types (numeric, string, bytes, and enums) which are defined with the `optional`
19-
label have *explicit presence*, like proto2 (this feature is enabled by default
20-
as release 3.15).
21-
2216
{{% alert title="Note" color="note" %}} We
2317
recommend always adding the `optional` label for proto3 basic types. This
2418
provides a smoother path to editions, which uses explicit presence by
@@ -179,10 +173,8 @@ affirmatively expose presence, although the same set of hazzer methods may not
179173
generated as in proto2 APIs.
180174

181175
This default behavior of not tracking presence without the `optional` label is
182-
different from the proto2 behavior. We reintroduced
183-
[explicit presence](/editions/features#field_presence) as
184-
the default in edition 2023. We recommend using the `optional` field with proto3
185-
unless you have a specific reason not to.
176+
different from the proto2 behavior. We recommend using the `optional` label with
177+
proto3 unless you have a specific reason not to.
186178

187179
Under the *implicit presence* discipline, the default value is synonymous with
188180
"not present" for purposes of serialization. To notionally "clear" a field (so
@@ -195,6 +187,28 @@ required to have an enumerator value which maps to 0. By convention, this is an
195187
the domain of valid values for the application, this behavior can be thought of
196188
as tantamount to *explicit presence*.
197189

190+
### Presence in Editions APIs
191+
192+
This table outlines whether presence is tracked for fields in editions APIs
193+
(both for generated APIs and using dynamic reflection):
194+
195+
Field type | Explicit Presence
196+
-------------------------------------------- | -----------------
197+
Singular numeric (integer or floating point) | ✔️
198+
Singular enum | ✔️
199+
Singular string or bytes | ✔️
200+
Singular message&#8224; | ✔️
201+
Repeated |
202+
Oneofs&#8224; | ✔️
203+
Maps |
204+
205+
&#8224; Messages and oneofs have never had implicit presence, and editions
206+
doesn't allow you to set `field_presence = IMPLICIT`.
207+
208+
Editions-based APIs track field presence explicitly, similarly to proto2, unless
209+
`features.field_presence` is set to `IMPLICIT`. Similar to proto2 APIs,
210+
editions-based APIs do not track presence explicitly for repeated fields.
211+
198212
## Semantic Differences {#semantic-differences}
199213

200214
The *implicit presence* serialization discipline results in visible differences
