diff --git a/content/programming-guides/encoding.md b/content/programming-guides/encoding.md
index 265df609..7d5dd580 100644
--- a/content/programming-guides/encoding.md
+++ b/content/programming-guides/encoding.md
@@ -81,15 +81,15 @@ And here is 150, encoded as `` `9601` `` -- this is a bit more complicated:
 How do you figure out that this is 150? First you drop the MSB from each byte,
 as this is just there to tell us whether we've reached the end of the number (as
 you can see, it's set in the first byte as there is more than one byte in the
-varint). Then we concatenate the 7-bit payloads, and interpret it as a
-little-endian, 64-bit unsigned integer:
+varint). These 7-bit payloads are in little-endian order. Convert to big-endian
+order, concatenate, and interpret as an unsigned 64-bit integer:

 ```proto
 10010110 00000001        // Original inputs.
  0010110  0000001        // Drop continuation bits.
- 0000001  0010110        // Put into little-endian order.
-         10010110        // Concatenate.
- 128 + 16 + 4 + 2 = 150  // Interpret as integer.
+ 0000001  0010110        // Convert to big-endian.
+   00000010010110        // Concatenate.
+ 128 + 16 + 4 + 2 = 150  // Interpret as an unsigned 64-bit integer.
 ```

 Because varints are so crucial to protocol buffers, in protoscope syntax, we
@@ -218,7 +218,7 @@ original, signed version.
 In protoscope, suffixing an integer with a `z` will make it encode as ZigZag.
 For example, `-500z` is the same as the varint `999`.

-### Non-varint Numbers
+### Non-varint Numbers {#non-varints}

 Non-varint numeric types are simple -- `double` and `fixed64` have wire type
 `I64`, which tells the parser to expect a fixed eight-byte lump of data. We can
@@ -331,7 +331,11 @@ respect to each other is preserved. Thus, this could also have been encoded as
 5: 3
 ```

-There is no special treatment for `oneof`s in the wire format.
+### Oneofs {#oneofs}
+
+[`Oneof` fields](/programming-guides/proto2#oneof) are
+encoded the same as if the fields were not in a `oneof`. The rules that apply to
+`oneofs` are independent of how they are represented on the wire.

 ### Last One Wins {#last-one-wins}

@@ -411,7 +415,7 @@ Protocol buffer parsers must be able to parse repeated fields that were compiled
 as `packed` as if they were not packed, and vice versa. This permits adding
 `[packed=true]` to existing fields in a forward- and backward-compatible way.

-### Maps
+### Maps {#maps}

 Map fields are just a shorthand for a special kind of repeated field. If we have

@@ -436,7 +440,7 @@ message Test6 {
 Thus, maps are encoded exactly like a `repeated` message field: as a sequence of
 `LEN`-typed records, with two fields each.

-## Groups
+## Groups {#groups}

 Groups are a deprecated feature that should not be used, but they remain in the
 wire format, and deserve a passing mention.
@@ -484,7 +488,7 @@ be written. Serialization order is an implementation detail, and the details of
 any particular implementation may change in the future. Therefore, protocol
 buffer parsers must be able to parse fields in any order.

-### Implications
+### Implications {#implications}

 * Do not assume the byte output of a serialized message is stable. This is
   especially true for messages with transitive bytes fields representing other
diff --git a/content/programming-guides/proto2.md b/content/programming-guides/proto2.md
index 02d3520f..ed9557f7 100644
--- a/content/programming-guides/proto2.md
+++ b/content/programming-guides/proto2.md
@@ -1189,6 +1189,8 @@ the oneof automatically clears all the other members. You can check which value
 in a oneof is set (if any) using a special `case()` or `WhichOneof()` method,
 depending on your chosen language.

+Field numbers for oneof fields must be unique within the enclosing message.
+
 ### Using Oneof {#using-oneof}

 To define a oneof in your `.proto` you use the `oneof` keyword followed by your
diff --git a/content/programming-guides/proto3.md b/content/programming-guides/proto3.md
index 7eb55d6e..a89792a4 100644
--- a/content/programming-guides/proto3.md
+++ b/content/programming-guides/proto3.md
@@ -1110,6 +1110,8 @@ depending on your chosen language.
 Note that if *multiple values are set, the last set value as determined by the
 order in the proto will overwrite all previous ones*.

+Field numbers for oneof fields must be unique within the enclosing message.
+
 ### Using Oneof

 To define a oneof in your `.proto` you use the `oneof` keyword followed by your
diff --git a/content/programming-guides/techniques.md b/content/programming-guides/techniques.md
index c6f63e4d..f0316a8c 100644
--- a/content/programming-guides/techniques.md
+++ b/content/programming-guides/techniques.md
@@ -9,6 +9,20 @@ You can also send design and usage questions to
 the
 [Protocol Buffers discussion group](http://groups.google.com/group/protobuf).

+## Common Filename Suffixes {#suffixes}
+
+It is fairly common to write messages to files in several different formats. We
+recommend using the following file extensions for these files.
+
+Content                                            | Extension
+-------------------------------------------------- | ---------
+[Text Format](/reference/protobuf/textformat-spec) | `.txtpb`
+[Wire Format](/programming-guides/encoding)        | `.binpb`
+[JSON Format](/programming-guides/proto3#json)     | `.json`
+
+For Text Format specifically, `.textproto` is also fairly common, but we
+recommend `.txtpb` for its brevity.
+
 ## Streaming Multiple Messages {#streaming}

 If you want to write multiple messages to a single file or stream, it is up to
diff --git a/content/reference/cpp/cpp-generated.md b/content/reference/cpp/cpp-generated.md
index d75367f0..cf27f76b 100644
--- a/content/reference/cpp/cpp-generated.md
+++ b/content/reference/cpp/cpp-generated.md
@@ -115,9 +115,13 @@ In addition to these methods, the `Foo` class defines the following methods:
 - `Foo& operator=(Foo&& other)`: Move-assignment operator.
 - `void Swap(Foo* other)`: Swap content with another message.
 - `const UnknownFieldSet& unknown_fields() const`: Returns the set of unknown
-  fields encountered while parsing this message.
+  fields encountered while parsing this message. If `option optimize_for =
+  LITE_RUNTIME` is specified in the `.proto` file, then the return type
+  changes to `std::string&`.
 - `UnknownFieldSet* mutable_unknown_fields()`: Returns a pointer to the
-  mutable set of unknown fields encountered while parsing this message.
+  mutable set of unknown fields encountered while parsing this message. If
+  `option optimize_for = LITE_RUNTIME` is specified in the `.proto` file, then
+  the return type changes to `std::string*`.

 The class also defines the following static methods:

@@ -132,6 +136,38 @@ The class also defines the following static methods:
   default instance of a message can be used as a factory by calling its
   `New()` method.

+### Generated Filenames {#generated-filenames}
+
+[Reserved keywords](https://github.com/protocolbuffers/protobuf/blob/main/src/google/protobuf/compiler/cpp/helpers.cc#L4)
+are appended with an underscore in the generated output.
+
+For example, the following proto3 definition syntax:
+
+```proto
+message MyMessage {
+  string false = 1;
+  string myFalse = 2;
+}
+```
+
+generates the following partial output:
+
+```cpp
+  void clear_false_() ;
+  const std::string& false_() const;
+  void set_false_(Arg_&& arg, Args_... args);
+  std::string* mutable_false_();
+  PROTOBUF_NODISCARD std::string* release_false_();
+  void set_allocated_false_(std::string* ptr);
+
+  void clear_myfalse() ;
+  const std::string& myfalse() const;
+  void set_myfalse(Arg_&& arg, Args_... args);
+  std::string* mutable_myfalse();
+  PROTOBUF_NODISCARD std::string* release_myfalse();
+  void set_allocated_myfalse(std::string* ptr);
+```
+
 ### Nested Types

 A message can be declared inside another message. For example:
@@ -184,7 +220,7 @@ any method inherited from `Message` or accessing the message through other ways
 Correspondingly, the value of the returned pointer is never guaranteed to be the
 same across two different invocations of the accessor.

-### Singular Numeric Fields (proto2)
+### Optional Numeric Fields (proto2 and proto3)

 For either of these field definitions:

@@ -207,7 +243,7 @@ For other numeric field types (including `bool`), `int32` is replaced with the
 corresponding C++ type according to the
 [scalar value types table](/programming-guides/proto3#scalar).

-### Singular Numeric Fields (proto3)
+### Implicit Presence Numeric Fields (proto3)

 For these field definitions:

@@ -229,7 +265,7 @@ For other numeric field types (including `bool`), `int32` is replaced with the
 corresponding C++ type according to the
 [scalar value types table](/programming-guides/proto3#scalar).

-### Singular String/Bytes Fields (proto2) {#string}
+### Optional String/Bytes Fields (proto2 and proto3) {#string}

 For any of these field definitions:

@@ -278,7 +314,7 @@ The compiler will generate the following accessor methods:
   calling this, caller takes the ownership of the allocated `string` object,
   `has_foo()` will return `false`, and `foo()` will return the default value.

-### Singular String/Bytes Fields (proto3) {#proto3_string}
+### Implicit Presence String/Bytes Fields (proto3) {#proto3_string}

 For any of these field definitions:

@@ -355,7 +391,7 @@ The compiler will generate the following accessor methods:
   `foo()` will return an empty `Cord` (proto3) or the default value (proto2).
 - `bool has_foo()`: Returns `true` if the field is set.

-### Singular Enum Fields (proto2) {#enum_field}
+### Optional Enum Fields (proto2 and proto3) {#enum_field}

 Given the enum type:

@@ -386,7 +422,7 @@ The compiler will generate the following accessor methods:
 - `void clear_foo()`: Clears the value of the field. After calling this,
   `has_foo()` will return `false` and `foo()` will return the default value.

-### Singular Enum Fields (proto3)
+### Implicit Presence Enum Fields (proto3)

 Given the enum type:

@@ -414,7 +450,7 @@ The compiler will generate the following accessor methods:
 - `void clear_foo()`: Clears the value of the field. After calling this,
   `foo()` will return the default value.

-### Singular Embedded Message Fields {#embeddedmessage}
+### Optional Embedded Message Fields (proto2 and proto3) {#embeddedmessage}

 Given the message type:

diff --git a/content/reference/go/go-generated.md b/content/reference/go/go-generated.md
index c2c9db4c..0df8d4db 100644
--- a/content/reference/go/go-generated.md
+++ b/content/reference/go/go-generated.md
@@ -385,13 +385,13 @@ The following example shows how to set the field:

 ```go
 p1 := &account.Profile{
-  Avatar: &account.Profile_ImageUrl{"http://example.com/image.png"},
+  Avatar: &account.Profile_ImageUrl{ImageUrl: "http://example.com/image.png"},
 }

 // imageData is []byte
 imageData := getImageData()
 p2 := &account.Profile{
-  Avatar: &account.Profile_ImageData{imageData},
+  Avatar: &account.Profile_ImageData{ImageData: imageData},
 }
 ```

@@ -429,7 +429,6 @@ message Venue {
     KIND_STADIUM = 2;
     KIND_BAR = 3;
     KIND_OPEN_AIR_FESTIVAL = 4;
-    // ...
   }
   Kind kind = 1;
   // ...
@@ -437,7 +436,19 @@ message Venue {
 ```

 the protocol buffer compiler generates a type and a series of constants with
-that type.
+that type:
+
+```go
+type Venue_Kind int32
+
+const (
+    Venue_KIND_UNSPECIFIED       Venue_Kind = 0
+    Venue_KIND_CONCERT_HALL      Venue_Kind = 1
+    Venue_KIND_STADIUM           Venue_Kind = 2
+    Venue_KIND_BAR               Venue_Kind = 3
+    Venue_KIND_OPEN_AIR_FESTIVAL Venue_Kind = 4
+)
+```

 For enums within a message (like the one above), the type name begins with the
 message name:
diff --git a/content/reference/kotlin/kotlin-generated.md b/content/reference/kotlin/kotlin-generated.md
index 426cd17c..4f5eddd1 100644
--- a/content/reference/kotlin/kotlin-generated.md
+++ b/content/reference/kotlin/kotlin-generated.md
@@ -332,8 +332,8 @@ The protocol buffer compiler will add the following methods to `FooKt.Dsl`:
 - `operator fun > set(extension: ExtensionLite)`:
   sets the current value of the extension field in the DSL (for `Comparable`
   field types)
-- `operator fun > set(extension: ExtensionLite)`:
-  sets the current value of the extension field in the DSL (for message field
+- `operator fun set(extension: ExtensionLite)`: sets
+  the current value of the extension field in the DSL (for message field
   types)
 - `operator fun set(extension: ExtensionLite)`: sets the
   current value of the extension field in the DSL (for `bytes` fields)
@@ -350,7 +350,7 @@ The protocol buffer compiler will add the following methods to `FooKt.Dsl`:
   alias for `addAll` using operator syntax
 - `operator fun ExtensionList.set(index: Int, value: E)`: sets the
   element of the repeated extension field at the specified index
-- `operator fun ExtensionList.clear()`: clears the elements of the
+- `inline fun ExtensionList.clear()`: clears the elements of the
   repeated extension field

 The generics here are complex, but the effect is that `this[extension] = value`
diff --git a/content/reference/protobuf/textformat-spec.md b/content/reference/protobuf/textformat-spec.md
index 7ab08840..fb4729ee 100644
--- a/content/reference/protobuf/textformat-spec.md
+++ b/content/reference/protobuf/textformat-spec.md
@@ -571,6 +571,8 @@ Both the `key` and `value` fields are optional and default to the zero value of
 their respective types if unspecified. If a key is duplicated, only the
 last-specified value will be retained in a parsed map.

+The order of maps is not maintained in textprotos.
+
 ## `oneof` Fields {#oneof}

 While there is no special syntax related to `oneof` fields in text format, only
@@ -657,7 +659,7 @@ the schema, so they may provide various features.

 ```textproto
 # proto-file: some/proto/my_file.proto
-# proto-message: some.package.MyMessage
+# proto-message: MyMessage
 ```

 ## Working with the Format Programmatically
diff --git a/content/reference/python/python-generated.md b/content/reference/python/python-generated.md
index b3245b07..34d08d68 100644
--- a/content/reference/python/python-generated.md
+++ b/content/reference/python/python-generated.md
@@ -6,11 +6,11 @@ description = "This topic describes exactly what Python definitions the protocol
 type = "docs"
 +++

-Any differences
-between proto2 and proto3 generated code are highlighted - note that these
-differences are in the generated code as described in this document, not the
-base message classes/interfaces, which are the same in both versions. You should
-read the
+Any
+differences between proto2 and proto3 generated code are highlighted - note that
+these differences are in the generated code as described in this document, not
+the base message classes/interfaces, which are the same in both versions. You
+should read the
 [proto2 language guide](/programming-guides/proto) and/or
 [proto3 language guide](/programming-guides/proto3) before reading this
 document.
@@ -48,6 +48,8 @@ The compiler will automatically create the directory `build/gen/bar` if
 necessary, but it will *not* create `build` or `build/gen`; they must already
 exist.

+Protoc can generate Python stubs (`.pyi`) using the `--pyi_out` parameter.
+
 Note that if the `.proto` file or its path contains any characters which cannot
 be used in Python module names (for example, hyphens), they will be replaced
 with underscores. So, the file `foo-bar.proto` becomes the Python file
diff --git a/content/support/version-support.md b/content/support/version-support.md
index 0f958556..38c61846 100644
--- a/content/support/version-support.md
+++ b/content/support/version-support.md
@@ -813,6 +813,8 @@ went out of support on July 1, 2023.
 Protobuf is committed to following the platform and library support policy
 described in
 [Python Support Policy](https://cloud.google.com/python/docs/supported-python-versions).
+For specific versions supported, see
+[Foundational Python Support Matrix](https://github.com/google/oss-policies-info/blob/main/foundational-python-support-matrix.md).

 ## Ruby {#ruby}
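
Review note on the `encoding.md` hunk: the reworded varint steps (drop the continuation bits, reorder the 7-bit groups, concatenate) and the nearby `-500z` ZigZag example can be checked with a short Python sketch. This is only an illustration under stated assumptions, not protobuf's own implementation: `decode_varint` and `zigzag_encode` are hypothetical helper names, and the decoder does not enforce the 10-byte limit that a real 64-bit parser would.

```python
def decode_varint(data: bytes) -> int:
    """Decode one base-128 varint from the start of `data` (hypothetical helper)."""
    result = 0
    for index, byte in enumerate(data):
        payload = byte & 0x7F              # drop the continuation bit (MSB)
        result |= payload << (7 * index)   # groups arrive least-significant first
        if byte & 0x80 == 0:               # MSB clear: this was the last byte
            return result
    raise ValueError("truncated varint: last byte still has its MSB set")


def zigzag_encode(value: int, bits: int = 64) -> int:
    """Map a signed integer to the unsigned ZigZag form used by sint32/sint64."""
    mask = (1 << bits) - 1
    return ((value << 1) ^ (value >> (bits - 1))) & mask


if __name__ == "__main__":
    print(decode_varint(bytes.fromhex("9601")))  # the `9601` example from the hunk
    print(zigzag_encode(-500))                   # the `-500z` example from the hunk
```

Shifting each payload by `7 * index` does the "convert to big-endian, then concatenate" steps in one operation, since the first byte on the wire carries the least-significant seven bits.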