v1.1 (#66)
laves authored Feb 25, 2025
1 parent 6cacd0e commit 42be884
Showing 117 changed files with 715 additions and 391 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/nodejs-perf.yml
@@ -63,7 +63,7 @@ jobs:
- machine: rpi3-32
proc_performance_threshold_sec: 2.9
- machine: rpi3-64
-proc_performance_threshold_sec: 2.1
+proc_performance_threshold_sec: 2.5
- machine: rpi4-32
proc_performance_threshold_sec: 1.3
- machine: rpi4-64
2 changes: 1 addition & 1 deletion .github/workflows/web-demos.yml
@@ -25,7 +25,7 @@ jobs:

strategy:
matrix:
-node-version: [ 16.x, 18.x, 20.x ]
+node-version: [ 18.x, 20.x, 22.x ]

steps:
- uses: actions/checkout@v3
2 changes: 1 addition & 1 deletion .github/workflows/web.yml
@@ -29,7 +29,7 @@ jobs:

strategy:
matrix:
-node-version: [ 16.x, 18.x, 20.x ]
+node-version: [ 18.x, 20.x, 22.x ]

steps:
- uses: actions/checkout@v3
23 changes: 12 additions & 11 deletions README.md
@@ -32,7 +32,7 @@ voice assistants. Orca is:
- [Orca streaming text synthesis](#orca-input-and-output-streaming-synthesis)
- [Text input](#text-input)
- [Custom pronunciations](#custom-pronunciations)
-- [Voices](#voices)
+- [Language and Voice](#language-and-voice)
- [Speech control](#speech-control)
- [Audio output](#audio-output)
- [AccessKey](#accesskey)
@@ -93,17 +93,11 @@ The following are examples of sentences using custom pronunciations:
- "{read|R IY D} this as {read|R EH D}, please."
- "I {live|L IH V} in {Sevilla|S EH V IY Y AH}. We have great {live|L AY V} sports!"

-### Voices
+### Language and Voice

-Orca can synthesize speech with various voices, each of which is characterized by a model file located
-in [lib/common](./lib/common).
-To synthesize speech with a specific voice, provide the associated model file as an argument to the orca init function.
-The following are the voices currently available:
+Orca Streaming Text-to-Speech can synthesize speech in different languages and with a variety of voices, each of which is characterized by a model file (`.pv`) located in [lib/common](./lib/common). The language and gender of the speaker is indicated in the file name.

-| Model name | Sample rate (Hz) |
-|:---------------------------------------------------------:|:----------------:|
-| [orca_params_female.pv](lib/common/orca_params_female.pv) | 22050 |
-| [orca_params_male.pv](lib/common/orca_params_male.pv) | 22050 |
+To synthesize speech with a specific language and voice, provide the associated model file as an argument to the Orca init function.
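
Review note: the new README text says the language and gender are encoded in the model file name, and the files renamed in this commit follow the shape `orca_params_<language>_<gender>.pv`. The sketch below shows one way a caller might read those two fields back out of a file name. The pattern is inferred only from the names visible in this diff (for example `orca_params_en_female.pv`), and `parse_model_name` and `VoiceInfo` are hypothetical helpers, not part of any Orca SDK.

```python
import re
from typing import NamedTuple, Optional

# Assumed naming pattern, inferred from file names in this commit such as
# "orca_params_en_female.pv"; it is not a documented contract.
_MODEL_NAME = re.compile(
    r"^orca_params_(?P<language>[a-z]{2})_(?P<gender>female|male)\.pv$"
)


class VoiceInfo(NamedTuple):
    language: str  # two-letter language code, e.g. "en"
    gender: str    # "female" or "male"


def parse_model_name(file_name: str) -> Optional[VoiceInfo]:
    """Return the language and gender encoded in a model file name.

    Returns None when the name does not match the assumed pattern,
    which includes the pre-1.1 names without a language code.
    """
    match = _MODEL_NAME.match(file_name)
    if match is None:
        return None
    return VoiceInfo(match.group("language"), match.group("gender"))
```

Names from before this commit, such as `orca_params_female.pv`, do not match the assumed pattern and yield `None`, which gives callers a simple way to spot stale model paths.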

### Speech control

@@ -779,7 +773,14 @@ For more details, see the [Node.js SDK](./binding/nodejs/).

## Releases

-### v1.0.0 - Aug 20th, 2024
+### v1.1.0 - February 24th, 2025
+
+- Added support for Spanish voices
+- Improved English voices
+- Added .NET SDK
+- Improved text normalization
+
+### v1.0.0 - August 20th, 2024

- Improved voice quality
- Significantly reduced latency in streaming synthesis
2 changes: 1 addition & 1 deletion binding/android/Orca/orca/build.gradle
@@ -2,7 +2,7 @@ apply plugin: 'com.android.library'

ext {
PUBLISH_GROUP_ID = 'ai.picovoice'
-PUBLISH_VERSION = '1.0.0'
+PUBLISH_VERSION = '1.1.0'
PUBLISH_ARTIFACT_ID = 'orca-android'
}

6 changes: 3 additions & 3 deletions binding/android/OrcaTestApp/orca-test-app/build.gradle
@@ -71,8 +71,8 @@ android {

tasks.register('copyParams', Copy) {
from("$projectDir/../../../../lib/common/")
-include("orca_params_female.pv")
-include("orca_params_male.pv")
+include("orca_params_en_female.pv")
+include("orca_params_en_male.pv")
into("$projectDir/src/main/assets/models")
}

@@ -113,7 +113,7 @@ dependencies {
implementation 'androidx.constraintlayout:constraintlayout:2.1.4'
implementation 'com.google.code.gson:gson:2.10'
implementation 'com.google.errorprone:error_prone_annotations:2.36.0'
-implementation 'ai.picovoice:orca-android:1.0.0'
+implementation 'ai.picovoice:orca-android:1.1.0'

// Espresso UI Testing
androidTestImplementation 'androidx.test.ext:junit:1.1.5'
@@ -38,6 +38,9 @@ public class PerformanceTest extends BaseTest {
@Parameterized.Parameter(value = 0)
public String modelFile;

+@Parameterized.Parameter(value = 1)
+public String procSentence;
+
@Parameterized.Parameters(name = "{0}")
public static Collection<Object[]> initParameters() throws IOException {
String testDataJsonString = getTestDataString();
@@ -48,10 +51,12 @@ public static Collection<Object[]> initParameters() throws IOException {
final JsonArray testCases = testDataJson.getAsJsonObject("tests").get("sentence_tests").getAsJsonArray();
JsonObject testCase = testCases.get(0).getAsJsonObject();

+String text = testCase.get("text").getAsString();
+
List<Object[]> parameters = new ArrayList<>();
for (JsonElement modelJson : testCase.get("models").getAsJsonArray()) {
String model = modelJson.getAsString();
-parameters.add(new Object[]{model});
+parameters.add(new Object[]{model, text});
}
return parameters;
}
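
Review note: after this hunk, every model listed in the first sentence test is paired with that test's text, so each parameterized run receives `(model, text)`. A rough Python sketch of the same pairing follows. It assumes the test data JSON has the shape implied by the Java accessors above (`tests` → `sentence_tests` → `text` / `models`); `build_parameters` and the `SAMPLE` payload are made up for illustration and do not come from the repository.

```python
import json
from typing import List, Tuple

# Hypothetical payload with the shape implied by the Java accessors;
# the real test data file is not shown in this diff.
SAMPLE = json.dumps({
    "tests": {
        "sentence_tests": [
            {
                "text": "Example sentence.",
                "models": ["orca_params_en_female.pv", "orca_params_en_male.pv"],
            }
        ]
    }
})


def build_parameters(test_data_json: str) -> List[Tuple[str, str]]:
    """Pair each model of the first sentence test with that test's text."""
    data = json.loads(test_data_json)
    test_case = data["tests"]["sentence_tests"][0]
    text = test_case["text"]
    return [(model, text) for model in test_case["models"]]
```

Carrying the sentence alongside the model keeps the two in the same test case, so the sentence no longer has to be looked up from a separate part of the test data.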
@@ -76,13 +81,9 @@ public void testProcPerformance() throws Exception {
Assume.assumeFalse(procThresholdString.equals(""));

final double procPerformanceThresholdSec = Double.parseDouble(procThresholdString);
-final String procSentence = testJson
-.getAsJsonObject("test_sentences")
-.get("text")
-.getAsString();
final Orca orca = new Orca.Builder()
.setAccessKey(accessKey)
-.setModelPath(modelFile)
+.setModelPath(getModelFilepath(modelFile))
.build(appContext);

long totalNSec = 0;
@@ -1,5 +1,5 @@
/*
-Copyright 2024 Picovoice Inc.
+Copyright 2024-2025 Picovoice Inc.
You may not use this file except in compliance with the license. A copy of the license is
located in the "LICENSE" file accompanying this source.
@@ -61,7 +61,7 @@ public void runTest() {

ArrayList<TestResult> results = new ArrayList<>();

-final String modelFile = "models/orca_params_female.pv";
+final String modelFile = "models/orca_params_en_female.pv";

TestResult result = new TestResult();
result.testName = "Test Init";
7 changes: 4 additions & 3 deletions binding/android/README.md
@@ -137,10 +137,11 @@ The pronunciation is expressed in [ARPAbet](https://en.wikipedia.org/wiki/ARPABE
- "{read|R IY D} this as {read|R EH D}, please."
- "I {live|L IH V} in {Sevilla|S EH V IY Y AH}. We have great {live|L AY V} sports!"

-### Voices
+### Language and Voice

-Orca can synthesize speech with various voices, each of which is characterized by a model file located
-in [lib/common](../../lib/common).
+Orca Streaming Text-to-Speech can synthesize speech in different languages and with a variety of voices,
+each of which is characterized by a model file (`.pv`) located in [lib/common](../../lib/common).
+The language and gender of the speaker is indicated in the file name.

To add the Orca model file to your Android application:

12 changes: 9 additions & 3 deletions binding/dotnet/Orca/Orca.cs
@@ -404,6 +404,7 @@ public short[] Synthesize(string text)
pv_orca_pcm_delete(cPcm);

}
+
return pcm;
}

@@ -437,9 +438,14 @@ public short[] Flush()
HandlePvStatus(status, "Orca stream flush failed");
}

-short[] pcm = new short[numSamples];
-Marshal.Copy(cPcm, pcm, 0, numSamples);
-pv_orca_pcm_delete(cPcm);
+short[] pcm = null;
+if (numSamples > 0)
+{
+pcm = new short[numSamples];
+Marshal.Copy(cPcm, pcm, 0, numSamples);
+pv_orca_pcm_delete(cPcm);
+
+}

return pcm;
}
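
Review note: with this hunk, `Flush()` leaves `pcm` as `null` when `numSamples` is zero, so callers that append the flushed audio to a buffer need a guard. A minimal sketch of that guard follows, written in Python rather than C# purely for illustration. `append_flushed` is a hypothetical helper, and `flushed` stands in for the value a flush call might return; neither is part of the Orca API.

```python
from typing import List, Optional, Sequence


def append_flushed(buffer: List[int], flushed: Optional[Sequence[int]]) -> int:
    """Append flushed PCM samples to `buffer` and return how many were added.

    `flushed` may be None when the stream holds no remaining audio, which
    is an assumption mirroring the null return introduced in the hunk above.
    """
    if flushed is None:
        # Nothing left in the stream; leave the buffer untouched.
        return 0
    buffer.extend(flushed)
    return len(flushed)
```

Treating "no audio" as a zero-length append keeps the caller's accumulation loop free of special cases.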
8 changes: 4 additions & 4 deletions binding/dotnet/Orca/Orca.csproj
@@ -1,7 +1,7 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFrameworks>net8.0;net6.0;netcoreapp3.0;netstandard2.0</TargetFrameworks>
-<Version>1.0.0</Version>
+<Version>1.1.0</Version>
<Authors>Picovoice</Authors>
<Company />
<Product>Orca Streaming Text-to-Speech Engine</Product>
@@ -100,12 +100,12 @@
</Content>
</ItemGroup>
<ItemGroup>
-<Content Include="..\..\..\lib\common\orca_params_female.pv">
+<Content Include="..\..\..\lib\common\orca_params_en_female.pv">
<PackagePath>
-buildTransitive/common/orca_params_female.pv;
+buildTransitive/common/orca_params_en_female.pv;
</PackagePath>
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
-<Link>lib\common\orca_params_female.pv</Link>
+<Link>lib\common\orca_params_en_female.pv</Link>
<Visible>false</Visible>
</Content>
</ItemGroup>
6 changes: 3 additions & 3 deletions binding/dotnet/Orca/Picovoice.Orca.netstandard2.0.targets
@@ -21,9 +21,9 @@
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
<Visible>false</Visible>
</Content>
-<Content Include="$(MSBuildThisFileDirectory)/../common/orca_params_female.pv">
-<Link>lib/common/orca_params_female.pv</Link>
-<PackagePath>content/picovoice/common/orca_params_female.pv</PackagePath>
+<Content Include="$(MSBuildThisFileDirectory)/../common/orca_params_en_female.pv">
+<Link>lib/common/orca_params_en_female.pv</Link>
+<PackagePath>content/picovoice/common/orca_params_en_female.pv</PackagePath>
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
<Visible>false</Visible>
</Content>
6 changes: 3 additions & 3 deletions binding/dotnet/Orca/Picovoice.Orca.targets
@@ -6,9 +6,9 @@
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
<Visible>false</Visible>
</Content>
-<Content Include="$(MSBuildThisFileDirectory)/../common/orca_params_female.pv">
-<Link>lib/common/orca_params_female.pv</Link>
-<PackagePath>content/picovoice/common/orca_params_female.pv</PackagePath>
+<Content Include="$(MSBuildThisFileDirectory)/../common/orca_params_en_female.pv">
+<Link>lib/common/orca_params_en_female.pv</Link>
+<PackagePath>content/picovoice/common/orca_params_en_female.pv</PackagePath>
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
<Visible>false</Visible>
</Content>
2 changes: 1 addition & 1 deletion binding/dotnet/Orca/Utils.cs
@@ -136,7 +136,7 @@ private static string GetCpuPart()

public static string PvModelPath()
{
-return Path.Combine(AppContext.BaseDirectory, "lib/common/orca_params_female.pv");
+return Path.Combine(AppContext.BaseDirectory, "lib/common/orca_params_en_female.pv");
}
}
}
8 changes: 5 additions & 3 deletions binding/dotnet/README.md
@@ -161,10 +161,12 @@ The pronunciation is expressed in [ARPAbet](https://en.wikipedia.org/wiki/ARPABE
- "{read|R IY D} this as {read|R EH D}, please."
- "I {live|L IH V} in {Sevilla|S EH V IY Y AH}. We have great {live|L AY V} sports!"

-### Voices
+### Language and Voice

+Orca Streaming Text-to-Speech can synthesize speech in different languages and with a variety of voices,
+each of which is characterized by a model file (`.pv`) located in [lib/common](../../lib/common).
+The language and gender of the speaker is indicated in the file name.
+
-Orca can synthesize speech with various voices, each of which is characterized by a model file located
-in [lib/common](https://github.com/Picovoice/orca/tree/main/lib/common).
To create an instance of the engine with a specific voice, use:

```csharp
4 changes: 2 additions & 2 deletions binding/ios/Orca-iOS.podspec
@@ -1,7 +1,7 @@
Pod::Spec.new do |s|
s.name = 'Orca-iOS'
s.module_name = 'Orca'
-s.version = '1.0.1'
+s.version = '1.1.0'
s.license = {:type => 'Apache 2.0'}
s.summary = 'iOS binding for Picovoice\'s Orca Text-to-Speech Engine.'
s.description =
@@ -11,7 +11,7 @@ Pod::Spec.new do |s|
Orca is an on-device text-to-speech engine producing high-quality, realistic, spoken audio with zero latency. Orca is:
- Private; All voice processing runs locally.
- Cross-Platform:
-- Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64)
+- Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64, arm64)
- Android and iOS
- Chrome, Safari, Firefox, and Edge
- Raspberry Pi (3, 4, 5)
2 changes: 1 addition & 1 deletion binding/ios/Orca.swift
@@ -235,7 +235,7 @@ public class Orca {
}

var cNumCharacters: Int32 = 0
-var cCharacters: UnsafeMutablePointer<UnsafePointer<Int8>?>?
+var cCharacters: UnsafePointer<UnsafePointer<Int8>?>?
let validCharactersStatus = pv_orca_valid_characters(handle, &cNumCharacters, &cCharacters)
if validCharactersStatus != PV_STATUS_SUCCESS {
let messageStack = try getMessageStack()
6 changes: 3 additions & 3 deletions binding/ios/OrcaAppTest/OrcaAppTest.xcodeproj/project.pbxproj
@@ -236,7 +236,7 @@
);
mainGroup = 1E00643F27CEDF9B006FF6E9;
packageReferences = (
-E1F352FD2D00ECD60069B0E6 /* XCRemoteSwiftPackageReference "orca" */,
+07F723562D6D57E90002D88F /* XCRemoteSwiftPackageReference "orca" */,
);
productRefGroup = 1E00644927CEDF9B006FF6E9 /* Products */;
projectDirPath = "";
@@ -655,12 +655,12 @@
/* End XCLocalSwiftPackageReference section */

/* Begin XCRemoteSwiftPackageReference section */
-E1F352FD2D00ECD60069B0E6 /* XCRemoteSwiftPackageReference "orca" */ = {
+07F723562D6D57E90002D88F /* XCRemoteSwiftPackageReference "orca" */ = {
isa = XCRemoteSwiftPackageReference;
repositoryURL = "https://github.com/Picovoice/orca";
requirement = {
kind = exactVersion;
-version = 1.0.1;
+version = 1.1.0;
};
};
/* End XCRemoteSwiftPackageReference section */
@@ -412,7 +412,7 @@ class OrcaAppTestUITests: BaseTest {
func testMessageStack() throws {
let bundle = Bundle(for: type(of: self))
let modelPath: String = bundle.path(
-forResource: "orca_params_female",
+forResource: "orca_params_en_female",
ofType: "pv",
inDirectory: "test_resources/model_files")!

@@ -436,7 +436,7 @@ class OrcaAppTestUITests: BaseTest {
func testSynthesizeMessageStack() throws {
let bundle = Bundle(for: type(of: self))
let modelPath: String = bundle.path(
-forResource: "orca_params_female",
+forResource: "orca_params_en_female",
ofType: "pv",
inDirectory: "test_resources/model_files")!

12 changes: 7 additions & 5 deletions binding/ios/README.md
@@ -130,11 +130,13 @@ The pronunciation is expressed in [ARPAbet](https://en.wikipedia.org/wiki/ARPABE
- "{read|R IY D} this as {read|R EH D}, please."
- "I {live|L IH V} in {Sevilla|S EH V IY Y AH}. We have great {live|L AY V} sports!"

-### Voices
+### Language and Voice

-Orca can synthesize speech with various voices, each of which is characterized by a model file located
-in [lib/common](https://github.com/Picovoice/orca/tree/main/lib/common).
-To create an instance of the engine with a specific voice, use:
+Orca Streaming Text-to-Speech can synthesize speech in different languages and with a variety of voices,
+each of which is characterized by a model file (`.pv`) located in [lib/common](../../lib/common).
+The language and gender of the speaker is indicated in the file name.
+
+To create an instance of the engine with a specific language and voice, use:

```swift
import Orca
@@ -146,7 +148,7 @@ do {
} catch { }
```

-and replace `${MODEL_FILE_PATH}` or `${MODEL_FILE_URL}` with the path to the model file with the desired voice.
+and replace `${MODEL_FILE_PATH}` or `${MODEL_FILE_URL}` with the path to the model file with the desired language/voice.

### Speech control

10 changes: 6 additions & 4 deletions binding/nodejs/README.md
@@ -105,7 +105,7 @@ orca.release()

### Text input

-Orca supports a wide range of English characters, including letters, numbers, symbols, and punctuation marks.
+Orca supports a wide range of English characters, including letters, numbers, symbols, and punctuation marks.
You can get a list of all supported characters by calling `validCharacters()`.
Pronunciations of characters or words not supported by this list can be achieved with
[custom pronunciations](#custom-pronunciations).
@@ -119,10 +119,12 @@ The pronunciation is expressed in [ARPAbet](https://en.wikipedia.org/wiki/ARPABE
- "{read|R IY D} this as {read|R EH D}, please."
- "I {live|L IH V} in {Sevilla|S EH V IY Y AH}. We have great {live|L AY V} sports!"

-### Voices
+### Language and Voice

+Orca Streaming Text-to-Speech can synthesize speech in different languages and with a variety of voices,
+each of which is characterized by a model file (`.pv`) located in [lib/common](../../lib/common).
+The language and gender of the speaker is indicated in the file name.
+
-Orca can synthesize speech with various voices, each of which is characterized by a model file located
-in [lib/common](https://github.com/Picovoice/orca/tree/main/lib/common).
To create an instance of the engine with a specific voice, use:

```typescript