diff options
Diffstat (limited to 'thirdparty')
406 files changed, 16170 insertions, 12992 deletions
diff --git a/thirdparty/README.md b/thirdparty/README.md index 6f0d4da958..27cfe41238 100644 --- a/thirdparty/README.md +++ b/thirdparty/README.md @@ -4,7 +4,7 @@ ## assimp - Upstream: http://github.com/assimp/assimp -- Version: git (1d565b0aab5a2ee00462f18c5b8a81f6a5454a48) +- Version: git (308db73d0b3c2d1870cd3e465eaa283692a4cf23) - License: BSD-3-Clause @@ -135,7 +135,7 @@ Files extracted from upstream source: ## glad - Upstream: https://github.com/Dav1dde/glad -- Version: 0.1.31 +- Version: 0.1.33 - License: MIT The files we package are automatically generated. @@ -266,15 +266,16 @@ changes are marked with `// -- GODOT --` comments. ## mbedtls - Upstream: https://tls.mbed.org/ -- Version: 2.16.2 +- Version: 2.16.3 - License: Apache 2.0 File extracted from upstream release tarball (`-apache.tgz` variant): - All `*.h` from `include/mbedtls/` to `thirdparty/mbedtls/include/mbedtls/` - All `*.c` from `library/` to `thirdparty/mbedtls/library/` -- Applied the patch in `thirdparty/mbedtls/1453.diff` (PR 1453). +- LICENSE and apache-2.0.txt files +- Applied the patch in `thirdparty/mbedtls/patches/1453.diff` (PR 1453). Soon to be merged upstream. Check it out at next update. -- Applied the patch in `thirdparty/mbedtls/padlock.diff`. This disables VIA +- Applied the patch in `thirdparty/mbedtls/patches/padlock.diff`. This disables VIA padlock support which defines a symbol `unsupported` which clashes with a symbol in libwebsockets. - Added 2 files `godot_core_mbedtls_platform.{c,h}` providing configuration @@ -284,12 +285,13 @@ File extracted from upstream release tarball (`-apache.tgz` variant): ## miniupnpc - Upstream: https://github.com/miniupnp/miniupnp/tree/master/miniupnpc -- Version: git (3cf6efa, 2019) +- Version: git (0ab1d67, 2019) - License: BSD-3-Clause -Extract only the `miniupnpc` folder inside `thirdparty/miniupnpc`. -Exclude all non `.c` and `.h` files, plus all files beginning with `test` -`minihttptestserver.c` and `wingenminiupnpcstrings.c`. +Files extracted from upstream source: + +- All `*.c` and `*.h` files from `miniupnpc` to `thirdparty/miniupnpc/miniupnpc` +- Remove `test*`, `minihttptestserver.c` and `wingenminiupnpcstrings.c` The only modified file is miniupnpcstrings.h, which was created for Godot (it is usually autogenerated by cmake). @@ -376,14 +378,14 @@ Collection of single-file libraries used in Godot components. * License: zlib - `stb_vorbis.c` * Upstream: https://github.com/nothings/stb - * Version: 1.16 + * Version: 1.17 * License: Public Domain (Unlicense) or MIT ## nanosvg - Upstream: https://github.com/memononen/nanosvg -- Version: git (c1f6e20, 2018) +- Version: git (25241c5, 2019) - License: zlib Files extracted from the upstream source: @@ -395,17 +397,19 @@ Files extracted from the upstream source: ## opus - Upstream: https://opus-codec.org -- Version: 1.1.5 (opus) and 0.8 (opusfile) +- Version: 1.3.1 (opus) and 0.11 (opusfile) - License: BSD-3-Clause Files extracted from upstream source: +- Run `opus/configure` and copy/sync changes to `config.h` + (note that this file may have Godot-specific options enabled) - all .c and .h files in src/ (both opus and opusfile) - all .h files in include/ (both opus and opusfile) as opus/ - remove unused `opus_demo.c`, - remove `http.c`, `wincerts.c` and `winerrno.h` (part of unused libopusurl) -- celt/ and silk/ subfolders +- celt/ and silk/ subfolders (minus tests folders) - COPYING @@ -468,7 +472,7 @@ comments and a patch is provided in the squish/ folder. ## tinyexr - Upstream: https://github.com/syoyo/tinyexr -- Version: git (65f9859, 2018) +- Version: git (656bb61, 2019) - License: BSD-3-Clause Files extracted from upstream source: @@ -479,7 +483,7 @@ Files extracted from upstream source: ## vhacd - Upstream: https://github.com/kmammou/v-hacd -- Version: git (2297aa1, 2018) +- Version: git (b07958e, 2019) - License: BSD-3-Clause Files extracted from upstream source: @@ -534,7 +538,7 @@ Files extracted from upstream source: ## zstd - Upstream: https://github.com/facebook/zstd -- Version: 1.4.1 +- Version: 1.4.4 - License: BSD-3-Clause Files extracted from upstream source: diff --git a/thirdparty/assimp/code/Common/BaseImporter.cpp b/thirdparty/assimp/code/Common/BaseImporter.cpp index de5018a250..5c1e605549 100644 --- a/thirdparty/assimp/code/Common/BaseImporter.cpp +++ b/thirdparty/assimp/code/Common/BaseImporter.cpp @@ -67,7 +67,20 @@ using namespace Assimp; // Constructor to be privately used by Importer BaseImporter::BaseImporter() AI_NO_EXCEPT : m_progress() { - // nothing to do here + /** + * Assimp Importer + * unit conversions available + * if you need another measurment unit add it below. + * it's currently defined in assimp that we prefer meters. + * + * NOTE: Initialised here rather than in the header file + * to workaround a VS2013 bug with brace initialisers + * */ + importerUnits[ImporterUnits::M] = 1.0; + importerUnits[ImporterUnits::CM] = 0.01; + importerUnits[ImporterUnits::MM] = 0.001; + importerUnits[ImporterUnits::INCHES] = 0.0254; + importerUnits[ImporterUnits::FEET] = 0.3048; } // ------------------------------------------------------------------------------------------------ @@ -85,7 +98,7 @@ void BaseImporter::UpdateImporterScale( Importer* pImp ) double activeScale = importerScale * fileScale; // Set active scaling - pImp->SetPropertyFloat( AI_CONFIG_APP_SCALE_KEY, activeScale); + pImp->SetPropertyFloat( AI_CONFIG_APP_SCALE_KEY, static_cast<float>( activeScale) ); ASSIMP_LOG_DEBUG_F("UpdateImporterScale scale set: %f", activeScale ); } diff --git a/thirdparty/assimp/code/Common/DefaultIOSystem.cpp b/thirdparty/assimp/code/Common/DefaultIOSystem.cpp index d40b67de32..6fdc24dd80 100644 --- a/thirdparty/assimp/code/Common/DefaultIOSystem.cpp +++ b/thirdparty/assimp/code/Common/DefaultIOSystem.cpp @@ -61,83 +61,66 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. using namespace Assimp; -// maximum path length -// XXX http://insanecoding.blogspot.com/2007/11/pathmax-simply-isnt.html -#ifdef PATH_MAX -# define PATHLIMIT PATH_MAX -#else -# define PATHLIMIT 4096 +#ifdef _WIN32 +static std::wstring Utf8ToWide(const char* in) +{ + int size = MultiByteToWideChar(CP_UTF8, 0, in, -1, nullptr, 0); + // size includes terminating null; std::wstring adds null automatically + std::wstring out(static_cast<size_t>(size) - 1, L'\0'); + MultiByteToWideChar(CP_UTF8, 0, in, -1, &out[0], size); + return out; +} + +static std::string WideToUtf8(const wchar_t* in) +{ + int size = WideCharToMultiByte(CP_UTF8, 0, in, -1, nullptr, 0, nullptr, nullptr); + // size includes terminating null; std::string adds null automatically + std::string out(static_cast<size_t>(size) - 1, '\0'); + WideCharToMultiByte(CP_UTF8, 0, in, -1, &out[0], size, nullptr, nullptr); + return out; +} #endif // ------------------------------------------------------------------------------------------------ // Tests for the existence of a file at the given path. -bool DefaultIOSystem::Exists( const char* pFile) const +bool DefaultIOSystem::Exists(const char* pFile) const { #ifdef _WIN32 - wchar_t fileName16[PATHLIMIT]; - -#ifndef WindowsStore - bool isUnicode = IsTextUnicode(pFile, static_cast<int>(strlen(pFile)), NULL) != 0; - if (isUnicode) { - - MultiByteToWideChar(CP_UTF8, MB_PRECOMPOSED, pFile, -1, fileName16, PATHLIMIT); - struct __stat64 filestat; - if (0 != _wstat64(fileName16, &filestat)) { - return false; - } - } else { -#endif - FILE* file = ::fopen(pFile, "rb"); - if (!file) - return false; - - ::fclose(file); -#ifndef WindowsStore + struct __stat64 filestat; + if (_wstat64(Utf8ToWide(pFile).c_str(), &filestat) != 0) { + return false; } -#endif #else - FILE* file = ::fopen( pFile, "rb"); - if( !file) + FILE* file = ::fopen(pFile, "rb"); + if (!file) return false; - ::fclose( file); + ::fclose(file); #endif return true; } // ------------------------------------------------------------------------------------------------ // Open a new file with a given path. -IOStream* DefaultIOSystem::Open( const char* strFile, const char* strMode) +IOStream* DefaultIOSystem::Open(const char* strFile, const char* strMode) { - ai_assert(NULL != strFile); - ai_assert(NULL != strMode); + ai_assert(strFile != nullptr); + ai_assert(strMode != nullptr); FILE* file; #ifdef _WIN32 - wchar_t fileName16[PATHLIMIT]; -#ifndef WindowsStore - bool isUnicode = IsTextUnicode(strFile, static_cast<int>(strlen(strFile)), NULL) != 0; - if (isUnicode) { - MultiByteToWideChar(CP_UTF8, MB_PRECOMPOSED, strFile, -1, fileName16, PATHLIMIT); - std::string mode8(strMode); - file = ::_wfopen(fileName16, std::wstring(mode8.begin(), mode8.end()).c_str()); - } else { -#endif - file = ::fopen(strFile, strMode); -#ifndef WindowsStore - } -#endif + file = ::_wfopen(Utf8ToWide(strFile).c_str(), Utf8ToWide(strMode).c_str()); #else file = ::fopen(strFile, strMode); #endif - if (nullptr == file) + if (!file) return nullptr; - return new DefaultIOStream(file, (std::string) strFile); + return new DefaultIOStream(file, strFile); } // ------------------------------------------------------------------------------------------------ // Closes the given file and releases all resources associated with it. -void DefaultIOSystem::Close( IOStream* pFile) +void DefaultIOSystem::Close(IOStream* pFile) { delete pFile; } @@ -155,78 +138,56 @@ char DefaultIOSystem::getOsSeparator() const // ------------------------------------------------------------------------------------------------ // IOSystem default implementation (ComparePaths isn't a pure virtual function) -bool IOSystem::ComparePaths (const char* one, const char* second) const +bool IOSystem::ComparePaths(const char* one, const char* second) const { - return !ASSIMP_stricmp(one,second); + return !ASSIMP_stricmp(one, second); } // ------------------------------------------------------------------------------------------------ // Convert a relative path into an absolute path -inline static void MakeAbsolutePath (const char* in, char* _out) +inline static std::string MakeAbsolutePath(const char* in) { - ai_assert(in && _out); -#if defined( _MSC_VER ) || defined( __MINGW32__ ) -#ifndef WindowsStore - bool isUnicode = IsTextUnicode(in, static_cast<int>(strlen(in)), NULL) != 0; - if (isUnicode) { - wchar_t out16[PATHLIMIT]; - wchar_t in16[PATHLIMIT]; - MultiByteToWideChar(CP_UTF8, MB_PRECOMPOSED, in, -1, out16, PATHLIMIT); - wchar_t* ret = ::_wfullpath(out16, in16, PATHLIMIT); - if (ret) { - WideCharToMultiByte(CP_UTF8, MB_PRECOMPOSED, out16, -1, _out, PATHLIMIT, nullptr, nullptr); - } - if (!ret) { - // preserve the input path, maybe someone else is able to fix - // the path before it is accessed (e.g. our file system filter) - ASSIMP_LOG_WARN_F("Invalid path: ", std::string(in)); - strcpy(_out, in); - } - - } else { -#endif - char* ret = :: _fullpath(_out, in, PATHLIMIT); - if (!ret) { - // preserve the input path, maybe someone else is able to fix - // the path before it is accessed (e.g. our file system filter) - ASSIMP_LOG_WARN_F("Invalid path: ", std::string(in)); - strcpy(_out, in); - } -#ifndef WindowsStore + ai_assert(in); + std::string out; +#ifdef _WIN32 + wchar_t* ret = ::_wfullpath(nullptr, Utf8ToWide(in).c_str(), 0); + if (ret) { + out = WideToUtf8(ret); + free(ret); } -#endif #else - // use realpath - char* ret = realpath(in, _out); - if(!ret) { + char* ret = realpath(in, nullptr); + if (ret) { + out = ret; + free(ret); + } +#endif + if (!ret) { // preserve the input path, maybe someone else is able to fix // the path before it is accessed (e.g. our file system filter) ASSIMP_LOG_WARN_F("Invalid path: ", std::string(in)); - strcpy(_out,in); + out = in; } -#endif + return out; } // ------------------------------------------------------------------------------------------------ // DefaultIOSystem's more specialized implementation -bool DefaultIOSystem::ComparePaths (const char* one, const char* second) const +bool DefaultIOSystem::ComparePaths(const char* one, const char* second) const { // chances are quite good both paths are formatted identically, // so we can hopefully return here already - if( !ASSIMP_stricmp(one,second) ) + if (!ASSIMP_stricmp(one, second)) return true; - char temp1[PATHLIMIT]; - char temp2[PATHLIMIT]; - - MakeAbsolutePath (one, temp1); - MakeAbsolutePath (second, temp2); + std::string temp1 = MakeAbsolutePath(one); + std::string temp2 = MakeAbsolutePath(second); - return !ASSIMP_stricmp(temp1,temp2); + return !ASSIMP_stricmp(temp1, temp2); } // ------------------------------------------------------------------------------------------------ -std::string DefaultIOSystem::fileName( const std::string &path ) +std::string DefaultIOSystem::fileName(const std::string& path) { std::string ret = path; std::size_t last = ret.find_last_of("\\/"); @@ -235,16 +196,16 @@ std::string DefaultIOSystem::fileName( const std::string &path ) } // ------------------------------------------------------------------------------------------------ -std::string DefaultIOSystem::completeBaseName( const std::string &path ) +std::string DefaultIOSystem::completeBaseName(const std::string& path) { std::string ret = fileName(path); std::size_t pos = ret.find_last_of('.'); - if(pos != ret.npos) ret = ret.substr(0, pos); + if (pos != std::string::npos) ret = ret.substr(0, pos); return ret; } // ------------------------------------------------------------------------------------------------ -std::string DefaultIOSystem::absolutePath( const std::string &path ) +std::string DefaultIOSystem::absolutePath(const std::string& path) { std::string ret = path; std::size_t last = ret.find_last_of("\\/"); @@ -253,5 +214,3 @@ std::string DefaultIOSystem::absolutePath( const std::string &path ) } // ------------------------------------------------------------------------------------------------ - -#undef PATHLIMIT diff --git a/thirdparty/assimp/code/Common/Exporter.cpp b/thirdparty/assimp/code/Common/Exporter.cpp index 090b561ae0..4ce1a2bd80 100644 --- a/thirdparty/assimp/code/Common/Exporter.cpp +++ b/thirdparty/assimp/code/Common/Exporter.cpp @@ -102,6 +102,8 @@ void ExportSceneX3D(const char*, IOSystem*, const aiScene*, const ExportProperti void ExportSceneFBX(const char*, IOSystem*, const aiScene*, const ExportProperties*); void ExportSceneFBXA(const char*, IOSystem*, const aiScene*, const ExportProperties*); void ExportScene3MF( const char*, IOSystem*, const aiScene*, const ExportProperties* ); +void ExportSceneM3D(const char*, IOSystem*, const aiScene*, const ExportProperties*); +void ExportSceneA3D(const char*, IOSystem*, const aiScene*, const ExportProperties*); void ExportAssimp2Json(const char* , IOSystem*, const aiScene* , const Assimp::ExportProperties*); // ------------------------------------------------------------------------------------------------ @@ -179,6 +181,11 @@ Exporter::ExportFormatEntry gExporters[] = Exporter::ExportFormatEntry( "fbxa", "Autodesk FBX (ascii)", "fbx", &ExportSceneFBXA, 0 ), #endif +#ifndef ASSIMP_BUILD_NO_M3D_EXPORTER + Exporter::ExportFormatEntry( "m3d", "Model 3D (binary)", "m3d", &ExportSceneM3D, 0 ), + Exporter::ExportFormatEntry( "a3d", "Model 3D (ascii)", "m3d", &ExportSceneA3D, 0 ), +#endif + #ifndef ASSIMP_BUILD_NO_3MF_EXPORTER Exporter::ExportFormatEntry( "3mf", "The 3MF-File-Format", "3mf", &ExportScene3MF, 0 ), #endif @@ -316,34 +323,6 @@ const aiExportDataBlob* Exporter::ExportToBlob( const aiScene* pScene, const cha } // ------------------------------------------------------------------------------------------------ -bool IsVerboseFormat(const aiMesh* mesh) { - // avoid slow vector<bool> specialization - std::vector<unsigned int> seen(mesh->mNumVertices,0); - for(unsigned int i = 0; i < mesh->mNumFaces; ++i) { - const aiFace& f = mesh->mFaces[i]; - for(unsigned int j = 0; j < f.mNumIndices; ++j) { - if(++seen[f.mIndices[j]] == 2) { - // found a duplicate index - return false; - } - } - } - - return true; -} - -// ------------------------------------------------------------------------------------------------ -bool IsVerboseFormat(const aiScene* pScene) { - for(unsigned int i = 0; i < pScene->mNumMeshes; ++i) { - if(!IsVerboseFormat(pScene->mMeshes[i])) { - return false; - } - } - - return true; -} - -// ------------------------------------------------------------------------------------------------ aiReturn Exporter::Export( const aiScene* pScene, const char* pFormatId, const char* pPath, unsigned int pPreprocessing, const ExportProperties* pProperties) { ASSIMP_BEGIN_EXCEPTION_REGION(); @@ -352,7 +331,7 @@ aiReturn Exporter::Export( const aiScene* pScene, const char* pFormatId, const c // format. They will likely not be aware that there is a flag in the scene to indicate // this, however. To avoid surprises and bug reports, we check for duplicates in // meshes upfront. - const bool is_verbose_format = !(pScene->mFlags & AI_SCENE_FLAGS_NON_VERBOSE_FORMAT) || IsVerboseFormat(pScene); + const bool is_verbose_format = !(pScene->mFlags & AI_SCENE_FLAGS_NON_VERBOSE_FORMAT) || MakeVerboseFormatProcess::IsVerboseFormat(pScene); pimpl->mProgressHandler->UpdateFileWrite(0, 4); @@ -472,7 +451,10 @@ aiReturn Exporter::Export( const aiScene* pScene, const char* pFormatId, const c } ExportProperties emptyProperties; // Never pass NULL ExportProperties so Exporters don't have to worry. - exp.mExportFunction(pPath,pimpl->mIOSystem.get(),scenecopy.get(), pProperties ? pProperties : &emptyProperties); + ExportProperties* pProp = pProperties ? (ExportProperties*)pProperties : &emptyProperties; + pProp->SetPropertyBool("bJoinIdenticalVertices", must_join_again); + exp.mExportFunction(pPath,pimpl->mIOSystem.get(),scenecopy.get(), pProp); + exp.mExportFunction(pPath,pimpl->mIOSystem.get(),scenecopy.get(), pProp); pimpl->mProgressHandler->UpdateFileWrite(4, 4); } catch (DeadlyExportError& err) { diff --git a/thirdparty/assimp/code/Common/ImporterRegistry.cpp b/thirdparty/assimp/code/Common/ImporterRegistry.cpp index 32ac3b4168..b9f28f0356 100644 --- a/thirdparty/assimp/code/Common/ImporterRegistry.cpp +++ b/thirdparty/assimp/code/Common/ImporterRegistry.cpp @@ -197,6 +197,9 @@ corresponding preprocessor flag to selectively disable formats. #ifndef ASSIMP_BUILD_NO_MMD_IMPORTER # include "MMD/MMDImporter.h" #endif +#ifndef ASSIMP_BUILD_NO_M3D_IMPORTER +# include "M3D/M3DImporter.h" +#endif #ifndef ASSIMP_BUILD_NO_STEP_IMPORTER # include "Importer/StepFile/StepFileImporter.h" #endif @@ -223,6 +226,9 @@ void GetImporterInstanceList(std::vector< BaseImporter* >& out) #if (!defined ASSIMP_BUILD_NO_3DS_IMPORTER) out.push_back( new Discreet3DSImporter()); #endif +#if (!defined ASSIMP_BUILD_NO_M3D_IMPORTER) + out.push_back( new M3DImporter()); +#endif #if (!defined ASSIMP_BUILD_NO_MD3_IMPORTER) out.push_back( new MD3Importer()); #endif diff --git a/thirdparty/assimp/code/Common/PostStepRegistry.cpp b/thirdparty/assimp/code/Common/PostStepRegistry.cpp index ef58f8ddfd..8ff4af0400 100644 --- a/thirdparty/assimp/code/Common/PostStepRegistry.cpp +++ b/thirdparty/assimp/code/Common/PostStepRegistry.cpp @@ -131,11 +131,15 @@ corresponding preprocessor flag to selectively disable steps. #if (!defined ASSIMP_BUILD_NO_GLOBALSCALE_PROCESS) # include "PostProcessing/ScaleProcess.h" #endif +#if (!defined ASSIMP_BUILD_NO_ARMATUREPOPULATE_PROCESS) +# include "PostProcessing/ArmaturePopulate.h" +#endif #if (!defined ASSIMP_BUILD_NO_GENBOUNDINGBOXES_PROCESS) # include "PostProcessing/GenBoundingBoxesProcess.h" #endif + namespace Assimp { // ------------------------------------------------------------------------------------------------ @@ -180,6 +184,9 @@ void GetPostProcessingStepInstanceList(std::vector< BaseProcess* >& out) #if (!defined ASSIMP_BUILD_NO_GLOBALSCALE_PROCESS) out.push_back( new ScaleProcess()); #endif +#if (!defined ASSIMP_BUILD_NO_ARMATUREPOPULATE_PROCESS) + out.push_back( new ArmaturePopulate()); +#endif #if (!defined ASSIMP_BUILD_NO_PRETRANSFORMVERTICES_PROCESS) out.push_back( new PretransformVertices()); #endif diff --git a/thirdparty/assimp/code/Common/SceneCombiner.cpp b/thirdparty/assimp/code/Common/SceneCombiner.cpp index e445bd7434..f7b13cc951 100644 --- a/thirdparty/assimp/code/Common/SceneCombiner.cpp +++ b/thirdparty/assimp/code/Common/SceneCombiner.cpp @@ -1091,6 +1091,35 @@ void SceneCombiner::Copy( aiMesh** _dest, const aiMesh* src ) { aiFace& f = dest->mFaces[i]; GetArrayCopy(f.mIndices,f.mNumIndices); } + + // make a deep copy of all blend shapes + CopyPtrArray(dest->mAnimMeshes, dest->mAnimMeshes, dest->mNumAnimMeshes); +} + +// ------------------------------------------------------------------------------------------------ +void SceneCombiner::Copy(aiAnimMesh** _dest, const aiAnimMesh* src) { + if (nullptr == _dest || nullptr == src) { + return; + } + + aiAnimMesh* dest = *_dest = new aiAnimMesh(); + + // get a flat copy + ::memcpy(dest, src, sizeof(aiAnimMesh)); + + // and reallocate all arrays + GetArrayCopy(dest->mVertices, dest->mNumVertices); + GetArrayCopy(dest->mNormals, dest->mNumVertices); + GetArrayCopy(dest->mTangents, dest->mNumVertices); + GetArrayCopy(dest->mBitangents, dest->mNumVertices); + + unsigned int n = 0; + while (dest->HasTextureCoords(n)) + GetArrayCopy(dest->mTextureCoords[n++], dest->mNumVertices); + + n = 0; + while (dest->HasVertexColors(n)) + GetArrayCopy(dest->mColors[n++], dest->mNumVertices); } // ------------------------------------------------------------------------------------------------ @@ -1167,6 +1196,7 @@ void SceneCombiner::Copy( aiAnimation** _dest, const aiAnimation* src ) { // and reallocate all arrays CopyPtrArray( dest->mChannels, src->mChannels, dest->mNumChannels ); + CopyPtrArray( dest->mMorphMeshChannels, src->mMorphMeshChannels, dest->mNumMorphMeshChannels ); } // ------------------------------------------------------------------------------------------------ @@ -1186,6 +1216,26 @@ void SceneCombiner::Copy(aiNodeAnim** _dest, const aiNodeAnim* src) { GetArrayCopy( dest->mRotationKeys, dest->mNumRotationKeys ); } +void SceneCombiner::Copy(aiMeshMorphAnim** _dest, const aiMeshMorphAnim* src) { + if ( nullptr == _dest || nullptr == src ) { + return; + } + + aiMeshMorphAnim* dest = *_dest = new aiMeshMorphAnim(); + + // get a flat copy + ::memcpy(dest,src,sizeof(aiMeshMorphAnim)); + + // and reallocate all arrays + GetArrayCopy( dest->mKeys, dest->mNumKeys ); + for (ai_uint i = 0; i < dest->mNumKeys;++i) { + dest->mKeys[i].mValues = new unsigned int[dest->mKeys[i].mNumValuesAndWeights]; + dest->mKeys[i].mWeights = new double[dest->mKeys[i].mNumValuesAndWeights]; + ::memcpy(dest->mKeys[i].mValues, src->mKeys[i].mValues, dest->mKeys[i].mNumValuesAndWeights * sizeof(unsigned int)); + ::memcpy(dest->mKeys[i].mWeights, src->mKeys[i].mWeights, dest->mKeys[i].mNumValuesAndWeights * sizeof(double)); + } +} + // ------------------------------------------------------------------------------------------------ void SceneCombiner::Copy( aiCamera** _dest,const aiCamera* src) { if ( nullptr == _dest || nullptr == src ) { diff --git a/thirdparty/assimp/code/Common/Version.cpp b/thirdparty/assimp/code/Common/Version.cpp index cc94340ac8..cf1da7d5ba 100644 --- a/thirdparty/assimp/code/Common/Version.cpp +++ b/thirdparty/assimp/code/Common/Version.cpp @@ -46,8 +46,7 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #include <assimp/scene.h> #include "ScenePrivate.h" -static const unsigned int MajorVersion = 4; -static const unsigned int MinorVersion = 1; +#include "revision.h" // -------------------------------------------------------------------------------- // Legal information string - don't remove this. @@ -56,9 +55,9 @@ static const char* LEGAL_INFORMATION = "Open Asset Import Library (Assimp).\n" "A free C/C++ library to import various 3D file formats into applications\n\n" -"(c) 2008-2017, assimp team\n" +"(c) 2006-2019, assimp team\n" "License under the terms and conditions of the 3-clause BSD license\n" -"http://assimp.sourceforge.net\n" +"http://assimp.org\n" ; // ------------------------------------------------------------------------------------------------ @@ -70,13 +69,13 @@ ASSIMP_API const char* aiGetLegalString () { // ------------------------------------------------------------------------------------------------ // Get Assimp minor version ASSIMP_API unsigned int aiGetVersionMinor () { - return MinorVersion; + return VER_MINOR; } // ------------------------------------------------------------------------------------------------ // Get Assimp major version ASSIMP_API unsigned int aiGetVersionMajor () { - return MajorVersion; + return VER_MAJOR; } // ------------------------------------------------------------------------------------------------ @@ -104,9 +103,6 @@ ASSIMP_API unsigned int aiGetCompileFlags () { return flags; } -// include current build revision, which is even updated from time to time -- :-) -#include "revision.h" - // ------------------------------------------------------------------------------------------------ ASSIMP_API unsigned int aiGetVersionRevision() { return GitVersion; diff --git a/thirdparty/assimp/code/Common/scene.cpp b/thirdparty/assimp/code/Common/scene.cpp index 2acb348d81..d15619acff 100644 --- a/thirdparty/assimp/code/Common/scene.cpp +++ b/thirdparty/assimp/code/Common/scene.cpp @@ -44,23 +44,23 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. aiNode::aiNode() : mName("") -, mParent(NULL) +, mParent(nullptr) , mNumChildren(0) -, mChildren(NULL) +, mChildren(nullptr) , mNumMeshes(0) -, mMeshes(NULL) -, mMetaData(NULL) { +, mMeshes(nullptr) +, mMetaData(nullptr) { // empty } aiNode::aiNode(const std::string& name) : mName(name) -, mParent(NULL) +, mParent(nullptr) , mNumChildren(0) -, mChildren(NULL) +, mChildren(nullptr) , mNumMeshes(0) -, mMeshes(NULL) -, mMetaData(NULL) { +, mMeshes(nullptr) +, mMetaData(nullptr) { // empty } @@ -68,7 +68,7 @@ aiNode::aiNode(const std::string& name) aiNode::~aiNode() { // delete all children recursively // to make sure we won't crash if the data is invalid ... - if (mChildren && mNumChildren) + if (mNumChildren && mChildren) { for (unsigned int a = 0; a < mNumChildren; a++) delete mChildren[a]; diff --git a/thirdparty/assimp/code/FBX/FBXCompileConfig.h b/thirdparty/assimp/code/FBX/FBXCompileConfig.h index 3a3841fa5b..03536a1823 100644 --- a/thirdparty/assimp/code/FBX/FBXCompileConfig.h +++ b/thirdparty/assimp/code/FBX/FBXCompileConfig.h @@ -47,6 +47,7 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #define INCLUDED_AI_FBX_COMPILECONFIG_H #include <map> +#include <set> // #if _MSC_VER > 1500 || (defined __GNUC___) @@ -54,16 +55,23 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # else # define fbx_unordered_map map # define fbx_unordered_multimap multimap +# define fbx_unordered_set set +# define fbx_unordered_multiset multiset #endif #ifdef ASSIMP_FBX_USE_UNORDERED_MULTIMAP # include <unordered_map> +# include <unordered_set> # if _MSC_VER > 1600 # define fbx_unordered_map unordered_map # define fbx_unordered_multimap unordered_multimap +# define fbx_unordered_set unordered_set +# define fbx_unordered_multiset unordered_multiset # else # define fbx_unordered_map tr1::unordered_map # define fbx_unordered_multimap tr1::unordered_multimap +# define fbx_unordered_set tr1::unordered_set +# define fbx_unordered_multiset tr1::unordered_multiset # endif #endif diff --git a/thirdparty/assimp/code/FBX/FBXConverter.cpp b/thirdparty/assimp/code/FBX/FBXConverter.cpp index 3f64016ea4..d8a22d9f74 100644 --- a/thirdparty/assimp/code/FBX/FBXConverter.cpp +++ b/thirdparty/assimp/code/FBX/FBXConverter.cpp @@ -55,6 +55,7 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #include "FBXImporter.h" #include <assimp/StringComparison.h> +#include <assimp/MathFunctions.h> #include <assimp/scene.h> @@ -67,7 +68,8 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #include <sstream> #include <iomanip> #include <cstdint> - +#include <iostream> +#include <stdlib.h> namespace Assimp { namespace FBX { @@ -76,7 +78,7 @@ namespace Assimp { #define MAGIC_NODE_TAG "_$AssimpFbx$" -#define CONVERT_FBX_TIME(time) static_cast<double>(time) / 46186158000L +#define CONVERT_FBX_TIME(time) static_cast<double>(time) / 46186158000LL FBXConverter::FBXConverter(aiScene* out, const Document& doc, bool removeEmptyBones ) : defaultMaterialIndex() @@ -95,6 +97,14 @@ namespace Assimp { // populate the node_anim_chain_bits map, which is needed // to determine which nodes need to be generated. ConvertAnimations(); + // Embedded textures in FBX could be connected to nothing but to itself, + // for instance Texture -> Video connection only but not to the main graph, + // The idea here is to traverse all objects to find these Textures and convert them, + // so later during material conversion it will find converted texture in the textures_converted array. + if (doc.Settings().readTextures) + { + ConvertOrphantEmbeddedTextures(); + } ConvertRootNode(); if (doc.Settings().readAllMaterials) { @@ -144,7 +154,7 @@ namespace Assimp { out->mRootNode->mName.Set(unique_name); // root has ID 0 - ConvertNodes(0L, *out->mRootNode); + ConvertNodes(0L, out->mRootNode, out->mRootNode); } static std::string getAncestorBaseName(const aiNode* node) @@ -178,8 +188,11 @@ namespace Assimp { GetUniqueName(original_name, unique_name); return unique_name; } - - void FBXConverter::ConvertNodes(uint64_t id, aiNode& parent, const aiMatrix4x4& parent_transform) { + /// todo: pre-build node hierarchy + /// todo: get bone from stack + /// todo: make map of aiBone* to aiNode* + /// then update convert clusters to the new format + void FBXConverter::ConvertNodes(uint64_t id, aiNode *parent, aiNode *root_node) { const std::vector<const Connection*>& conns = doc.GetConnectionsByDestinationSequenced(id, "Model"); std::vector<aiNode*> nodes; @@ -190,62 +203,69 @@ namespace Assimp { try { for (const Connection* con : conns) { - // ignore object-property links if (con->PropertyName().length()) { - continue; + // really important we document why this is ignored. + FBXImporter::LogInfo("ignoring property link - no docs on why this is ignored"); + continue; //? } + // convert connection source object into Object base class const Object* const object = con->SourceObject(); if (nullptr == object) { - FBXImporter::LogWarn("failed to convert source object for Model link"); + FBXImporter::LogError("failed to convert source object for Model link"); continue; } + // FBX Model::Cube, Model::Bone001, etc elements + // This detects if we can cast the object into this model structure. const Model* const model = dynamic_cast<const Model*>(object); if (nullptr != model) { nodes_chain.clear(); post_nodes_chain.clear(); - aiMatrix4x4 new_abs_transform = parent_transform; - - std::string unique_name = MakeUniqueNodeName(model, parent); - + aiMatrix4x4 new_abs_transform = parent->mTransformation; + std::string node_name = FixNodeName(model->Name()); // even though there is only a single input node, the design of // assimp (or rather: the complicated transformation chain that // is employed by fbx) means that we may need multiple aiNode's // to represent a fbx node's transformation. - const bool need_additional_node = GenerateTransformationNodeChain(*model, unique_name, nodes_chain, post_nodes_chain); + + // generate node transforms - this includes pivot data + // if need_additional_node is true then you t + const bool need_additional_node = GenerateTransformationNodeChain(*model, node_name, nodes_chain, post_nodes_chain); + + // assert that for the current node we must have at least a single transform ai_assert(nodes_chain.size()); if (need_additional_node) { - nodes_chain.push_back(new aiNode(unique_name)); + nodes_chain.push_back(new aiNode(node_name)); } //setup metadata on newest node SetupNodeMetadata(*model, *nodes_chain.back()); // link all nodes in a row - aiNode* last_parent = &parent; - for (aiNode* prenode : nodes_chain) { - ai_assert(prenode); + aiNode* last_parent = parent; + for (aiNode* child : nodes_chain) { + ai_assert(child); - if (last_parent != &parent) { + if (last_parent != parent) { last_parent->mNumChildren = 1; last_parent->mChildren = new aiNode*[1]; - last_parent->mChildren[0] = prenode; + last_parent->mChildren[0] = child; } - prenode->mParent = last_parent; - last_parent = prenode; + child->mParent = last_parent; + last_parent = child; - new_abs_transform *= prenode->mTransformation; + new_abs_transform *= child->mTransformation; } // attach geometry - ConvertModel(*model, *nodes_chain.back(), new_abs_transform); + ConvertModel(*model, nodes_chain.back(), root_node, new_abs_transform); // check if there will be any child nodes const std::vector<const Connection*>& child_conns @@ -257,7 +277,7 @@ namespace Assimp { for (aiNode* postnode : post_nodes_chain) { ai_assert(postnode); - if (last_parent != &parent) { + if (last_parent != parent) { last_parent->mNumChildren = 1; last_parent->mChildren = new aiNode*[1]; last_parent->mChildren[0] = postnode; @@ -279,15 +299,15 @@ namespace Assimp { ); } - // attach sub-nodes (if any) - ConvertNodes(model->ID(), *last_parent, new_abs_transform); + // recursion call - child nodes + ConvertNodes(model->ID(), last_parent, root_node); if (doc.Settings().readLights) { - ConvertLights(*model, unique_name); + ConvertLights(*model, node_name); } if (doc.Settings().readCameras) { - ConvertCameras(*model, unique_name); + ConvertCameras(*model, node_name); } nodes.push_back(nodes_chain.front()); @@ -296,11 +316,17 @@ namespace Assimp { } if (nodes.size()) { - parent.mChildren = new aiNode*[nodes.size()](); - parent.mNumChildren = static_cast<unsigned int>(nodes.size()); + parent->mChildren = new aiNode*[nodes.size()](); + parent->mNumChildren = static_cast<unsigned int>(nodes.size()); - std::swap_ranges(nodes.begin(), nodes.end(), parent.mChildren); + std::swap_ranges(nodes.begin(), nodes.end(), parent->mChildren); } + else + { + parent->mNumChildren = 0; + parent->mChildren = nullptr; + } + } catch (std::exception&) { Util::delete_fun<aiNode> deleter; @@ -553,7 +579,7 @@ namespace Assimp { return; } - const float angle_epsilon = 1e-6f; + const float angle_epsilon = Math::getEpsilon<float>(); out = aiMatrix4x4(); @@ -694,7 +720,7 @@ namespace Assimp { std::fill_n(chain, static_cast<unsigned int>(TransformationComp_MAXIMUM), aiMatrix4x4()); // generate transformation matrices for all the different transformation components - const float zero_epsilon = 1e-6f; + const float zero_epsilon = Math::getEpsilon<float>(); const aiVector3D all_ones(1.0f, 1.0f, 1.0f); const aiVector3D& PreRotation = PropertyGet<aiVector3D>(props, "PreRotation", ok); @@ -802,7 +828,7 @@ namespace Assimp { // is_complex needs to be consistent with NeedsComplexTransformationChain() // or the interplay between this code and the animation converter would // not be guaranteed. - ai_assert(NeedsComplexTransformationChain(model) == ((chainBits & chainMaskComplex) != 0)); + //ai_assert(NeedsComplexTransformationChain(model) == ((chainBits & chainMaskComplex) != 0)); // now, if we have more than just Translation, Scaling and Rotation, // we need to generate a full node chain to accommodate for assimp's @@ -904,7 +930,8 @@ namespace Assimp { } } - void FBXConverter::ConvertModel(const Model& model, aiNode& nd, const aiMatrix4x4& node_global_transform) + void FBXConverter::ConvertModel(const Model &model, aiNode *parent, aiNode *root_node, + const aiMatrix4x4 &absolute_transform) { const std::vector<const Geometry*>& geos = model.GetGeometry(); @@ -916,11 +943,12 @@ namespace Assimp { const MeshGeometry* const mesh = dynamic_cast<const MeshGeometry*>(geo); const LineGeometry* const line = dynamic_cast<const LineGeometry*>(geo); if (mesh) { - const std::vector<unsigned int>& indices = ConvertMesh(*mesh, model, node_global_transform, nd); + const std::vector<unsigned int>& indices = ConvertMesh(*mesh, model, parent, root_node, + absolute_transform); std::copy(indices.begin(), indices.end(), std::back_inserter(meshes)); } else if (line) { - const std::vector<unsigned int>& indices = ConvertLine(*line, model, node_global_transform, nd); + const std::vector<unsigned int>& indices = ConvertLine(*line, model, parent, root_node); std::copy(indices.begin(), indices.end(), std::back_inserter(meshes)); } else { @@ -929,15 +957,16 @@ namespace Assimp { } if (meshes.size()) { - nd.mMeshes = new unsigned int[meshes.size()](); - nd.mNumMeshes = static_cast<unsigned int>(meshes.size()); + parent->mMeshes = new unsigned int[meshes.size()](); + parent->mNumMeshes = static_cast<unsigned int>(meshes.size()); - std::swap_ranges(meshes.begin(), meshes.end(), nd.mMeshes); + std::swap_ranges(meshes.begin(), meshes.end(), parent->mMeshes); } } - std::vector<unsigned int> FBXConverter::ConvertMesh(const MeshGeometry& mesh, const Model& model, - const aiMatrix4x4& node_global_transform, aiNode& nd) + std::vector<unsigned int> + FBXConverter::ConvertMesh(const MeshGeometry &mesh, const Model &model, aiNode *parent, aiNode *root_node, + const aiMatrix4x4 &absolute_transform) { std::vector<unsigned int> temp; @@ -961,18 +990,18 @@ namespace Assimp { const MatIndexArray::value_type base = mindices[0]; for (MatIndexArray::value_type index : mindices) { if (index != base) { - return ConvertMeshMultiMaterial(mesh, model, node_global_transform, nd); + return ConvertMeshMultiMaterial(mesh, model, parent, root_node, absolute_transform); } } } // faster code-path, just copy the data - temp.push_back(ConvertMeshSingleMaterial(mesh, model, node_global_transform, nd)); + temp.push_back(ConvertMeshSingleMaterial(mesh, model, absolute_transform, parent, root_node)); return temp; } std::vector<unsigned int> FBXConverter::ConvertLine(const LineGeometry& line, const Model& model, - const aiMatrix4x4& node_global_transform, aiNode& nd) + aiNode *parent, aiNode *root_node) { std::vector<unsigned int> temp; @@ -983,7 +1012,7 @@ namespace Assimp { return temp; } - aiMesh* const out_mesh = SetupEmptyMesh(line, nd); + aiMesh* const out_mesh = SetupEmptyMesh(line, root_node); out_mesh->mPrimitiveTypes |= aiPrimitiveType_LINE; // copy vertices @@ -1018,7 +1047,7 @@ namespace Assimp { return temp; } - aiMesh* FBXConverter::SetupEmptyMesh(const Geometry& mesh, aiNode& nd) + aiMesh* FBXConverter::SetupEmptyMesh(const Geometry& mesh, aiNode *parent) { aiMesh* const out_mesh = new aiMesh(); meshes.push_back(out_mesh); @@ -1035,17 +1064,18 @@ namespace Assimp { } else { - out_mesh->mName = nd.mName; + out_mesh->mName = parent->mName; } return out_mesh; } - unsigned int FBXConverter::ConvertMeshSingleMaterial(const MeshGeometry& mesh, const Model& model, - const aiMatrix4x4& node_global_transform, aiNode& nd) + unsigned int FBXConverter::ConvertMeshSingleMaterial(const MeshGeometry &mesh, const Model &model, + const aiMatrix4x4 &absolute_transform, aiNode *parent, + aiNode *root_node) { const MatIndexArray& mindices = mesh.GetMaterialIndices(); - aiMesh* const out_mesh = SetupEmptyMesh(mesh, nd); + aiMesh* const out_mesh = SetupEmptyMesh(mesh, parent); const std::vector<aiVector3D>& vertices = mesh.GetVertices(); const std::vector<unsigned int>& faces = mesh.GetFaceIndexCounts(); @@ -1112,7 +1142,7 @@ namespace Assimp { binormals = &tempBinormals; } else { - binormals = NULL; + binormals = nullptr; } } @@ -1162,8 +1192,9 @@ namespace Assimp { ConvertMaterialForMesh(out_mesh, model, mesh, mindices[0]); } - if (doc.Settings().readWeights && mesh.DeformerSkin() != NULL) { - ConvertWeights(out_mesh, model, mesh, node_global_transform, NO_MATERIAL_SEPARATION); + if (doc.Settings().readWeights && mesh.DeformerSkin() != nullptr) { + ConvertWeights(out_mesh, model, mesh, absolute_transform, parent, root_node, NO_MATERIAL_SEPARATION, + nullptr); } std::vector<aiAnimMesh*> animMeshes; @@ -1208,8 +1239,10 @@ namespace Assimp { return static_cast<unsigned int>(meshes.size() - 1); } - std::vector<unsigned int> FBXConverter::ConvertMeshMultiMaterial(const MeshGeometry& mesh, const Model& model, - const aiMatrix4x4& node_global_transform, aiNode& nd) + std::vector<unsigned int> + FBXConverter::ConvertMeshMultiMaterial(const MeshGeometry &mesh, const Model &model, aiNode *parent, + aiNode *root_node, + const aiMatrix4x4 &absolute_transform) { const MatIndexArray& mindices = mesh.GetMaterialIndices(); ai_assert(mindices.size()); @@ -1220,7 +1253,7 @@ namespace Assimp { for (MatIndexArray::value_type index : mindices) { if (had.find(index) == had.end()) { - indices.push_back(ConvertMeshMultiMaterial(mesh, model, index, node_global_transform, nd)); + indices.push_back(ConvertMeshMultiMaterial(mesh, model, index, parent, root_node, absolute_transform)); had.insert(index); } } @@ -1228,18 +1261,18 @@ namespace Assimp { return indices; } - unsigned int FBXConverter::ConvertMeshMultiMaterial(const MeshGeometry& mesh, const Model& model, - MatIndexArray::value_type index, - const aiMatrix4x4& node_global_transform, - aiNode& nd) + unsigned int FBXConverter::ConvertMeshMultiMaterial(const MeshGeometry &mesh, const Model &model, + MatIndexArray::value_type index, + aiNode *parent, aiNode *root_node, + const aiMatrix4x4 &absolute_transform) { - aiMesh* const out_mesh = SetupEmptyMesh(mesh, nd); + aiMesh* const out_mesh = SetupEmptyMesh(mesh, parent); const MatIndexArray& mindices = mesh.GetMaterialIndices(); const std::vector<aiVector3D>& vertices = mesh.GetVertices(); const std::vector<unsigned int>& faces = mesh.GetFaceIndexCounts(); - const bool process_weights = doc.Settings().readWeights && mesh.DeformerSkin() != NULL; + const bool process_weights = doc.Settings().readWeights && mesh.DeformerSkin() != nullptr; unsigned int count_faces = 0; unsigned int count_vertices = 0; @@ -1299,7 +1332,7 @@ namespace Assimp { binormals = &tempBinormals; } else { - binormals = NULL; + binormals = nullptr; } } @@ -1398,7 +1431,7 @@ namespace Assimp { ConvertMaterialForMesh(out_mesh, model, mesh, index); if (process_weights) { - ConvertWeights(out_mesh, model, mesh, node_global_transform, index, &reverseMapping); + ConvertWeights(out_mesh, model, mesh, absolute_transform, parent, root_node, index, &reverseMapping); } std::vector<aiAnimMesh*> animMeshes; @@ -1448,10 +1481,10 @@ namespace Assimp { return static_cast<unsigned int>(meshes.size() - 1); } - void FBXConverter::ConvertWeights(aiMesh* out, const Model& model, const MeshGeometry& geo, - const aiMatrix4x4& node_global_transform, - unsigned int materialIndex, - std::vector<unsigned int>* outputVertStartIndices) + void FBXConverter::ConvertWeights(aiMesh *out, const Model &model, const MeshGeometry &geo, + const aiMatrix4x4 &absolute_transform, + aiNode *parent, aiNode *root_node, unsigned int materialIndex, + std::vector<unsigned int> *outputVertStartIndices) { ai_assert(geo.DeformerSkin()); @@ -1462,13 +1495,12 @@ namespace Assimp { const Skin& sk = *geo.DeformerSkin(); std::vector<aiBone*> bones; - bones.reserve(sk.Clusters().size()); const bool no_mat_check = materialIndex == NO_MATERIAL_SEPARATION; ai_assert(no_mat_check || outputVertStartIndices); try { - + // iterate over the sub deformers for (const Cluster* cluster : sk.Clusters()) { ai_assert(cluster); @@ -1482,15 +1514,16 @@ namespace Assimp { index_out_indices.clear(); out_indices.clear(); + // now check if *any* of these weights is contained in the output mesh, // taking notes so we don't need to do it twice. for (WeightIndexArray::value_type index : indices) { unsigned int count = 0; const unsigned int* const out_idx = geo.ToOutputVertexIndex(index, count); - // ToOutputVertexIndex only returns NULL if index is out of bounds + // ToOutputVertexIndex only returns nullptr if index is out of bounds // which should never happen - ai_assert(out_idx != NULL); + ai_assert(out_idx != nullptr); index_out_indices.push_back(no_index_sentinel); count_out_indices.push_back(0); @@ -1519,68 +1552,107 @@ namespace Assimp { } } } - + // if we found at least one, generate the output bones // XXX this could be heavily simplified by collecting the bone // data in a single step. - ConvertCluster(bones, model, *cluster, out_indices, index_out_indices, - count_out_indices, node_global_transform); + ConvertCluster(bones, cluster, out_indices, index_out_indices, + count_out_indices, absolute_transform, parent, root_node); } + + bone_map.clear(); } - catch (std::exception&) { + catch (std::exception&e) { std::for_each(bones.begin(), bones.end(), Util::delete_fun<aiBone>()); throw; } if (bones.empty()) { + out->mBones = nullptr; + out->mNumBones = 0; return; - } - - out->mBones = new aiBone*[bones.size()](); - out->mNumBones = static_cast<unsigned int>(bones.size()); + } else { + out->mBones = new aiBone *[bones.size()](); + out->mNumBones = static_cast<unsigned int>(bones.size()); - std::swap_ranges(bones.begin(), bones.end(), out->mBones); + std::swap_ranges(bones.begin(), bones.end(), out->mBones); + } } - void FBXConverter::ConvertCluster(std::vector<aiBone*>& bones, const Model& /*model*/, const Cluster& cl, - std::vector<size_t>& out_indices, - std::vector<size_t>& index_out_indices, - std::vector<size_t>& count_out_indices, - const aiMatrix4x4& node_global_transform) + const aiNode* FBXConverter::GetNodeByName( const aiString& name, aiNode *current_node ) { + aiNode * iter = current_node; + //printf("Child count: %d", iter->mNumChildren); + return iter; + } - aiBone* const bone = new aiBone(); - bones.push_back(bone); + void FBXConverter::ConvertCluster(std::vector<aiBone *> &local_mesh_bones, const Cluster *cl, + std::vector<size_t> &out_indices, std::vector<size_t> &index_out_indices, + std::vector<size_t> &count_out_indices, const aiMatrix4x4 &absolute_transform, + aiNode *parent, aiNode *root_node) { + ai_assert(cl); // make sure cluster valid + std::string deformer_name = cl->TargetNode()->Name(); + aiString bone_name = aiString(FixNodeName(deformer_name)); - bone->mName = FixNodeName(cl.TargetNode()->Name()); + aiBone *bone = nullptr; - bone->mOffsetMatrix = cl.TransformLink(); - bone->mOffsetMatrix.Inverse(); + if (bone_map.count(deformer_name)) { + std::cout << "retrieved bone from lookup " << bone_name.C_Str() << ". Deformer: " << deformer_name + << std::endl; + bone = bone_map[deformer_name]; + } else { + std::cout << "created new bone " << bone_name.C_Str() << ". Deformer: " << deformer_name << std::endl; + bone = new aiBone(); + bone->mName = bone_name; - bone->mOffsetMatrix = bone->mOffsetMatrix * node_global_transform; + // store local transform link for post processing + bone->mOffsetMatrix = cl->TransformLink(); + bone->mOffsetMatrix.Inverse(); - bone->mNumWeights = static_cast<unsigned int>(out_indices.size()); - aiVertexWeight* cursor = bone->mWeights = new aiVertexWeight[out_indices.size()]; + aiMatrix4x4 matrix = (aiMatrix4x4)absolute_transform; - const size_t no_index_sentinel = std::numeric_limits<size_t>::max(); - const WeightArray& weights = cl.GetWeights(); + bone->mOffsetMatrix = bone->mOffsetMatrix * matrix; // * mesh_offset - const size_t c = index_out_indices.size(); - for (size_t i = 0; i < c; ++i) { - const size_t index_index = index_out_indices[i]; - if (index_index == no_index_sentinel) { - continue; - } + // + // Now calculate the aiVertexWeights + // + + aiVertexWeight *cursor = nullptr; + + bone->mNumWeights = static_cast<unsigned int>(out_indices.size()); + cursor = bone->mWeights = new aiVertexWeight[out_indices.size()]; - const size_t cc = count_out_indices[i]; - for (size_t j = 0; j < cc; ++j) { - aiVertexWeight& out_weight = *cursor++; + const size_t no_index_sentinel = std::numeric_limits<size_t>::max(); + const WeightArray& weights = cl->GetWeights(); - out_weight.mVertexId = static_cast<unsigned int>(out_indices[index_index + j]); - out_weight.mWeight = weights[i]; + const size_t c = index_out_indices.size(); + for (size_t i = 0; i < c; ++i) { + const size_t index_index = index_out_indices[i]; + + if (index_index == no_index_sentinel) { + continue; + } + + const size_t cc = count_out_indices[i]; + for (size_t j = 0; j < cc; ++j) { + // cursor runs from first element relative to the start + // or relative to the start of the next indexes. + aiVertexWeight& out_weight = *cursor++; + + out_weight.mVertexId = static_cast<unsigned int>(out_indices[index_index + j]); + out_weight.mWeight = weights[i]; + } } + + bone_map.insert(std::pair<const std::string, aiBone *>(deformer_name, bone)); } + + std::cout << "bone research: Indicies size: " << out_indices.size() << std::endl; + + // lookup must be populated in case something goes wrong + // this also allocates bones to mesh instance outside + local_mesh_bones.push_back(bone); } void FBXConverter::ConvertMaterialForMesh(aiMesh* out, const Model& model, const MeshGeometry& geo, @@ -1710,7 +1782,7 @@ namespace Assimp { bool textureReady = false; //tells if our texture is ready (if it was loaded or if it was found) unsigned int index; - VideoMap::const_iterator it = textures_converted.find(media); + VideoMap::const_iterator it = textures_converted.find(*media); if (it != textures_converted.end()) { index = (*it).second; textureReady = true; @@ -1718,7 +1790,7 @@ namespace Assimp { else { if (media->ContentLength() > 0) { index = ConvertVideo(*media); - textures_converted[media] = index; + textures_converted[*media] = index; textureReady = true; } } @@ -2242,13 +2314,13 @@ void FBXConverter::SetShadingPropertiesRaw(aiMaterial* out_mat, const PropertyTa if (media != nullptr && media->ContentLength() > 0) { unsigned int index; - VideoMap::const_iterator it = textures_converted.find(media); + VideoMap::const_iterator it = textures_converted.find(*media); if (it != textures_converted.end()) { index = (*it).second; } else { index = ConvertVideo(*media); - textures_converted[media] = index; + textures_converted[*media] = index; } // setup texture reference string (copied from ColladaLoader::FindFilenameForEffectTexture) @@ -2676,7 +2748,7 @@ void FBXConverter::SetShadingPropertiesRaw(aiMaterial* out_mat, const PropertyTa // sanity check whether the input is ok static void validateAnimCurveNodes(const std::vector<const AnimationCurveNode*>& curves, bool strictMode) { - const Object* target(NULL); + const Object* target(nullptr); for (const AnimationCurveNode* node : curves) { if (!target) { target = node->Target(); @@ -2707,7 +2779,7 @@ void FBXConverter::SetShadingPropertiesRaw(aiMaterial* out_mat, const PropertyTa #ifdef ASSIMP_BUILD_DEBUG validateAnimCurveNodes(curves, doc.Settings().strictMode); #endif - const AnimationCurveNode* curve_node = NULL; + const AnimationCurveNode* curve_node = nullptr; for (const AnimationCurveNode* node : curves) { ai_assert(node); @@ -2967,7 +3039,7 @@ void FBXConverter::SetShadingPropertiesRaw(aiMaterial* out_mat, const PropertyTa TransformationCompDefaultValue(comp) ); - const float epsilon = 1e-6f; + const float epsilon = Math::getEpsilon<float>(); return (dyn_val - static_val).SquareLength() < epsilon; } @@ -3555,7 +3627,7 @@ void FBXConverter::SetShadingPropertiesRaw(aiMaterial* out_mat, const PropertyTa ai_assert(!out->mMeshes); ai_assert(!out->mNumMeshes); - // note: the trailing () ensures initialization with NULL - not + // note: the trailing () ensures initialization with nullptr - not // many C++ users seem to know this, so pointing it out to avoid // confusion why this code works. @@ -3602,6 +3674,47 @@ void FBXConverter::SetShadingPropertiesRaw(aiMaterial* out_mat, const PropertyTa } } + void FBXConverter::ConvertOrphantEmbeddedTextures() + { + // in C++14 it could be: + // for (auto&& [id, object] : objects) + for (auto&& id_and_object : doc.Objects()) + { + auto&& id = std::get<0>(id_and_object); + auto&& object = std::get<1>(id_and_object); + // If an object doesn't have parent + if (doc.ConnectionsBySource().count(id) == 0) + { + const Texture* realTexture = nullptr; + try + { + const auto& element = object->GetElement(); + const Token& key = element.KeyToken(); + const char* obtype = key.begin(); + const size_t length = static_cast<size_t>(key.end() - key.begin()); + if (strncmp(obtype, "Texture", length) == 0) + { + const Texture* texture = static_cast<const Texture*>(object->Get()); + if (texture->Media() && texture->Media()->ContentLength() > 0) + { + realTexture = texture; + } + } + } + catch (...) + { + // do nothing + } + if (realTexture) + { + const Video* media = realTexture->Media(); + unsigned int index = ConvertVideo(*media); + textures_converted[*media] = index; + } + } + } + } + // ------------------------------------------------------------------------------------------------ void ConvertToAssimpScene(aiScene* out, const Document& doc, bool removeEmptyBones) { diff --git a/thirdparty/assimp/code/FBX/FBXConverter.h b/thirdparty/assimp/code/FBX/FBXConverter.h index ab610058a4..46693bdca6 100644 --- a/thirdparty/assimp/code/FBX/FBXConverter.h +++ b/thirdparty/assimp/code/FBX/FBXConverter.h @@ -76,16 +76,6 @@ namespace Assimp { namespace FBX { class Document; - -enum class FbxUnit { - cm = 0, - m, - km, - NumUnits, - - Undefined -}; - /** * Convert a FBX #Document to #aiScene * @param out Empty scene to be populated @@ -133,7 +123,7 @@ private: // ------------------------------------------------------------------------------------------------ // collect and assign child nodes - void ConvertNodes(uint64_t id, aiNode& parent, const aiMatrix4x4& parent_transform = aiMatrix4x4()); + void ConvertNodes(uint64_t id, aiNode *parent, aiNode *root_node); // ------------------------------------------------------------------------------------------------ void ConvertLights(const Model& model, const std::string &orig_name ); @@ -189,32 +179,35 @@ private: void SetupNodeMetadata(const Model& model, aiNode& nd); // ------------------------------------------------------------------------------------------------ - void ConvertModel(const Model& model, aiNode& nd, const aiMatrix4x4& node_global_transform); + void ConvertModel(const Model &model, aiNode *parent, aiNode *root_node, + const aiMatrix4x4 &absolute_transform); // ------------------------------------------------------------------------------------------------ // MeshGeometry -> aiMesh, return mesh index + 1 or 0 if the conversion failed - std::vector<unsigned int> ConvertMesh(const MeshGeometry& mesh, const Model& model, - const aiMatrix4x4& node_global_transform, aiNode& nd); + std::vector<unsigned int> + ConvertMesh(const MeshGeometry &mesh, const Model &model, aiNode *parent, aiNode *root_node, + const aiMatrix4x4 &absolute_transform); // ------------------------------------------------------------------------------------------------ std::vector<unsigned int> ConvertLine(const LineGeometry& line, const Model& model, - const aiMatrix4x4& node_global_transform, aiNode& nd); + aiNode *parent, aiNode *root_node); // ------------------------------------------------------------------------------------------------ - aiMesh* SetupEmptyMesh(const Geometry& mesh, aiNode& nd); + aiMesh* SetupEmptyMesh(const Geometry& mesh, aiNode *parent); // ------------------------------------------------------------------------------------------------ - unsigned int ConvertMeshSingleMaterial(const MeshGeometry& mesh, const Model& model, - const aiMatrix4x4& node_global_transform, aiNode& nd); + unsigned int ConvertMeshSingleMaterial(const MeshGeometry &mesh, const Model &model, + const aiMatrix4x4 &absolute_transform, aiNode *parent, + aiNode *root_node); // ------------------------------------------------------------------------------------------------ - std::vector<unsigned int> ConvertMeshMultiMaterial(const MeshGeometry& mesh, const Model& model, - const aiMatrix4x4& node_global_transform, aiNode& nd); + std::vector<unsigned int> + ConvertMeshMultiMaterial(const MeshGeometry &mesh, const Model &model, aiNode *parent, aiNode *root_node, + const aiMatrix4x4 &absolute_transform); // ------------------------------------------------------------------------------------------------ - unsigned int ConvertMeshMultiMaterial(const MeshGeometry& mesh, const Model& model, - MatIndexArray::value_type index, - const aiMatrix4x4& node_global_transform, aiNode& nd); + unsigned int ConvertMeshMultiMaterial(const MeshGeometry &mesh, const Model &model, MatIndexArray::value_type index, + aiNode *parent, aiNode *root_node, const aiMatrix4x4 &absolute_transform); // ------------------------------------------------------------------------------------------------ static const unsigned int NO_MATERIAL_SEPARATION = /* std::numeric_limits<unsigned int>::max() */ @@ -227,17 +220,17 @@ private: * - outputVertStartIndices is only used when a material index is specified, it gives for * each output vertex the DOM index it maps to. */ - void ConvertWeights(aiMesh* out, const Model& model, const MeshGeometry& geo, - const aiMatrix4x4& node_global_transform = aiMatrix4x4(), - unsigned int materialIndex = NO_MATERIAL_SEPARATION, - std::vector<unsigned int>* outputVertStartIndices = NULL); - + void ConvertWeights(aiMesh *out, const Model &model, const MeshGeometry &geo, const aiMatrix4x4 &absolute_transform, + aiNode *parent = NULL, aiNode *root_node = NULL, + unsigned int materialIndex = NO_MATERIAL_SEPARATION, + std::vector<unsigned int> *outputVertStartIndices = NULL); + // lookup + static const aiNode* GetNodeByName( const aiString& name, aiNode *current_node ); // ------------------------------------------------------------------------------------------------ - void ConvertCluster(std::vector<aiBone*>& bones, const Model& /*model*/, const Cluster& cl, - std::vector<size_t>& out_indices, - std::vector<size_t>& index_out_indices, - std::vector<size_t>& count_out_indices, - const aiMatrix4x4& node_global_transform); + void ConvertCluster(std::vector<aiBone *> &local_mesh_bones, const Cluster *cl, + std::vector<size_t> &out_indices, std::vector<size_t> &index_out_indices, + std::vector<size_t> &count_out_indices, const aiMatrix4x4 &absolute_transform, + aiNode *parent, aiNode *root_node); // ------------------------------------------------------------------------------------------------ void ConvertMaterialForMesh(aiMesh* out, const Model& model, const MeshGeometry& geo, @@ -434,6 +427,10 @@ private: // copy generated meshes, animations, lights, cameras and textures to the output scene void TransferDataToScene(); + // ------------------------------------------------------------------------------------------------ + // FBX file could have embedded textures not connected to anything + void ConvertOrphantEmbeddedTextures(); + private: // 0: not assigned yet, others: index is value - 1 unsigned int defaultMaterialIndex; @@ -445,28 +442,47 @@ private: std::vector<aiCamera*> cameras; std::vector<aiTexture*> textures; - using MaterialMap = std::map<const Material*, unsigned int>; + using MaterialMap = std::fbx_unordered_map<const Material*, unsigned int>; MaterialMap materials_converted; - using VideoMap = std::map<const Video*, unsigned int>; + using VideoMap = std::fbx_unordered_map<const Video, unsigned int>; VideoMap textures_converted; - using MeshMap = std::map<const Geometry*, std::vector<unsigned int> >; + using MeshMap = std::fbx_unordered_map<const Geometry*, std::vector<unsigned int> >; MeshMap meshes_converted; // fixed node name -> which trafo chain components have animations? - using NodeAnimBitMap = std::map<std::string, unsigned int> ; + using NodeAnimBitMap = std::fbx_unordered_map<std::string, unsigned int> ; NodeAnimBitMap node_anim_chain_bits; // number of nodes with the same name - using NodeNameCache = std::unordered_map<std::string, unsigned int>; + using NodeNameCache = std::fbx_unordered_map<std::string, unsigned int>; NodeNameCache mNodeNames; + // Deformer name is not the same as a bone name - it does contain the bone name though :) + // Deformer names in FBX are always unique in an FBX file. + std::map<const std::string, aiBone *> bone_map; + double anim_fps; aiScene* const out; const FBX::Document& doc; - FbxUnit mCurrentUnit; + + static void BuildBoneList(aiNode *current_node, const aiNode *root_node, const aiScene *scene, + std::vector<aiBone*>& bones); + + void BuildBoneStack(aiNode *current_node, const aiNode *root_node, const aiScene *scene, + const std::vector<aiBone *> &bones, + std::map<aiBone *, aiNode *> &bone_stack, + std::vector<aiNode*> &node_stack ); + + static void BuildNodeList(aiNode *current_node, std::vector<aiNode *> &nodes); + + static aiNode *GetNodeFromStack(const aiString &node_name, std::vector<aiNode *> &nodes); + + static aiNode *GetArmatureRoot(aiNode *bone_node, std::vector<aiBone*> &bone_list); + + static bool IsBoneNode(const aiString &bone_name, std::vector<aiBone *> &bones); }; } diff --git a/thirdparty/assimp/code/FBX/FBXDocument.h b/thirdparty/assimp/code/FBX/FBXDocument.h index 18e5c38f13..a60d7d9efa 100644 --- a/thirdparty/assimp/code/FBX/FBXDocument.h +++ b/thirdparty/assimp/code/FBX/FBXDocument.h @@ -637,6 +637,20 @@ public: return ptr; } + bool operator==(const Video& other) const + { + return ( + type == other.type + && relativeFileName == other.relativeFileName + && fileName == other.fileName + ); + } + + bool operator<(const Video& other) const + { + return std::tie(type, relativeFileName, fileName) < std::tie(other.type, other.relativeFileName, other.fileName); + } + private: std::string type; std::string relativeFileName; @@ -1005,10 +1019,10 @@ public: // during their entire lifetime (Document). FBX files have // up to many thousands of objects (most of which we never use), // so the memory overhead for them should be kept at a minimum. -typedef std::map<uint64_t, LazyObject*> ObjectMap; +typedef std::fbx_unordered_map<uint64_t, LazyObject*> ObjectMap; typedef std::fbx_unordered_map<std::string, std::shared_ptr<const PropertyTable> > PropertyTemplateMap; -typedef std::multimap<uint64_t, const Connection*> ConnectionMap; +typedef std::fbx_unordered_multimap<uint64_t, const Connection*> ConnectionMap; /** DOM class for global document settings, a single instance per document can * be accessed via Document.Globals(). */ @@ -1177,4 +1191,25 @@ private: } // Namespace FBX } // Namespace Assimp +namespace std +{ + template <> + struct hash<const Assimp::FBX::Video> + { + std::size_t operator()(const Assimp::FBX::Video& video) const + { + using std::size_t; + using std::hash; + using std::string; + + size_t res = 17; + res = res * 31 + hash<string>()(video.Name()); + res = res * 31 + hash<string>()(video.RelativeFilename()); + res = res * 31 + hash<string>()(video.Type()); + + return res; + } + }; +} + #endif // INCLUDED_AI_FBX_DOCUMENT_H diff --git a/thirdparty/assimp/code/FBX/FBXExportProperty.cpp b/thirdparty/assimp/code/FBX/FBXExportProperty.cpp index f8593e6295..f2a63b72b9 100644 --- a/thirdparty/assimp/code/FBX/FBXExportProperty.cpp +++ b/thirdparty/assimp/code/FBX/FBXExportProperty.cpp @@ -59,11 +59,7 @@ namespace FBX { FBXExportProperty::FBXExportProperty(bool v) : type('C') -, data(1) { - data = { - uint8_t(v) - }; -} +, data(1, uint8_t(v)) {} FBXExportProperty::FBXExportProperty(int16_t v) : type('Y') diff --git a/thirdparty/assimp/code/FBX/FBXExporter.cpp b/thirdparty/assimp/code/FBX/FBXExporter.cpp index 8ebc8555a2..9316dc4f02 100644 --- a/thirdparty/assimp/code/FBX/FBXExporter.cpp +++ b/thirdparty/assimp/code/FBX/FBXExporter.cpp @@ -67,6 +67,7 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #include <vector> #include <array> #include <unordered_set> +#include <numeric> // RESOURCES: // https://code.blender.org/2013/08/fbx-binary-file-format-specification/ @@ -1005,6 +1006,9 @@ void FBXExporter::WriteObjects () object_node.EndProperties(outstream, binary, indent); object_node.BeginChildren(outstream, binary, indent); + bool bJoinIdenticalVertices = mProperties->GetPropertyBool("bJoinIdenticalVertices", true); + std::vector<std::vector<int32_t>> vVertexIndice;//save vertex_indices as it is needed later + // geometry (aiMesh) mesh_uids.clear(); indent = 1; @@ -1031,21 +1035,35 @@ void FBXExporter::WriteObjects () std::vector<int32_t> vertex_indices; // map of vertex value to its index in the data vector std::map<aiVector3D,size_t> index_by_vertex_value; - int32_t index = 0; - for (size_t vi = 0; vi < m->mNumVertices; ++vi) { - aiVector3D vtx = m->mVertices[vi]; - auto elem = index_by_vertex_value.find(vtx); - if (elem == index_by_vertex_value.end()) { - vertex_indices.push_back(index); - index_by_vertex_value[vtx] = index; - flattened_vertices.push_back(vtx[0]); - flattened_vertices.push_back(vtx[1]); - flattened_vertices.push_back(vtx[2]); - ++index; - } else { - vertex_indices.push_back(int32_t(elem->second)); + if(bJoinIdenticalVertices){ + int32_t index = 0; + for (size_t vi = 0; vi < m->mNumVertices; ++vi) { + aiVector3D vtx = m->mVertices[vi]; + auto elem = index_by_vertex_value.find(vtx); + if (elem == index_by_vertex_value.end()) { + vertex_indices.push_back(index); + index_by_vertex_value[vtx] = index; + flattened_vertices.push_back(vtx[0]); + flattened_vertices.push_back(vtx[1]); + flattened_vertices.push_back(vtx[2]); + ++index; + } else { + vertex_indices.push_back(int32_t(elem->second)); + } + } + } + else { // do not join vertex, respect the export flag + vertex_indices.resize(m->mNumVertices); + std::iota(vertex_indices.begin(), vertex_indices.end(), 0); + for(unsigned int v = 0; v < m->mNumVertices; ++ v) { + aiVector3D vtx = m->mVertices[v]; + flattened_vertices.push_back(vtx.x); + flattened_vertices.push_back(vtx.y); + flattened_vertices.push_back(vtx.z); } } + vVertexIndice.push_back(vertex_indices); + FBX::Node::WritePropertyNode( "Vertices", flattened_vertices, outstream, binary, indent ); @@ -1116,6 +1134,51 @@ void FBXExporter::WriteObjects () normals.End(outstream, binary, indent, true); } + // colors, if any + // TODO only one color channel currently + const int32_t colorChannelIndex = 0; + if (m->HasVertexColors(colorChannelIndex)) { + FBX::Node vertexcolors("LayerElementColor", int32_t(colorChannelIndex)); + vertexcolors.Begin(outstream, binary, indent); + vertexcolors.DumpProperties(outstream, binary, indent); + vertexcolors.EndProperties(outstream, binary, indent); + vertexcolors.BeginChildren(outstream, binary, indent); + indent = 3; + FBX::Node::WritePropertyNode( + "Version", int32_t(101), outstream, binary, indent + ); + char layerName[8]; + sprintf(layerName, "COLOR_%d", colorChannelIndex); + FBX::Node::WritePropertyNode( + "Name", (const char*)layerName, outstream, binary, indent + ); + FBX::Node::WritePropertyNode( + "MappingInformationType", "ByPolygonVertex", + outstream, binary, indent + ); + FBX::Node::WritePropertyNode( + "ReferenceInformationType", "Direct", + outstream, binary, indent + ); + std::vector<double> color_data; + color_data.reserve(4 * polygon_data.size()); + for (size_t fi = 0; fi < m->mNumFaces; ++fi) { + const aiFace &f = m->mFaces[fi]; + for (size_t pvi = 0; pvi < f.mNumIndices; ++pvi) { + const aiColor4D &c = m->mColors[colorChannelIndex][f.mIndices[pvi]]; + color_data.push_back(c.r); + color_data.push_back(c.g); + color_data.push_back(c.b); + color_data.push_back(c.a); + } + } + FBX::Node::WritePropertyNode( + "Colors", color_data, outstream, binary, indent + ); + indent = 2; + vertexcolors.End(outstream, binary, indent, true); + } + // uvs, if any for (size_t uvi = 0; uvi < m->GetNumUVChannels(); ++uvi) { if (m->mNumUVComponents[uvi] > 2) { @@ -1209,6 +1272,11 @@ void FBXExporter::WriteObjects () le.AddChild("Type", "LayerElementNormal"); le.AddChild("TypedIndex", int32_t(0)); layer.AddChild(le); + // TODO only 1 color channel currently + le = FBX::Node("LayerElement"); + le.AddChild("Type", "LayerElementColor"); + le.AddChild("TypedIndex", int32_t(0)); + layer.AddChild(le); le = FBX::Node("LayerElement"); le.AddChild("Type", "LayerElementMaterial"); le.AddChild("TypedIndex", int32_t(0)); @@ -1221,7 +1289,7 @@ void FBXExporter::WriteObjects () for(unsigned int lr = 1; lr < m->GetNumUVChannels(); ++ lr) { - FBX::Node layerExtra("Layer", int32_t(1)); + FBX::Node layerExtra("Layer", int32_t(lr)); layerExtra.AddChild("Version", int32_t(100)); FBX::Node leExtra("LayerElement"); leExtra.AddChild("Type", "LayerElementUV"); @@ -1748,28 +1816,8 @@ void FBXExporter::WriteObjects () // connect it connections.emplace_back("C", "OO", deformer_uid, mesh_uids[mi]); - // we will be indexing by vertex... - // but there might be a different number of "vertices" - // between assimp and our output FBX. - // this code is cut-and-pasted from the geometry section above... - // ideally this should not be so. - // --- - // index of original vertex in vertex data vector - std::vector<int32_t> vertex_indices; - // map of vertex value to its index in the data vector - std::map<aiVector3D,size_t> index_by_vertex_value; - int32_t index = 0; - for (size_t vi = 0; vi < m->mNumVertices; ++vi) { - aiVector3D vtx = m->mVertices[vi]; - auto elem = index_by_vertex_value.find(vtx); - if (elem == index_by_vertex_value.end()) { - vertex_indices.push_back(index); - index_by_vertex_value[vtx] = index; - ++index; - } else { - vertex_indices.push_back(int32_t(elem->second)); - } - } + //computed before + std::vector<int32_t>& vertex_indices = vVertexIndice[mi]; // TODO, FIXME: this won't work if anything is not in the bind pose. // for now if such a situation is detected, we throw an exception. @@ -2435,7 +2483,7 @@ void FBXExporter::WriteModelNodes( void FBXExporter::WriteAnimationCurveNode( StreamWriterLE& outstream, int64_t uid, - std::string name, // "T", "R", or "S" + const std::string& name, // "T", "R", or "S" aiVector3D default_value, std::string property_name, // "Lcl Translation" etc int64_t layer_uid, diff --git a/thirdparty/assimp/code/FBX/FBXExporter.h b/thirdparty/assimp/code/FBX/FBXExporter.h index 71fb55c57f..1ae727eda9 100644 --- a/thirdparty/assimp/code/FBX/FBXExporter.h +++ b/thirdparty/assimp/code/FBX/FBXExporter.h @@ -156,7 +156,7 @@ namespace Assimp void WriteAnimationCurveNode( StreamWriterLE& outstream, int64_t uid, - std::string name, // "T", "R", or "S" + const std::string& name, // "T", "R", or "S" aiVector3D default_value, std::string property_name, // "Lcl Translation" etc int64_t animation_layer_uid, diff --git a/thirdparty/assimp/code/FBX/FBXImporter.cpp b/thirdparty/assimp/code/FBX/FBXImporter.cpp index 271935a568..afcc1ddc78 100644 --- a/thirdparty/assimp/code/FBX/FBXImporter.cpp +++ b/thirdparty/assimp/code/FBX/FBXImporter.cpp @@ -48,26 +48,26 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #include "FBXImporter.h" -#include "FBXTokenizer.h" +#include "FBXConverter.h" +#include "FBXDocument.h" #include "FBXParser.h" +#include "FBXTokenizer.h" #include "FBXUtil.h" -#include "FBXDocument.h" -#include "FBXConverter.h" -#include <assimp/StreamReader.h> #include <assimp/MemoryIOWrapper.h> -#include <assimp/Importer.hpp> +#include <assimp/StreamReader.h> #include <assimp/importerdesc.h> +#include <assimp/Importer.hpp> namespace Assimp { -template<> -const char* LogFunctions<FBXImporter>::Prefix() { - static auto prefix = "FBX: "; - return prefix; +template <> +const char *LogFunctions<FBXImporter>::Prefix() { + static auto prefix = "FBX: "; + return prefix; } -} +} // namespace Assimp using namespace Assimp; using namespace Assimp::Formatter; @@ -76,136 +76,123 @@ using namespace Assimp::FBX; namespace { static const aiImporterDesc desc = { - "Autodesk FBX Importer", - "", - "", - "", - aiImporterFlags_SupportTextFlavour, - 0, - 0, - 0, - 0, - "fbx" + "Autodesk FBX Importer", + "", + "", + "", + aiImporterFlags_SupportTextFlavour, + 0, + 0, + 0, + 0, + "fbx" }; } // ------------------------------------------------------------------------------------------------ // Constructor to be privately used by #Importer -FBXImporter::FBXImporter() -{ +FBXImporter::FBXImporter() { } // ------------------------------------------------------------------------------------------------ // Destructor, private as well -FBXImporter::~FBXImporter() -{ +FBXImporter::~FBXImporter() { } // ------------------------------------------------------------------------------------------------ // Returns whether the class can handle the format of the given file. -bool FBXImporter::CanRead( const std::string& pFile, IOSystem* pIOHandler, bool checkSig) const -{ - const std::string& extension = GetExtension(pFile); - if (extension == std::string( desc.mFileExtensions ) ) { - return true; - } - - else if ((!extension.length() || checkSig) && pIOHandler) { - // at least ASCII-FBX files usually have a 'FBX' somewhere in their head - const char* tokens[] = {"fbx"}; - return SearchFileHeaderForToken(pIOHandler,pFile,tokens,1); - } - return false; +bool FBXImporter::CanRead(const std::string &pFile, IOSystem *pIOHandler, bool checkSig) const { + const std::string &extension = GetExtension(pFile); + if (extension == std::string(desc.mFileExtensions)) { + return true; + } + + else if ((!extension.length() || checkSig) && pIOHandler) { + // at least ASCII-FBX files usually have a 'FBX' somewhere in their head + const char *tokens[] = { "fbx" }; + return SearchFileHeaderForToken(pIOHandler, pFile, tokens, 1); + } + return false; } // ------------------------------------------------------------------------------------------------ // List all extensions handled by this loader -const aiImporterDesc* FBXImporter::GetInfo () const -{ - return &desc; +const aiImporterDesc *FBXImporter::GetInfo() const { + return &desc; } // ------------------------------------------------------------------------------------------------ // Setup configuration properties for the loader -void FBXImporter::SetupProperties(const Importer* pImp) -{ - settings.readAllLayers = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_READ_ALL_GEOMETRY_LAYERS, true); - settings.readAllMaterials = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_READ_ALL_MATERIALS, false); - settings.readMaterials = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_READ_MATERIALS, true); - settings.readTextures = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_READ_TEXTURES, true); - settings.readCameras = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_READ_CAMERAS, true); - settings.readLights = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_READ_LIGHTS, true); - settings.readAnimations = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_READ_ANIMATIONS, true); - settings.strictMode = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_STRICT_MODE, false); - settings.preservePivots = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_PRESERVE_PIVOTS, true); - settings.optimizeEmptyAnimationCurves = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_OPTIMIZE_EMPTY_ANIMATION_CURVES, true); - settings.useLegacyEmbeddedTextureNaming = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_EMBEDDED_TEXTURES_LEGACY_NAMING, false); - settings.removeEmptyBones = pImp->GetPropertyBool(AI_CONFIG_IMPORT_REMOVE_EMPTY_BONES, true); - settings.convertToMeters = pImp->GetPropertyBool(AI_CONFIG_FBX_CONVERT_TO_M, false); +void FBXImporter::SetupProperties(const Importer *pImp) { + settings.readAllLayers = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_READ_ALL_GEOMETRY_LAYERS, true); + settings.readAllMaterials = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_READ_ALL_MATERIALS, false); + settings.readMaterials = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_READ_MATERIALS, true); + settings.readTextures = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_READ_TEXTURES, true); + settings.readCameras = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_READ_CAMERAS, true); + settings.readLights = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_READ_LIGHTS, true); + settings.readAnimations = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_READ_ANIMATIONS, true); + settings.strictMode = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_STRICT_MODE, false); + settings.preservePivots = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_PRESERVE_PIVOTS, true); + settings.optimizeEmptyAnimationCurves = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_OPTIMIZE_EMPTY_ANIMATION_CURVES, true); + settings.useLegacyEmbeddedTextureNaming = pImp->GetPropertyBool(AI_CONFIG_IMPORT_FBX_EMBEDDED_TEXTURES_LEGACY_NAMING, false); + settings.removeEmptyBones = pImp->GetPropertyBool(AI_CONFIG_IMPORT_REMOVE_EMPTY_BONES, true); + settings.convertToMeters = pImp->GetPropertyBool(AI_CONFIG_FBX_CONVERT_TO_M, false); } // ------------------------------------------------------------------------------------------------ // Imports the given file into the given scene structure. -void FBXImporter::InternReadFile( const std::string& pFile, aiScene* pScene, IOSystem* pIOHandler) -{ - std::unique_ptr<IOStream> stream(pIOHandler->Open(pFile,"rb")); - if (!stream) { - ThrowException("Could not open file for reading"); - } - - // read entire file into memory - no streaming for this, fbx - // files can grow large, but the assimp output data structure - // then becomes very large, too. Assimp doesn't support - // streaming for its output data structures so the net win with - // streaming input data would be very low. - std::vector<char> contents; - contents.resize(stream->FileSize()+1); - stream->Read( &*contents.begin(), 1, contents.size()-1 ); - contents[ contents.size() - 1 ] = 0; - const char* const begin = &*contents.begin(); - - // broadphase tokenizing pass in which we identify the core - // syntax elements of FBX (brackets, commas, key:value mappings) - TokenList tokens; - try { - - bool is_binary = false; - if (!strncmp(begin,"Kaydara FBX Binary",18)) { - is_binary = true; - TokenizeBinary(tokens,begin,contents.size()); - } - else { - Tokenize(tokens,begin); - } - - // use this information to construct a very rudimentary - // parse-tree representing the FBX scope structure - Parser parser(tokens, is_binary); - - // take the raw parse-tree and convert it to a FBX DOM - Document doc(parser,settings); - - FbxUnit unit(FbxUnit::cm); - if (settings.convertToMeters) { - unit = FbxUnit::m; - } - - // convert the FBX DOM to aiScene - ConvertToAssimpScene(pScene, doc, settings.removeEmptyBones); - - // size relative to cm - float size_relative_to_cm = doc.GlobalSettings().UnitScaleFactor(); - - // Set FBX file scale is relative to CM must be converted to M for - // assimp universal format (M) - SetFileScale( size_relative_to_cm * 0.01f); - - std::for_each(tokens.begin(),tokens.end(),Util::delete_fun<Token>()); - } - catch(std::exception&) { - std::for_each(tokens.begin(),tokens.end(),Util::delete_fun<Token>()); - throw; - } +void FBXImporter::InternReadFile(const std::string &pFile, aiScene *pScene, IOSystem *pIOHandler) { + std::unique_ptr<IOStream> stream(pIOHandler->Open(pFile, "rb")); + if (!stream) { + ThrowException("Could not open file for reading"); + } + + // read entire file into memory - no streaming for this, fbx + // files can grow large, but the assimp output data structure + // then becomes very large, too. Assimp doesn't support + // streaming for its output data structures so the net win with + // streaming input data would be very low. + std::vector<char> contents; + contents.resize(stream->FileSize() + 1); + stream->Read(&*contents.begin(), 1, contents.size() - 1); + contents[contents.size() - 1] = 0; + const char *const begin = &*contents.begin(); + + // broadphase tokenizing pass in which we identify the core + // syntax elements of FBX (brackets, commas, key:value mappings) + TokenList tokens; + try { + + bool is_binary = false; + if (!strncmp(begin, "Kaydara FBX Binary", 18)) { + is_binary = true; + TokenizeBinary(tokens, begin, contents.size()); + } else { + Tokenize(tokens, begin); + } + + // use this information to construct a very rudimentary + // parse-tree representing the FBX scope structure + Parser parser(tokens, is_binary); + + // take the raw parse-tree and convert it to a FBX DOM + Document doc(parser, settings); + + // convert the FBX DOM to aiScene + ConvertToAssimpScene(pScene, doc, settings.removeEmptyBones); + + // size relative to cm + float size_relative_to_cm = doc.GlobalSettings().UnitScaleFactor(); + + // Set FBX file scale is relative to CM must be converted to M for + // assimp universal format (M) + SetFileScale(size_relative_to_cm * 0.01f); + + std::for_each(tokens.begin(), tokens.end(), Util::delete_fun<Token>()); + } catch (std::exception &) { + std::for_each(tokens.begin(), tokens.end(), Util::delete_fun<Token>()); + throw; + } } #endif // !ASSIMP_BUILD_NO_FBX_IMPORTER diff --git a/thirdparty/assimp/code/FBX/FBXMeshGeometry.cpp b/thirdparty/assimp/code/FBX/FBXMeshGeometry.cpp index 5c9a0e309d..1386e2383c 100644 --- a/thirdparty/assimp/code/FBX/FBXMeshGeometry.cpp +++ b/thirdparty/assimp/code/FBX/FBXMeshGeometry.cpp @@ -610,11 +610,11 @@ void MeshGeometry::ReadVertexDataMaterials(std::vector<int>& materials_out, cons const std::string& ReferenceInformationType) { const size_t face_count = m_faces.size(); - if(face_count <= 0) + if( 0 == face_count ) { return; } - + // materials are handled separately. First of all, they are assigned per-face // and not per polyvert. Secondly, ReferenceInformationType=IndexToDirect // has a slightly different meaning for materials. @@ -625,16 +625,14 @@ void MeshGeometry::ReadVertexDataMaterials(std::vector<int>& materials_out, cons if (materials_out.empty()) { FBXImporter::LogError(Formatter::format("expected material index, ignoring")); return; - } - else if (materials_out.size() > 1) { + } else if (materials_out.size() > 1) { FBXImporter::LogWarn(Formatter::format("expected only a single material index, ignoring all except the first one")); materials_out.clear(); } materials_out.resize(m_vertices.size()); std::fill(materials_out.begin(), materials_out.end(), materials_out.at(0)); - } - else if (MappingInformationType == "ByPolygon" && ReferenceInformationType == "IndexToDirect") { + } else if (MappingInformationType == "ByPolygon" && ReferenceInformationType == "IndexToDirect") { materials_out.resize(face_count); if(materials_out.size() != face_count) { @@ -643,18 +641,16 @@ void MeshGeometry::ReadVertexDataMaterials(std::vector<int>& materials_out, cons ); return; } - } - else { + } else { FBXImporter::LogError(Formatter::format("ignoring material assignments, access type not implemented: ") << MappingInformationType << "," << ReferenceInformationType); } } // ------------------------------------------------------------------------------------------------ ShapeGeometry::ShapeGeometry(uint64_t id, const Element& element, const std::string& name, const Document& doc) - : Geometry(id, element, name, doc) -{ - const Scope* sc = element.Compound(); - if (!sc) { +: Geometry(id, element, name, doc) { + const Scope *sc = element.Compound(); + if (nullptr == sc) { DOMError("failed to read Geometry object (class: Shape), no data scope found"); } const Element& Indexes = GetRequiredElement(*sc, "Indexes", &element); diff --git a/thirdparty/assimp/code/MMD/MMDCpp14.h b/thirdparty/assimp/code/MMD/MMDCpp14.h deleted file mode 100644 index 638b0bfd2f..0000000000 --- a/thirdparty/assimp/code/MMD/MMDCpp14.h +++ /dev/null @@ -1,83 +0,0 @@ -/* -Open Asset Import Library (assimp) ----------------------------------------------------------------------- - -Copyright (c) 2006-2019, assimp team - - -All rights reserved. - -Redistribution and use of this software in source and binary forms, -with or without modification, are permitted provided that the -following conditions are met: - -* Redistributions of source code must retain the above -copyright notice, this list of conditions and the -following disclaimer. - -* Redistributions in binary form must reproduce the above -copyright notice, this list of conditions and the -following disclaimer in the documentation and/or other -materials provided with the distribution. - -* Neither the name of the assimp team, nor the names of its -contributors may be used to endorse or promote products -derived from this software without specific prior -written permission of the assimp team. - -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS -"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT -LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR -A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT -OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, -SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT -LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, -DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY -THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE -OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - ----------------------------------------------------------------------- -*/ -#pragma once - -#ifndef MMD_CPP14_H -#define MMD_CPP14_H - -#include <cstddef> -#include <memory> -#include <type_traits> -#include <utility> - -namespace mmd { - template<class T> struct _Unique_if { - typedef std::unique_ptr<T> _Single_object; - }; - - template<class T> struct _Unique_if<T[]> { - typedef std::unique_ptr<T[]> _Unknown_bound; - }; - - template<class T, size_t N> struct _Unique_if<T[N]> { - typedef void _Known_bound; - }; - - template<class T, class... Args> - typename _Unique_if<T>::_Single_object - make_unique(Args&&... args) { - return std::unique_ptr<T>(new T(std::forward<Args>(args)...)); - } - - template<class T> - typename _Unique_if<T>::_Unknown_bound - make_unique(size_t n) { - typedef typename std::remove_extent<T>::type U; - return std::unique_ptr<T>(new U[n]()); - } - - template<class T, class... Args> - typename _Unique_if<T>::_Known_bound - make_unique(Args&&...) = delete; -} - -#endif diff --git a/thirdparty/assimp/code/MMD/MMDImporter.cpp b/thirdparty/assimp/code/MMD/MMDImporter.cpp deleted file mode 100644 index e7744e4cd0..0000000000 --- a/thirdparty/assimp/code/MMD/MMDImporter.cpp +++ /dev/null @@ -1,372 +0,0 @@ -/* ---------------------------------------------------------------------------- -Open Asset Import Library (assimp) ---------------------------------------------------------------------------- - -Copyright (c) 2006-2016, assimp team - -All rights reserved. - -Redistribution and use of this software in source and binary forms, -with or without modification, are permitted provided that the following -conditions are met: - -* Redistributions of source code must retain the above - copyright notice, this list of conditions and the - following disclaimer. - -* Redistributions in binary form must reproduce the above - copyright notice, this list of conditions and the - following disclaimer in the documentation and/or other - materials provided with the distribution. - -* Neither the name of the assimp team, nor the names of its - contributors may be used to endorse or promote products - derived from this software without specific prior - written permission of the assimp team. - -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS -"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT -LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR -A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT -OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, -SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT -LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, -DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY -THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE -OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ---------------------------------------------------------------------------- -*/ - -#ifndef ASSIMP_BUILD_NO_MMD_IMPORTER - -#include "MMD/MMDImporter.h" -#include "MMD/MMDPmdParser.h" -#include "MMD/MMDPmxParser.h" -#include "MMD/MMDVmdParser.h" -#include "PostProcessing/ConvertToLHProcess.h" - -#include <assimp/DefaultIOSystem.h> -#include <assimp/Importer.hpp> -#include <assimp/ai_assert.h> -#include <assimp/scene.h> - -#include <fstream> -#include <iomanip> -#include <memory> - -static const aiImporterDesc desc = {"MMD Importer", - "", - "", - "surfaces supported?", - aiImporterFlags_SupportTextFlavour, - 0, - 0, - 0, - 0, - "pmx"}; - -namespace Assimp { - -using namespace std; - -// ------------------------------------------------------------------------------------------------ -// Default constructor -MMDImporter::MMDImporter() -: m_Buffer() -, m_strAbsPath("") { - DefaultIOSystem io; - m_strAbsPath = io.getOsSeparator(); -} - -// ------------------------------------------------------------------------------------------------ -// Destructor. -MMDImporter::~MMDImporter() { - // empty -} - -// ------------------------------------------------------------------------------------------------ -// Returns true, if file is an pmx file. -bool MMDImporter::CanRead(const std::string &pFile, IOSystem *pIOHandler, - bool checkSig) const { - if (!checkSig) // Check File Extension - { - return SimpleExtensionCheck(pFile, "pmx"); - } else // Check file Header - { - static const char *pTokens[] = {"PMX "}; - return BaseImporter::SearchFileHeaderForToken(pIOHandler, pFile, pTokens, 1); - } -} - -// ------------------------------------------------------------------------------------------------ -const aiImporterDesc *MMDImporter::GetInfo() const { return &desc; } - -// ------------------------------------------------------------------------------------------------ -// MMD import implementation -void MMDImporter::InternReadFile(const std::string &file, aiScene *pScene, - IOSystem * /*pIOHandler*/) { - // Read file by istream - std::filebuf fb; - if (!fb.open(file, std::ios::in | std::ios::binary)) { - throw DeadlyImportError("Failed to open file " + file + "."); - } - - std::istream fileStream(&fb); - - // Get the file-size and validate it, throwing an exception when fails - fileStream.seekg(0, fileStream.end); - size_t fileSize = static_cast<size_t>(fileStream.tellg()); - fileStream.seekg(0, fileStream.beg); - - if (fileSize < sizeof(pmx::PmxModel)) { - throw DeadlyImportError(file + " is too small."); - } - - pmx::PmxModel model; - model.Read(&fileStream); - - CreateDataFromImport(&model, pScene); -} - -// ------------------------------------------------------------------------------------------------ -void MMDImporter::CreateDataFromImport(const pmx::PmxModel *pModel, - aiScene *pScene) { - if (pModel == NULL) { - return; - } - - aiNode *pNode = new aiNode; - if (!pModel->model_name.empty()) { - pNode->mName.Set(pModel->model_name); - } - - pScene->mRootNode = pNode; - - pNode = new aiNode; - pScene->mRootNode->addChildren(1, &pNode); - pNode->mName.Set(string(pModel->model_name) + string("_mesh")); - - // split mesh by materials - pNode->mNumMeshes = pModel->material_count; - pNode->mMeshes = new unsigned int[pNode->mNumMeshes]; - for (unsigned int index = 0; index < pNode->mNumMeshes; index++) { - pNode->mMeshes[index] = index; - } - - pScene->mNumMeshes = pModel->material_count; - pScene->mMeshes = new aiMesh *[pScene->mNumMeshes]; - for (unsigned int i = 0, indexStart = 0; i < pScene->mNumMeshes; i++) { - const int indexCount = pModel->materials[i].index_count; - - pScene->mMeshes[i] = CreateMesh(pModel, indexStart, indexCount); - pScene->mMeshes[i]->mName = pModel->materials[i].material_name; - pScene->mMeshes[i]->mMaterialIndex = i; - indexStart += indexCount; - } - - // create node hierarchy for bone position - std::unique_ptr<aiNode *[]> ppNode(new aiNode *[pModel->bone_count]); - for (auto i = 0; i < pModel->bone_count; i++) { - ppNode[i] = new aiNode(pModel->bones[i].bone_name); - } - - for (auto i = 0; i < pModel->bone_count; i++) { - const pmx::PmxBone &bone = pModel->bones[i]; - - if (bone.parent_index < 0) { - pScene->mRootNode->addChildren(1, ppNode.get() + i); - } else { - ppNode[bone.parent_index]->addChildren(1, ppNode.get() + i); - - aiVector3D v3 = aiVector3D( - bone.position[0] - pModel->bones[bone.parent_index].position[0], - bone.position[1] - pModel->bones[bone.parent_index].position[1], - bone.position[2] - pModel->bones[bone.parent_index].position[2]); - aiMatrix4x4::Translation(v3, ppNode[i]->mTransformation); - } - } - - // create materials - pScene->mNumMaterials = pModel->material_count; - pScene->mMaterials = new aiMaterial *[pScene->mNumMaterials]; - for (unsigned int i = 0; i < pScene->mNumMaterials; i++) { - pScene->mMaterials[i] = CreateMaterial(&pModel->materials[i], pModel); - } - - // Convert everything to OpenGL space - MakeLeftHandedProcess convertProcess; - convertProcess.Execute(pScene); - - FlipUVsProcess uvFlipper; - uvFlipper.Execute(pScene); - - FlipWindingOrderProcess windingFlipper; - windingFlipper.Execute(pScene); -} - -// ------------------------------------------------------------------------------------------------ -aiMesh *MMDImporter::CreateMesh(const pmx::PmxModel *pModel, - const int indexStart, const int indexCount) { - aiMesh *pMesh = new aiMesh; - - pMesh->mNumVertices = indexCount; - - pMesh->mNumFaces = indexCount / 3; - pMesh->mFaces = new aiFace[pMesh->mNumFaces]; - - const int numIndices = 3; // triangular face - for (unsigned int index = 0; index < pMesh->mNumFaces; index++) { - pMesh->mFaces[index].mNumIndices = numIndices; - unsigned int *indices = new unsigned int[numIndices]; - indices[0] = numIndices * index; - indices[1] = numIndices * index + 1; - indices[2] = numIndices * index + 2; - pMesh->mFaces[index].mIndices = indices; - } - - pMesh->mVertices = new aiVector3D[pMesh->mNumVertices]; - pMesh->mNormals = new aiVector3D[pMesh->mNumVertices]; - pMesh->mTextureCoords[0] = new aiVector3D[pMesh->mNumVertices]; - pMesh->mNumUVComponents[0] = 2; - - // additional UVs - for (int i = 1; i <= pModel->setting.uv; i++) { - pMesh->mTextureCoords[i] = new aiVector3D[pMesh->mNumVertices]; - pMesh->mNumUVComponents[i] = 4; - } - - map<int, vector<aiVertexWeight>> bone_vertex_map; - - // fill in contents and create bones - for (int index = 0; index < indexCount; index++) { - const pmx::PmxVertex *v = - &pModel->vertices[pModel->indices[indexStart + index]]; - const float *position = v->position; - pMesh->mVertices[index].Set(position[0], position[1], position[2]); - const float *normal = v->normal; - - pMesh->mNormals[index].Set(normal[0], normal[1], normal[2]); - pMesh->mTextureCoords[0][index].x = v->uv[0]; - pMesh->mTextureCoords[0][index].y = v->uv[1]; - - for (int i = 1; i <= pModel->setting.uv; i++) { - // TODO: wrong here? use quaternion transform? - pMesh->mTextureCoords[i][index].x = v->uva[i][0]; - pMesh->mTextureCoords[i][index].y = v->uva[i][1]; - } - - // handle bone map - const auto vsBDEF1_ptr = - dynamic_cast<pmx::PmxVertexSkinningBDEF1 *>(v->skinning.get()); - const auto vsBDEF2_ptr = - dynamic_cast<pmx::PmxVertexSkinningBDEF2 *>(v->skinning.get()); - const auto vsBDEF4_ptr = - dynamic_cast<pmx::PmxVertexSkinningBDEF4 *>(v->skinning.get()); - const auto vsSDEF_ptr = - dynamic_cast<pmx::PmxVertexSkinningSDEF *>(v->skinning.get()); - switch (v->skinning_type) { - case pmx::PmxVertexSkinningType::BDEF1: - bone_vertex_map[vsBDEF1_ptr->bone_index].push_back( - aiVertexWeight(index, 1.0)); - break; - case pmx::PmxVertexSkinningType::BDEF2: - bone_vertex_map[vsBDEF2_ptr->bone_index1].push_back( - aiVertexWeight(index, vsBDEF2_ptr->bone_weight)); - bone_vertex_map[vsBDEF2_ptr->bone_index2].push_back( - aiVertexWeight(index, 1.0f - vsBDEF2_ptr->bone_weight)); - break; - case pmx::PmxVertexSkinningType::BDEF4: - bone_vertex_map[vsBDEF4_ptr->bone_index1].push_back( - aiVertexWeight(index, vsBDEF4_ptr->bone_weight1)); - bone_vertex_map[vsBDEF4_ptr->bone_index2].push_back( - aiVertexWeight(index, vsBDEF4_ptr->bone_weight2)); - bone_vertex_map[vsBDEF4_ptr->bone_index3].push_back( - aiVertexWeight(index, vsBDEF4_ptr->bone_weight3)); - bone_vertex_map[vsBDEF4_ptr->bone_index4].push_back( - aiVertexWeight(index, vsBDEF4_ptr->bone_weight4)); - break; - case pmx::PmxVertexSkinningType::SDEF: // TODO: how to use sdef_c, sdef_r0, - // sdef_r1? - bone_vertex_map[vsSDEF_ptr->bone_index1].push_back( - aiVertexWeight(index, vsSDEF_ptr->bone_weight)); - bone_vertex_map[vsSDEF_ptr->bone_index2].push_back( - aiVertexWeight(index, 1.0f - vsSDEF_ptr->bone_weight)); - break; - case pmx::PmxVertexSkinningType::QDEF: - const auto vsQDEF_ptr = - dynamic_cast<pmx::PmxVertexSkinningQDEF *>(v->skinning.get()); - bone_vertex_map[vsQDEF_ptr->bone_index1].push_back( - aiVertexWeight(index, vsQDEF_ptr->bone_weight1)); - bone_vertex_map[vsQDEF_ptr->bone_index2].push_back( - aiVertexWeight(index, vsQDEF_ptr->bone_weight2)); - bone_vertex_map[vsQDEF_ptr->bone_index3].push_back( - aiVertexWeight(index, vsQDEF_ptr->bone_weight3)); - bone_vertex_map[vsQDEF_ptr->bone_index4].push_back( - aiVertexWeight(index, vsQDEF_ptr->bone_weight4)); - break; - } - } - - // make all bones for each mesh - // assign bone weights to skinned bones (otherwise just initialize) - auto bone_ptr_ptr = new aiBone *[pModel->bone_count]; - pMesh->mNumBones = pModel->bone_count; - pMesh->mBones = bone_ptr_ptr; - for (auto ii = 0; ii < pModel->bone_count; ++ii) { - auto pBone = new aiBone; - const auto &pmxBone = pModel->bones[ii]; - pBone->mName = pmxBone.bone_name; - aiVector3D pos(pmxBone.position[0], pmxBone.position[1], pmxBone.position[2]); - aiMatrix4x4::Translation(-pos, pBone->mOffsetMatrix); - auto it = bone_vertex_map.find(ii); - if (it != bone_vertex_map.end()) { - pBone->mNumWeights = static_cast<unsigned int>(it->second.size()); - pBone->mWeights = new aiVertexWeight[pBone->mNumWeights]; - for (unsigned int j = 0; j < pBone->mNumWeights; j++) { - pBone->mWeights[j] = it->second[j]; - } - } - bone_ptr_ptr[ii] = pBone; - } - - return pMesh; -} - -// ------------------------------------------------------------------------------------------------ -aiMaterial *MMDImporter::CreateMaterial(const pmx::PmxMaterial *pMat, - const pmx::PmxModel *pModel) { - aiMaterial *mat = new aiMaterial(); - aiString name(pMat->material_english_name); - mat->AddProperty(&name, AI_MATKEY_NAME); - - aiColor3D diffuse(pMat->diffuse[0], pMat->diffuse[1], pMat->diffuse[2]); - mat->AddProperty(&diffuse, 1, AI_MATKEY_COLOR_DIFFUSE); - aiColor3D specular(pMat->specular[0], pMat->specular[1], pMat->specular[2]); - mat->AddProperty(&specular, 1, AI_MATKEY_COLOR_SPECULAR); - aiColor3D ambient(pMat->ambient[0], pMat->ambient[1], pMat->ambient[2]); - mat->AddProperty(&ambient, 1, AI_MATKEY_COLOR_AMBIENT); - - float opacity = pMat->diffuse[3]; - mat->AddProperty(&opacity, 1, AI_MATKEY_OPACITY); - float shininess = pMat->specularlity; - mat->AddProperty(&shininess, 1, AI_MATKEY_SHININESS_STRENGTH); - - if(pMat->diffuse_texture_index >= 0) { - aiString texture_path(pModel->textures[pMat->diffuse_texture_index]); - mat->AddProperty(&texture_path, AI_MATKEY_TEXTURE(aiTextureType_DIFFUSE, 0)); - } - - int mapping_uvwsrc = 0; - mat->AddProperty(&mapping_uvwsrc, 1, - AI_MATKEY_UVWSRC(aiTextureType_DIFFUSE, 0)); - - return mat; -} - -// ------------------------------------------------------------------------------------------------ - -} // Namespace Assimp - -#endif // !! ASSIMP_BUILD_NO_MMD_IMPORTER diff --git a/thirdparty/assimp/code/MMD/MMDImporter.h b/thirdparty/assimp/code/MMD/MMDImporter.h deleted file mode 100644 index 4ee94eeb00..0000000000 --- a/thirdparty/assimp/code/MMD/MMDImporter.h +++ /dev/null @@ -1,96 +0,0 @@ -/* -Open Asset Import Library (assimp) ----------------------------------------------------------------------- - -Copyright (c) 2006-2016, assimp team -All rights reserved. - -Redistribution and use of this software in source and binary forms, -with or without modification, are permitted provided that the -following conditions are met: - -* Redistributions of source code must retain the above - copyright notice, this list of conditions and the - following disclaimer. - -* Redistributions in binary form must reproduce the above - copyright notice, this list of conditions and the - following disclaimer in the documentation and/or other - materials provided with the distribution. - -* Neither the name of the assimp team, nor the names of its - contributors may be used to endorse or promote products - derived from this software without specific prior - written permission of the assimp team. - -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS -"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT -LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR -A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT -OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, -SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT -LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, -DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY -THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE -OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - ----------------------------------------------------------------------- -*/ -#ifndef MMD_FILE_IMPORTER_H_INC -#define MMD_FILE_IMPORTER_H_INC - -#include <assimp/BaseImporter.h> -#include "MMDPmxParser.h" -#include <assimp/material.h> -#include <vector> - -struct aiMesh; - -namespace Assimp { - -// ------------------------------------------------------------------------------------------------ -/// \class MMDImporter -/// \brief Imports MMD a pmx/pmd/vmd file -// ------------------------------------------------------------------------------------------------ -class MMDImporter : public BaseImporter { -public: - /// \brief Default constructor - MMDImporter(); - - /// \brief Destructor - ~MMDImporter(); - -public: - /// \brief Returns whether the class can handle the format of the given file. - /// \remark See BaseImporter::CanRead() for details. - bool CanRead( const std::string& pFile, IOSystem* pIOHandler, bool checkSig) const; - -private: - //! \brief Appends the supported extension. - const aiImporterDesc* GetInfo () const; - - //! \brief File import implementation. - void InternReadFile(const std::string& pFile, aiScene* pScene, IOSystem* pIOHandler); - - //! \brief Create the data from imported content. - void CreateDataFromImport(const pmx::PmxModel* pModel, aiScene* pScene); - - //! \brief Create the mesh - aiMesh* CreateMesh(const pmx::PmxModel* pModel, const int indexStart, const int indexCount); - - //! \brief Create the material - aiMaterial* CreateMaterial(const pmx::PmxMaterial* pMat, const pmx::PmxModel* pModel); - -private: - //! Data buffer - std::vector<char> m_Buffer; - //! Absolute pathname of model in file system - std::string m_strAbsPath; -}; - -// ------------------------------------------------------------------------------------------------ - -} // Namespace Assimp - -#endif
\ No newline at end of file diff --git a/thirdparty/assimp/code/MMD/MMDPmdParser.h b/thirdparty/assimp/code/MMD/MMDPmdParser.h deleted file mode 100644 index d2f2224aa1..0000000000 --- a/thirdparty/assimp/code/MMD/MMDPmdParser.h +++ /dev/null @@ -1,597 +0,0 @@ -/* -Open Asset Import Library (assimp) ----------------------------------------------------------------------- - -Copyright (c) 2006-2019, assimp team - - -All rights reserved. - -Redistribution and use of this software in source and binary forms, -with or without modification, are permitted provided that the -following conditions are met: - -* Redistributions of source code must retain the above -copyright notice, this list of conditions and the -following disclaimer. - -* Redistributions in binary form must reproduce the above -copyright notice, this list of conditions and the -following disclaimer in the documentation and/or other -materials provided with the distribution. - -* Neither the name of the assimp team, nor the names of its -contributors may be used to endorse or promote products -derived from this software without specific prior -written permission of the assimp team. - -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS -"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT -LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR -A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT -OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, -SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT -LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, -DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY -THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE -OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - ----------------------------------------------------------------------- -*/ -#pragma once - -#include <vector> -#include <string> -#include <memory> -#include <iostream> -#include <fstream> -#include "MMDCpp14.h" - -namespace pmd -{ - class PmdHeader - { - public: - std::string name; - std::string name_english; - std::string comment; - std::string comment_english; - - bool Read(std::ifstream* stream) - { - char buffer[256]; - stream->read(buffer, 20); - name = std::string(buffer); - stream->read(buffer, 256); - comment = std::string(buffer); - return true; - } - - bool ReadExtension(std::ifstream* stream) - { - char buffer[256]; - stream->read(buffer, 20); - name_english = std::string(buffer); - stream->read(buffer, 256); - comment_english = std::string(buffer); - return true; - } - }; - - class PmdVertex - { - public: - float position[3]; - - float normal[3]; - - float uv[2]; - - uint16_t bone_index[2]; - - uint8_t bone_weight; - - bool edge_invisible; - - bool Read(std::ifstream* stream) - { - stream->read((char*) position, sizeof(float) * 3); - stream->read((char*) normal, sizeof(float) * 3); - stream->read((char*) uv, sizeof(float) * 2); - stream->read((char*) bone_index, sizeof(uint16_t) * 2); - stream->read((char*) &bone_weight, sizeof(uint8_t)); - stream->read((char*) &edge_invisible, sizeof(uint8_t)); - return true; - } - }; - - class PmdMaterial - { - public: - float diffuse[4]; - float power; - float specular[3]; - float ambient[3]; - uint8_t toon_index; - uint8_t edge_flag; - uint32_t index_count; - std::string texture_filename; - std::string sphere_filename; - - bool Read(std::ifstream* stream) - { - char buffer[20]; - stream->read((char*) &diffuse, sizeof(float) * 4); - stream->read((char*) &power, sizeof(float)); - stream->read((char*) &specular, sizeof(float) * 3); - stream->read((char*) &ambient, sizeof(float) * 3); - stream->read((char*) &toon_index, sizeof(uint8_t)); - stream->read((char*) &edge_flag, sizeof(uint8_t)); - stream->read((char*) &index_count, sizeof(uint32_t)); - stream->read((char*) &buffer, sizeof(char) * 20); - char* pstar = strchr(buffer, '*'); - if (NULL == pstar) - { - texture_filename = std::string(buffer); - sphere_filename.clear(); - } - else { - *pstar = 0; - texture_filename = std::string(buffer); - sphere_filename = std::string(pstar+1); - } - return true; - } - }; - - enum class BoneType : uint8_t - { - Rotation, - RotationAndMove, - IkEffector, - Unknown, - IkEffectable, - RotationEffectable, - IkTarget, - Invisible, - Twist, - RotationMovement - }; - - class PmdBone - { - public: - std::string name; - std::string name_english; - uint16_t parent_bone_index; - uint16_t tail_pos_bone_index; - BoneType bone_type; - uint16_t ik_parent_bone_index; - float bone_head_pos[3]; - - void Read(std::istream *stream) - { - char buffer[20]; - stream->read(buffer, 20); - name = std::string(buffer); - stream->read((char*) &parent_bone_index, sizeof(uint16_t)); - stream->read((char*) &tail_pos_bone_index, sizeof(uint16_t)); - stream->read((char*) &bone_type, sizeof(uint8_t)); - stream->read((char*) &ik_parent_bone_index, sizeof(uint16_t)); - stream->read((char*) &bone_head_pos, sizeof(float) * 3); - } - - void ReadExpantion(std::istream *stream) - { - char buffer[20]; - stream->read(buffer, 20); - name_english = std::string(buffer); - } - }; - - class PmdIk - { - public: - uint16_t ik_bone_index; - uint16_t target_bone_index; - uint16_t interations; - float angle_limit; - std::vector<uint16_t> ik_child_bone_index; - - void Read(std::istream *stream) - { - stream->read((char *) &ik_bone_index, sizeof(uint16_t)); - stream->read((char *) &target_bone_index, sizeof(uint16_t)); - uint8_t ik_chain_length; - stream->read((char*) &ik_chain_length, sizeof(uint8_t)); - stream->read((char *) &interations, sizeof(uint16_t)); - stream->read((char *) &angle_limit, sizeof(float)); - ik_child_bone_index.resize(ik_chain_length); - for (int i = 0; i < ik_chain_length; i++) - { - stream->read((char *) &ik_child_bone_index[i], sizeof(uint16_t)); - } - } - }; - - class PmdFaceVertex - { - public: - int vertex_index; - float position[3]; - - void Read(std::istream *stream) - { - stream->read((char *) &vertex_index, sizeof(int)); - stream->read((char *) position, sizeof(float) * 3); - } - }; - - enum class FaceCategory : uint8_t - { - Base, - Eyebrow, - Eye, - Mouth, - Other - }; - - class PmdFace - { - public: - std::string name; - FaceCategory type; - std::vector<PmdFaceVertex> vertices; - std::string name_english; - - void Read(std::istream *stream) - { - char buffer[20]; - stream->read(buffer, 20); - name = std::string(buffer); - int vertex_count; - stream->read((char*) &vertex_count, sizeof(int)); - stream->read((char*) &type, sizeof(uint8_t)); - vertices.resize(vertex_count); - for (int i = 0; i < vertex_count; i++) - { - vertices[i].Read(stream); - } - } - - void ReadExpantion(std::istream *stream) - { - char buffer[20]; - stream->read(buffer, 20); - name_english = std::string(buffer); - } - }; - - class PmdBoneDispName - { - public: - std::string bone_disp_name; - std::string bone_disp_name_english; - - void Read(std::istream *stream) - { - char buffer[50]; - stream->read(buffer, 50); - bone_disp_name = std::string(buffer); - bone_disp_name_english.clear(); - } - void ReadExpantion(std::istream *stream) - { - char buffer[50]; - stream->read(buffer, 50); - bone_disp_name_english = std::string(buffer); - } - }; - - class PmdBoneDisp - { - public: - uint16_t bone_index; - uint8_t bone_disp_index; - - void Read(std::istream *stream) - { - stream->read((char*) &bone_index, sizeof(uint16_t)); - stream->read((char*) &bone_disp_index, sizeof(uint8_t)); - } - }; - - enum class RigidBodyShape : uint8_t - { - Sphere = 0, - Box = 1, - Cpusel = 2 - }; - - enum class RigidBodyType : uint8_t - { - BoneConnected = 0, - Physics = 1, - ConnectedPhysics = 2 - }; - - class PmdRigidBody - { - public: - std::string name; - uint16_t related_bone_index; - uint8_t group_index; - uint16_t mask; - RigidBodyShape shape; - float size[3]; - float position[3]; - float orientation[3]; - float weight; - float linear_damping; - float anglar_damping; - float restitution; - float friction; - RigidBodyType rigid_type; - - void Read(std::istream *stream) - { - char buffer[20]; - stream->read(buffer, sizeof(char) * 20); - name = (std::string(buffer)); - stream->read((char*) &related_bone_index, sizeof(uint16_t)); - stream->read((char*) &group_index, sizeof(uint8_t)); - stream->read((char*) &mask, sizeof(uint16_t)); - stream->read((char*) &shape, sizeof(uint8_t)); - stream->read((char*) size, sizeof(float) * 3); - stream->read((char*) position, sizeof(float) * 3); - stream->read((char*) orientation, sizeof(float) * 3); - stream->read((char*) &weight, sizeof(float)); - stream->read((char*) &linear_damping, sizeof(float)); - stream->read((char*) &anglar_damping, sizeof(float)); - stream->read((char*) &restitution, sizeof(float)); - stream->read((char*) &friction, sizeof(float)); - stream->read((char*) &rigid_type, sizeof(char)); - } - }; - - class PmdConstraint - { - public: - std::string name; - uint32_t rigid_body_index_a; - uint32_t rigid_body_index_b; - float position[3]; - float orientation[3]; - float linear_lower_limit[3]; - float linear_upper_limit[3]; - float angular_lower_limit[3]; - float angular_upper_limit[3]; - float linear_stiffness[3]; - float angular_stiffness[3]; - - void Read(std::istream *stream) - { - char buffer[20]; - stream->read(buffer, 20); - name = std::string(buffer); - stream->read((char *) &rigid_body_index_a, sizeof(uint32_t)); - stream->read((char *) &rigid_body_index_b, sizeof(uint32_t)); - stream->read((char *) position, sizeof(float) * 3); - stream->read((char *) orientation, sizeof(float) * 3); - stream->read((char *) linear_lower_limit, sizeof(float) * 3); - stream->read((char *) linear_upper_limit, sizeof(float) * 3); - stream->read((char *) angular_lower_limit, sizeof(float) * 3); - stream->read((char *) angular_upper_limit, sizeof(float) * 3); - stream->read((char *) linear_stiffness, sizeof(float) * 3); - stream->read((char *) angular_stiffness, sizeof(float) * 3); - } - }; - - class PmdModel - { - public: - float version; - PmdHeader header; - std::vector<PmdVertex> vertices; - std::vector<uint16_t> indices; - std::vector<PmdMaterial> materials; - std::vector<PmdBone> bones; - std::vector<PmdIk> iks; - std::vector<PmdFace> faces; - std::vector<uint16_t> faces_indices; - std::vector<PmdBoneDispName> bone_disp_name; - std::vector<PmdBoneDisp> bone_disp; - std::vector<std::string> toon_filenames; - std::vector<PmdRigidBody> rigid_bodies; - std::vector<PmdConstraint> constraints; - - static std::unique_ptr<PmdModel> LoadFromFile(const char *filename) - { - std::ifstream stream(filename, std::ios::binary); - if (stream.fail()) - { - std::cerr << "could not open \"" << filename << "\"" << std::endl; - return nullptr; - } - auto result = LoadFromStream(&stream); - stream.close(); - return result; - } - - static std::unique_ptr<PmdModel> LoadFromStream(std::ifstream *stream) - { - auto result = mmd::make_unique<PmdModel>(); - char buffer[100]; - - // magic - char magic[3]; - stream->read(magic, 3); - if (magic[0] != 'P' || magic[1] != 'm' || magic[2] != 'd') - { - std::cerr << "invalid file" << std::endl; - return nullptr; - } - - // version - stream->read((char*) &(result->version), sizeof(float)); - if (result ->version != 1.0f) - { - std::cerr << "invalid version" << std::endl; - return nullptr; - } - - // header - result->header.Read(stream); - - // vertices - uint32_t vertex_num; - stream->read((char*) &vertex_num, sizeof(uint32_t)); - result->vertices.resize(vertex_num); - for (uint32_t i = 0; i < vertex_num; i++) - { - result->vertices[i].Read(stream); - } - - // indices - uint32_t index_num; - stream->read((char*) &index_num, sizeof(uint32_t)); - result->indices.resize(index_num); - for (uint32_t i = 0; i < index_num; i++) - { - stream->read((char*) &result->indices[i], sizeof(uint16_t)); - } - - // materials - uint32_t material_num; - stream->read((char*) &material_num, sizeof(uint32_t)); - result->materials.resize(material_num); - for (uint32_t i = 0; i < material_num; i++) - { - result->materials[i].Read(stream); - } - - // bones - uint16_t bone_num; - stream->read((char*) &bone_num, sizeof(uint16_t)); - result->bones.resize(bone_num); - for (uint32_t i = 0; i < bone_num; i++) - { - result->bones[i].Read(stream); - } - - // iks - uint16_t ik_num; - stream->read((char*) &ik_num, sizeof(uint16_t)); - result->iks.resize(ik_num); - for (uint32_t i = 0; i < ik_num; i++) - { - result->iks[i].Read(stream); - } - - // faces - uint16_t face_num; - stream->read((char*) &face_num, sizeof(uint16_t)); - result->faces.resize(face_num); - for (uint32_t i = 0; i < face_num; i++) - { - result->faces[i].Read(stream); - } - - // face frames - uint8_t face_frame_num; - stream->read((char*) &face_frame_num, sizeof(uint8_t)); - result->faces_indices.resize(face_frame_num); - for (uint32_t i = 0; i < face_frame_num; i++) - { - stream->read((char*) &result->faces_indices[i], sizeof(uint16_t)); - } - - // bone names - uint8_t bone_disp_num; - stream->read((char*) &bone_disp_num, sizeof(uint8_t)); - result->bone_disp_name.resize(bone_disp_num); - for (uint32_t i = 0; i < bone_disp_num; i++) - { - result->bone_disp_name[i].Read(stream); - } - - // bone frame - uint32_t bone_frame_num; - stream->read((char*) &bone_frame_num, sizeof(uint32_t)); - result->bone_disp.resize(bone_frame_num); - for (uint32_t i = 0; i < bone_frame_num; i++) - { - result->bone_disp[i].Read(stream); - } - - // english name - bool english; - stream->read((char*) &english, sizeof(char)); - if (english) - { - result->header.ReadExtension(stream); - for (uint32_t i = 0; i < bone_num; i++) - { - result->bones[i].ReadExpantion(stream); - } - for (uint32_t i = 0; i < face_num; i++) - { - if (result->faces[i].type == pmd::FaceCategory::Base) - { - continue; - } - result->faces[i].ReadExpantion(stream); - } - for (uint32_t i = 0; i < result->bone_disp_name.size(); i++) - { - result->bone_disp_name[i].ReadExpantion(stream); - } - } - - // toon textures - if (stream->peek() == std::ios::traits_type::eof()) - { - result->toon_filenames.clear(); - } - else { - result->toon_filenames.resize(10); - for (uint32_t i = 0; i < 10; i++) - { - stream->read(buffer, 100); - result->toon_filenames[i] = std::string(buffer); - } - } - - // physics - if (stream->peek() == std::ios::traits_type::eof()) - { - result->rigid_bodies.clear(); - result->constraints.clear(); - } - else { - uint32_t rigid_body_num; - stream->read((char*) &rigid_body_num, sizeof(uint32_t)); - result->rigid_bodies.resize(rigid_body_num); - for (uint32_t i = 0; i < rigid_body_num; i++) - { - result->rigid_bodies[i].Read(stream); - } - uint32_t constraint_num; - stream->read((char*) &constraint_num, sizeof(uint32_t)); - result->constraints.resize(constraint_num); - for (uint32_t i = 0; i < constraint_num; i++) - { - result->constraints[i].Read(stream); - } - } - - if (stream->peek() != std::ios::traits_type::eof()) - { - std::cerr << "there is unknown data" << std::endl; - } - - return result; - } - }; -} diff --git a/thirdparty/assimp/code/MMD/MMDPmxParser.cpp b/thirdparty/assimp/code/MMD/MMDPmxParser.cpp deleted file mode 100644 index 80f0986dd7..0000000000 --- a/thirdparty/assimp/code/MMD/MMDPmxParser.cpp +++ /dev/null @@ -1,608 +0,0 @@ -/* -Open Asset Import Library (assimp) ----------------------------------------------------------------------- - -Copyright (c) 2006-2019, assimp team - - -All rights reserved. - -Redistribution and use of this software in source and binary forms, -with or without modification, are permitted provided that the -following conditions are met: - -* Redistributions of source code must retain the above -copyright notice, this list of conditions and the -following disclaimer. - -* Redistributions in binary form must reproduce the above -copyright notice, this list of conditions and the -following disclaimer in the documentation and/or other -materials provided with the distribution. - -* Neither the name of the assimp team, nor the names of its -contributors may be used to endorse or promote products -derived from this software without specific prior -written permission of the assimp team. - -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS -"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT -LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR -A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT -OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, -SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT -LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, -DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY -THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE -OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - ----------------------------------------------------------------------- -*/ -#include <utility> -#include "MMDPmxParser.h" -#include <assimp/StringUtils.h> -#ifdef ASSIMP_USE_HUNTER -# include <utf8/utf8.h> -#else -# include "../contrib/utf8cpp/source/utf8.h" -#endif -#include <assimp/Exceptional.h> - -namespace pmx -{ - int ReadIndex(std::istream *stream, int size) - { - switch (size) - { - case 1: - uint8_t tmp8; - stream->read((char*) &tmp8, sizeof(uint8_t)); - if (255 == tmp8) - { - return -1; - } - else { - return (int) tmp8; - } - case 2: - uint16_t tmp16; - stream->read((char*) &tmp16, sizeof(uint16_t)); - if (65535 == tmp16) - { - return -1; - } - else { - return (int) tmp16; - } - case 4: - int tmp32; - stream->read((char*) &tmp32, sizeof(int)); - return tmp32; - default: - return -1; - } - } - - std::string ReadString(std::istream *stream, uint8_t encoding) - { - int size; - stream->read((char*) &size, sizeof(int)); - std::vector<char> buffer; - if (size == 0) - { - return std::string(""); - } - buffer.reserve(size); - stream->read((char*) buffer.data(), size); - if (encoding == 0) - { - // UTF16 to UTF8 - const uint16_t* sourceStart = (uint16_t*)buffer.data(); - const unsigned int targetSize = size * 3; // enough to encode - char *targetStart = new char[targetSize]; - std::memset(targetStart, 0, targetSize * sizeof(char)); - - utf8::utf16to8( sourceStart, sourceStart + size/2, targetStart ); - - std::string result(targetStart); - delete [] targetStart; - return result; - } - else - { - // the name is already UTF8 - return std::string((const char*)buffer.data(), size); - } - } - - void PmxSetting::Read(std::istream *stream) - { - uint8_t count; - stream->read((char*) &count, sizeof(uint8_t)); - if (count < 8) - { - throw DeadlyImportError("MMD: invalid size"); - } - stream->read((char*) &encoding, sizeof(uint8_t)); - stream->read((char*) &uv, sizeof(uint8_t)); - stream->read((char*) &vertex_index_size, sizeof(uint8_t)); - stream->read((char*) &texture_index_size, sizeof(uint8_t)); - stream->read((char*) &material_index_size, sizeof(uint8_t)); - stream->read((char*) &bone_index_size, sizeof(uint8_t)); - stream->read((char*) &morph_index_size, sizeof(uint8_t)); - stream->read((char*) &rigidbody_index_size, sizeof(uint8_t)); - uint8_t temp; - for (int i = 8; i < count; i++) - { - stream->read((char*)&temp, sizeof(uint8_t)); - } - } - - void PmxVertexSkinningBDEF1::Read(std::istream *stream, PmxSetting *setting) - { - this->bone_index = ReadIndex(stream, setting->bone_index_size); - } - - void PmxVertexSkinningBDEF2::Read(std::istream *stream, PmxSetting *setting) - { - this->bone_index1 = ReadIndex(stream, setting->bone_index_size); - this->bone_index2 = ReadIndex(stream, setting->bone_index_size); - stream->read((char*) &this->bone_weight, sizeof(float)); - } - - void PmxVertexSkinningBDEF4::Read(std::istream *stream, PmxSetting *setting) - { - this->bone_index1 = ReadIndex(stream, setting->bone_index_size); - this->bone_index2 = ReadIndex(stream, setting->bone_index_size); - this->bone_index3 = ReadIndex(stream, setting->bone_index_size); - this->bone_index4 = ReadIndex(stream, setting->bone_index_size); - stream->read((char*) &this->bone_weight1, sizeof(float)); - stream->read((char*) &this->bone_weight2, sizeof(float)); - stream->read((char*) &this->bone_weight3, sizeof(float)); - stream->read((char*) &this->bone_weight4, sizeof(float)); - } - - void PmxVertexSkinningSDEF::Read(std::istream *stream, PmxSetting *setting) - { - this->bone_index1 = ReadIndex(stream, setting->bone_index_size); - this->bone_index2 = ReadIndex(stream, setting->bone_index_size); - stream->read((char*) &this->bone_weight, sizeof(float)); - stream->read((char*) this->sdef_c, sizeof(float) * 3); - stream->read((char*) this->sdef_r0, sizeof(float) * 3); - stream->read((char*) this->sdef_r1, sizeof(float) * 3); - } - - void PmxVertexSkinningQDEF::Read(std::istream *stream, PmxSetting *setting) - { - this->bone_index1 = ReadIndex(stream, setting->bone_index_size); - this->bone_index2 = ReadIndex(stream, setting->bone_index_size); - this->bone_index3 = ReadIndex(stream, setting->bone_index_size); - this->bone_index4 = ReadIndex(stream, setting->bone_index_size); - stream->read((char*) &this->bone_weight1, sizeof(float)); - stream->read((char*) &this->bone_weight2, sizeof(float)); - stream->read((char*) &this->bone_weight3, sizeof(float)); - stream->read((char*) &this->bone_weight4, sizeof(float)); - } - - void PmxVertex::Read(std::istream *stream, PmxSetting *setting) - { - stream->read((char*) this->position, sizeof(float) * 3); - stream->read((char*) this->normal, sizeof(float) * 3); - stream->read((char*) this->uv, sizeof(float) * 2); - for (int i = 0; i < setting->uv; ++i) - { - stream->read((char*) this->uva[i], sizeof(float) * 4); - } - stream->read((char*) &this->skinning_type, sizeof(PmxVertexSkinningType)); - switch (this->skinning_type) - { - case PmxVertexSkinningType::BDEF1: - this->skinning = mmd::make_unique<PmxVertexSkinningBDEF1>(); - break; - case PmxVertexSkinningType::BDEF2: - this->skinning = mmd::make_unique<PmxVertexSkinningBDEF2>(); - break; - case PmxVertexSkinningType::BDEF4: - this->skinning = mmd::make_unique<PmxVertexSkinningBDEF4>(); - break; - case PmxVertexSkinningType::SDEF: - this->skinning = mmd::make_unique<PmxVertexSkinningSDEF>(); - break; - case PmxVertexSkinningType::QDEF: - this->skinning = mmd::make_unique<PmxVertexSkinningQDEF>(); - break; - default: - throw "invalid skinning type"; - } - this->skinning->Read(stream, setting); - stream->read((char*) &this->edge, sizeof(float)); - } - - void PmxMaterial::Read(std::istream *stream, PmxSetting *setting) - { - this->material_name = ReadString(stream, setting->encoding); - this->material_english_name = ReadString(stream, setting->encoding); - stream->read((char*) this->diffuse, sizeof(float) * 4); - stream->read((char*) this->specular, sizeof(float) * 3); - stream->read((char*) &this->specularlity, sizeof(float)); - stream->read((char*) this->ambient, sizeof(float) * 3); - stream->read((char*) &this->flag, sizeof(uint8_t)); - stream->read((char*) this->edge_color, sizeof(float) * 4); - stream->read((char*) &this->edge_size, sizeof(float)); - this->diffuse_texture_index = ReadIndex(stream, setting->texture_index_size); - this->sphere_texture_index = ReadIndex(stream, setting->texture_index_size); - stream->read((char*) &this->sphere_op_mode, sizeof(uint8_t)); - stream->read((char*) &this->common_toon_flag, sizeof(uint8_t)); - if (this->common_toon_flag) - { - stream->read((char*) &this->toon_texture_index, sizeof(uint8_t)); - } - else { - this->toon_texture_index = ReadIndex(stream, setting->texture_index_size); - } - this->memo = ReadString(stream, setting->encoding); - stream->read((char*) &this->index_count, sizeof(int)); - } - - void PmxIkLink::Read(std::istream *stream, PmxSetting *setting) - { - this->link_target = ReadIndex(stream, setting->bone_index_size); - stream->read((char*) &this->angle_lock, sizeof(uint8_t)); - if (angle_lock == 1) - { - stream->read((char*) this->max_radian, sizeof(float) * 3); - stream->read((char*) this->min_radian, sizeof(float) * 3); - } - } - - void PmxBone::Read(std::istream *stream, PmxSetting *setting) - { - this->bone_name = ReadString(stream, setting->encoding); - this->bone_english_name = ReadString(stream, setting->encoding); - stream->read((char*) this->position, sizeof(float) * 3); - this->parent_index = ReadIndex(stream, setting->bone_index_size); - stream->read((char*) &this->level, sizeof(int)); - stream->read((char*) &this->bone_flag, sizeof(uint16_t)); - if (this->bone_flag & 0x0001) { - this->target_index = ReadIndex(stream, setting->bone_index_size); - } - else { - stream->read((char*)this->offset, sizeof(float) * 3); - } - if (this->bone_flag & (0x0100 | 0x0200)) { - this->grant_parent_index = ReadIndex(stream, setting->bone_index_size); - stream->read((char*) &this->grant_weight, sizeof(float)); - } - if (this->bone_flag & 0x0400) { - stream->read((char*)this->lock_axis_orientation, sizeof(float) * 3); - } - if (this->bone_flag & 0x0800) { - stream->read((char*)this->local_axis_x_orientation, sizeof(float) * 3); - stream->read((char*)this->local_axis_y_orientation, sizeof(float) * 3); - } - if (this->bone_flag & 0x2000) { - stream->read((char*) &this->key, sizeof(int)); - } - if (this->bone_flag & 0x0020) { - this->ik_target_bone_index = ReadIndex(stream, setting->bone_index_size); - stream->read((char*) &ik_loop, sizeof(int)); - stream->read((char*) &ik_loop_angle_limit, sizeof(float)); - stream->read((char*) &ik_link_count, sizeof(int)); - this->ik_links = mmd::make_unique<PmxIkLink []>(ik_link_count); - for (int i = 0; i < ik_link_count; i++) { - ik_links[i].Read(stream, setting); - } - } - } - - void PmxMorphVertexOffset::Read(std::istream *stream, PmxSetting *setting) - { - this->vertex_index = ReadIndex(stream, setting->vertex_index_size); - stream->read((char*)this->position_offset, sizeof(float) * 3); - } - - void PmxMorphUVOffset::Read(std::istream *stream, PmxSetting *setting) - { - this->vertex_index = ReadIndex(stream, setting->vertex_index_size); - stream->read((char*)this->uv_offset, sizeof(float) * 4); - } - - void PmxMorphBoneOffset::Read(std::istream *stream, PmxSetting *setting) - { - this->bone_index = ReadIndex(stream, setting->bone_index_size); - stream->read((char*)this->translation, sizeof(float) * 3); - stream->read((char*)this->rotation, sizeof(float) * 4); - } - - void PmxMorphMaterialOffset::Read(std::istream *stream, PmxSetting *setting) - { - this->material_index = ReadIndex(stream, setting->material_index_size); - stream->read((char*) &this->offset_operation, sizeof(uint8_t)); - stream->read((char*)this->diffuse, sizeof(float) * 4); - stream->read((char*)this->specular, sizeof(float) * 3); - stream->read((char*) &this->specularity, sizeof(float)); - stream->read((char*)this->ambient, sizeof(float) * 3); - stream->read((char*)this->edge_color, sizeof(float) * 4); - stream->read((char*) &this->edge_size, sizeof(float)); - stream->read((char*)this->texture_argb, sizeof(float) * 4); - stream->read((char*)this->sphere_texture_argb, sizeof(float) * 4); - stream->read((char*)this->toon_texture_argb, sizeof(float) * 4); - } - - void PmxMorphGroupOffset::Read(std::istream *stream, PmxSetting *setting) - { - this->morph_index = ReadIndex(stream, setting->morph_index_size); - stream->read((char*) &this->morph_weight, sizeof(float)); - } - - void PmxMorphFlipOffset::Read(std::istream *stream, PmxSetting *setting) - { - this->morph_index = ReadIndex(stream, setting->morph_index_size); - stream->read((char*) &this->morph_value, sizeof(float)); - } - - void PmxMorphImplusOffset::Read(std::istream *stream, PmxSetting *setting) - { - this->rigid_body_index = ReadIndex(stream, setting->rigidbody_index_size); - stream->read((char*) &this->is_local, sizeof(uint8_t)); - stream->read((char*)this->velocity, sizeof(float) * 3); - stream->read((char*)this->angular_torque, sizeof(float) * 3); - } - - void PmxMorph::Read(std::istream *stream, PmxSetting *setting) - { - this->morph_name = ReadString(stream, setting->encoding); - this->morph_english_name = ReadString(stream, setting->encoding); - stream->read((char*) &category, sizeof(MorphCategory)); - stream->read((char*) &morph_type, sizeof(MorphType)); - stream->read((char*) &this->offset_count, sizeof(int)); - switch (this->morph_type) - { - case MorphType::Group: - group_offsets = mmd::make_unique<PmxMorphGroupOffset []>(this->offset_count); - for (int i = 0; i < offset_count; i++) - { - group_offsets[i].Read(stream, setting); - } - break; - case MorphType::Vertex: - vertex_offsets = mmd::make_unique<PmxMorphVertexOffset []>(this->offset_count); - for (int i = 0; i < offset_count; i++) - { - vertex_offsets[i].Read(stream, setting); - } - break; - case MorphType::Bone: - bone_offsets = mmd::make_unique<PmxMorphBoneOffset []>(this->offset_count); - for (int i = 0; i < offset_count; i++) - { - bone_offsets[i].Read(stream, setting); - } - break; - case MorphType::Matrial: - material_offsets = mmd::make_unique<PmxMorphMaterialOffset []>(this->offset_count); - for (int i = 0; i < offset_count; i++) - { - material_offsets[i].Read(stream, setting); - } - break; - case MorphType::UV: - case MorphType::AdditionalUV1: - case MorphType::AdditionalUV2: - case MorphType::AdditionalUV3: - case MorphType::AdditionalUV4: - uv_offsets = mmd::make_unique<PmxMorphUVOffset []>(this->offset_count); - for (int i = 0; i < offset_count; i++) - { - uv_offsets[i].Read(stream, setting); - } - break; - default: - throw DeadlyImportError("MMD: unknown morth type"); - } - } - - void PmxFrameElement::Read(std::istream *stream, PmxSetting *setting) - { - stream->read((char*) &this->element_target, sizeof(uint8_t)); - if (this->element_target == 0x00) - { - this->index = ReadIndex(stream, setting->bone_index_size); - } - else { - this->index = ReadIndex(stream, setting->morph_index_size); - } - } - - void PmxFrame::Read(std::istream *stream, PmxSetting *setting) - { - this->frame_name = ReadString(stream, setting->encoding); - this->frame_english_name = ReadString(stream, setting->encoding); - stream->read((char*) &this->frame_flag, sizeof(uint8_t)); - stream->read((char*) &this->element_count, sizeof(int)); - this->elements = mmd::make_unique<PmxFrameElement []>(this->element_count); - for (int i = 0; i < this->element_count; i++) - { - this->elements[i].Read(stream, setting); - } - } - - void PmxRigidBody::Read(std::istream *stream, PmxSetting *setting) - { - this->girid_body_name = ReadString(stream, setting->encoding); - this->girid_body_english_name = ReadString(stream, setting->encoding); - this->target_bone = ReadIndex(stream, setting->bone_index_size); - stream->read((char*) &this->group, sizeof(uint8_t)); - stream->read((char*) &this->mask, sizeof(uint16_t)); - stream->read((char*) &this->shape, sizeof(uint8_t)); - stream->read((char*) this->size, sizeof(float) * 3); - stream->read((char*) this->position, sizeof(float) * 3); - stream->read((char*) this->orientation, sizeof(float) * 3); - stream->read((char*) &this->mass, sizeof(float)); - stream->read((char*) &this->move_attenuation, sizeof(float)); - stream->read((char*) &this->rotation_attenuation, sizeof(float)); - stream->read((char*) &this->repulsion, sizeof(float)); - stream->read((char*) &this->friction, sizeof(float)); - stream->read((char*) &this->physics_calc_type, sizeof(uint8_t)); - } - - void PmxJointParam::Read(std::istream *stream, PmxSetting *setting) - { - this->rigid_body1 = ReadIndex(stream, setting->rigidbody_index_size); - this->rigid_body2 = ReadIndex(stream, setting->rigidbody_index_size); - stream->read((char*) this->position, sizeof(float) * 3); - stream->read((char*) this->orientaiton, sizeof(float) * 3); - stream->read((char*) this->move_limitation_min, sizeof(float) * 3); - stream->read((char*) this->move_limitation_max, sizeof(float) * 3); - stream->read((char*) this->rotation_limitation_min, sizeof(float) * 3); - stream->read((char*) this->rotation_limitation_max, sizeof(float) * 3); - stream->read((char*) this->spring_move_coefficient, sizeof(float) * 3); - stream->read((char*) this->spring_rotation_coefficient, sizeof(float) * 3); - } - - void PmxJoint::Read(std::istream *stream, PmxSetting *setting) - { - this->joint_name = ReadString(stream, setting->encoding); - this->joint_english_name = ReadString(stream, setting->encoding); - stream->read((char*) &this->joint_type, sizeof(uint8_t)); - this->param.Read(stream, setting); - } - - void PmxAncherRigidBody::Read(std::istream *stream, PmxSetting *setting) - { - this->related_rigid_body = ReadIndex(stream, setting->rigidbody_index_size); - this->related_vertex = ReadIndex(stream, setting->vertex_index_size); - stream->read((char*) &this->is_near, sizeof(uint8_t)); - } - - void PmxSoftBody::Read(std::istream * /*stream*/, PmxSetting * /*setting*/) - { - std::cerr << "Not Implemented Exception" << std::endl; - throw DeadlyImportError("MMD: Not Implemented Exception"); - } - - void PmxModel::Init() - { - this->version = 0.0f; - this->model_name.clear(); - this->model_english_name.clear(); - this->model_comment.clear(); - this->model_english_comment.clear(); - this->vertex_count = 0; - this->vertices = nullptr; - this->index_count = 0; - this->indices = nullptr; - this->texture_count = 0; - this->textures = nullptr; - this->material_count = 0; - this->materials = nullptr; - this->bone_count = 0; - this->bones = nullptr; - this->morph_count = 0; - this->morphs = nullptr; - this->frame_count = 0; - this->frames = nullptr; - this->rigid_body_count = 0; - this->rigid_bodies = nullptr; - this->joint_count = 0; - this->joints = nullptr; - this->soft_body_count = 0; - this->soft_bodies = nullptr; - } - - void PmxModel::Read(std::istream *stream) - { - char magic[4]; - stream->read((char*) magic, sizeof(char) * 4); - if (magic[0] != 0x50 || magic[1] != 0x4d || magic[2] != 0x58 || magic[3] != 0x20) - { - std::cerr << "invalid magic number." << std::endl; - throw DeadlyImportError("MMD: invalid magic number."); - } - stream->read((char*) &version, sizeof(float)); - if (version != 2.0f && version != 2.1f) - { - std::cerr << "this is not ver2.0 or ver2.1 but " << version << "." << std::endl; - throw DeadlyImportError("MMD: this is not ver2.0 or ver2.1 but " + to_string(version)); - } - this->setting.Read(stream); - - this->model_name = ReadString(stream, setting.encoding); - this->model_english_name = ReadString(stream, setting.encoding); - this->model_comment = ReadString(stream, setting.encoding); - this->model_english_comment = ReadString(stream, setting.encoding); - - // read vertices - stream->read((char*) &vertex_count, sizeof(int)); - this->vertices = mmd::make_unique<PmxVertex []>(vertex_count); - for (int i = 0; i < vertex_count; i++) - { - vertices[i].Read(stream, &setting); - } - - // read indices - stream->read((char*) &index_count, sizeof(int)); - this->indices = mmd::make_unique<int []>(index_count); - for (int i = 0; i < index_count; i++) - { - this->indices[i] = ReadIndex(stream, setting.vertex_index_size); - } - - // read texture names - stream->read((char*) &texture_count, sizeof(int)); - this->textures = mmd::make_unique<std::string []>(texture_count); - for (int i = 0; i < texture_count; i++) - { - this->textures[i] = ReadString(stream, setting.encoding); - } - - // read materials - stream->read((char*) &material_count, sizeof(int)); - this->materials = mmd::make_unique<PmxMaterial []>(material_count); - for (int i = 0; i < material_count; i++) - { - this->materials[i].Read(stream, &setting); - } - - // read bones - stream->read((char*) &this->bone_count, sizeof(int)); - this->bones = mmd::make_unique<PmxBone []>(this->bone_count); - for (int i = 0; i < this->bone_count; i++) - { - this->bones[i].Read(stream, &setting); - } - - // read morphs - stream->read((char*) &this->morph_count, sizeof(int)); - this->morphs = mmd::make_unique<PmxMorph []>(this->morph_count); - for (int i = 0; i < this->morph_count; i++) - { - this->morphs[i].Read(stream, &setting); - } - - // read display frames - stream->read((char*) &this->frame_count, sizeof(int)); - this->frames = mmd::make_unique<PmxFrame []>(this->frame_count); - for (int i = 0; i < this->frame_count; i++) - { - this->frames[i].Read(stream, &setting); - } - - // read rigid bodies - stream->read((char*) &this->rigid_body_count, sizeof(int)); - this->rigid_bodies = mmd::make_unique<PmxRigidBody []>(this->rigid_body_count); - for (int i = 0; i < this->rigid_body_count; i++) - { - this->rigid_bodies[i].Read(stream, &setting); - } - - // read joints - stream->read((char*) &this->joint_count, sizeof(int)); - this->joints = mmd::make_unique<PmxJoint []>(this->joint_count); - for (int i = 0; i < this->joint_count; i++) - { - this->joints[i].Read(stream, &setting); - } - } -} diff --git a/thirdparty/assimp/code/MMD/MMDPmxParser.h b/thirdparty/assimp/code/MMD/MMDPmxParser.h deleted file mode 100644 index cf523a1298..0000000000 --- a/thirdparty/assimp/code/MMD/MMDPmxParser.h +++ /dev/null @@ -1,782 +0,0 @@ -/* -Open Asset Import Library (assimp) ----------------------------------------------------------------------- - -Copyright (c) 2006-2019, assimp team - - -All rights reserved. - -Redistribution and use of this software in source and binary forms, -with or without modification, are permitted provided that the -following conditions are met: - -* Redistributions of source code must retain the above -copyright notice, this list of conditions and the -following disclaimer. - -* Redistributions in binary form must reproduce the above -copyright notice, this list of conditions and the -following disclaimer in the documentation and/or other -materials provided with the distribution. - -* Neither the name of the assimp team, nor the names of its -contributors may be used to endorse or promote products -derived from this software without specific prior -written permission of the assimp team. - -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS -"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT -LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR -A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT -OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, -SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT -LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, -DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY -THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE -OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - ----------------------------------------------------------------------- -*/ -#pragma once - -#include <vector> -#include <string> -#include <iostream> -#include <fstream> -#include <memory> -#include "MMDCpp14.h" - -namespace pmx -{ - class PmxSetting - { - public: - PmxSetting() - : encoding(0) - , uv(0) - , vertex_index_size(0) - , texture_index_size(0) - , material_index_size(0) - , bone_index_size(0) - , morph_index_size(0) - , rigidbody_index_size(0) - {} - - uint8_t encoding; - uint8_t uv; - uint8_t vertex_index_size; - uint8_t texture_index_size; - uint8_t material_index_size; - uint8_t bone_index_size; - uint8_t morph_index_size; - uint8_t rigidbody_index_size; - void Read(std::istream *stream); - }; - - enum class PmxVertexSkinningType : uint8_t - { - BDEF1 = 0, - BDEF2 = 1, - BDEF4 = 2, - SDEF = 3, - QDEF = 4, - }; - - class PmxVertexSkinning - { - public: - virtual void Read(std::istream *stream, PmxSetting *setting) = 0; - virtual ~PmxVertexSkinning() {} - }; - - class PmxVertexSkinningBDEF1 : public PmxVertexSkinning - { - public: - PmxVertexSkinningBDEF1() - : bone_index(0) - {} - - int bone_index; - void Read(std::istream *stresam, PmxSetting *setting); - }; - - class PmxVertexSkinningBDEF2 : public PmxVertexSkinning - { - public: - PmxVertexSkinningBDEF2() - : bone_index1(0) - , bone_index2(0) - , bone_weight(0.0f) - {} - - int bone_index1; - int bone_index2; - float bone_weight; - void Read(std::istream *stresam, PmxSetting *setting); - }; - - class PmxVertexSkinningBDEF4 : public PmxVertexSkinning - { - public: - PmxVertexSkinningBDEF4() - : bone_index1(0) - , bone_index2(0) - , bone_index3(0) - , bone_index4(0) - , bone_weight1(0.0f) - , bone_weight2(0.0f) - , bone_weight3(0.0f) - , bone_weight4(0.0f) - {} - - int bone_index1; - int bone_index2; - int bone_index3; - int bone_index4; - float bone_weight1; - float bone_weight2; - float bone_weight3; - float bone_weight4; - void Read(std::istream *stresam, PmxSetting *setting); - }; - - class PmxVertexSkinningSDEF : public PmxVertexSkinning - { - public: - PmxVertexSkinningSDEF() - : bone_index1(0) - , bone_index2(0) - , bone_weight(0.0f) - { - for (int i = 0; i < 3; ++i) { - sdef_c[i] = 0.0f; - sdef_r0[i] = 0.0f; - sdef_r1[i] = 0.0f; - } - } - - int bone_index1; - int bone_index2; - float bone_weight; - float sdef_c[3]; - float sdef_r0[3]; - float sdef_r1[3]; - void Read(std::istream *stresam, PmxSetting *setting); - }; - - class PmxVertexSkinningQDEF : public PmxVertexSkinning - { - public: - PmxVertexSkinningQDEF() - : bone_index1(0) - , bone_index2(0) - , bone_index3(0) - , bone_index4(0) - , bone_weight1(0.0f) - , bone_weight2(0.0f) - , bone_weight3(0.0f) - , bone_weight4(0.0f) - {} - - int bone_index1; - int bone_index2; - int bone_index3; - int bone_index4; - float bone_weight1; - float bone_weight2; - float bone_weight3; - float bone_weight4; - void Read(std::istream *stresam, PmxSetting *setting); - }; - - class PmxVertex - { - public: - PmxVertex() - : edge(0.0f) - { - uv[0] = uv[1] = 0.0f; - for (int i = 0; i < 3; ++i) { - position[i] = 0.0f; - normal[i] = 0.0f; - } - for (int i = 0; i < 4; ++i) { - for (int k = 0; k < 4; ++k) { - uva[i][k] = 0.0f; - } - } - } - - float position[3]; - float normal[3]; - float uv[2]; - float uva[4][4]; - PmxVertexSkinningType skinning_type; - std::unique_ptr<PmxVertexSkinning> skinning; - float edge; - void Read(std::istream *stream, PmxSetting *setting); - }; - - class PmxMaterial - { - public: - PmxMaterial() - : specularlity(0.0f) - , flag(0) - , edge_size(0.0f) - , diffuse_texture_index(0) - , sphere_texture_index(0) - , sphere_op_mode(0) - , common_toon_flag(0) - , toon_texture_index(0) - , index_count(0) - { - for (int i = 0; i < 3; ++i) { - specular[i] = 0.0f; - ambient[i] = 0.0f; - edge_color[i] = 0.0f; - } - for (int i = 0; i < 4; ++i) { - diffuse[i] = 0.0f; - } - } - - std::string material_name; - std::string material_english_name; - float diffuse[4]; - float specular[3]; - float specularlity; - float ambient[3]; - uint8_t flag; - float edge_color[4]; - float edge_size; - int diffuse_texture_index; - int sphere_texture_index; - uint8_t sphere_op_mode; - uint8_t common_toon_flag; - int toon_texture_index; - std::string memo; - int index_count; - void Read(std::istream *stream, PmxSetting *setting); - }; - - class PmxIkLink - { - public: - PmxIkLink() - : link_target(0) - , angle_lock(0) - { - for (int i = 0; i < 3; ++i) { - max_radian[i] = 0.0f; - min_radian[i] = 0.0f; - } - } - - int link_target; - uint8_t angle_lock; - float max_radian[3]; - float min_radian[3]; - void Read(std::istream *stream, PmxSetting *settingn); - }; - - class PmxBone - { - public: - PmxBone() - : parent_index(0) - , level(0) - , bone_flag(0) - , target_index(0) - , grant_parent_index(0) - , grant_weight(0.0f) - , key(0) - , ik_target_bone_index(0) - , ik_loop(0) - , ik_loop_angle_limit(0.0f) - , ik_link_count(0) - { - for (int i = 0; i < 3; ++i) { - position[i] = 0.0f; - offset[i] = 0.0f; - lock_axis_orientation[i] = 0.0f; - local_axis_x_orientation[i] = 0.0f; - local_axis_y_orientation[i] = 0.0f; - } - } - - std::string bone_name; - std::string bone_english_name; - float position[3]; - int parent_index; - int level; - uint16_t bone_flag; - float offset[3]; - int target_index; - int grant_parent_index; - float grant_weight; - float lock_axis_orientation[3]; - float local_axis_x_orientation[3]; - float local_axis_y_orientation[3]; - int key; - int ik_target_bone_index; - int ik_loop; - float ik_loop_angle_limit; - int ik_link_count; - std::unique_ptr<PmxIkLink []> ik_links; - void Read(std::istream *stream, PmxSetting *setting); - }; - - enum class MorphType : uint8_t - { - Group = 0, - Vertex = 1, - Bone = 2, - UV = 3, - AdditionalUV1 = 4, - AdditionalUV2 = 5, - AdditionalUV3 = 6, - AdditionalUV4 = 7, - Matrial = 8, - Flip = 9, - Implus = 10, - }; - - enum class MorphCategory : uint8_t - { - ReservedCategory = 0, - Eyebrow = 1, - Eye = 2, - Mouth = 3, - Other = 4, - }; - - class PmxMorphOffset - { - public: - void virtual Read(std::istream *stream, PmxSetting *setting) = 0; - }; - - class PmxMorphVertexOffset : public PmxMorphOffset - { - public: - PmxMorphVertexOffset() - : vertex_index(0) - { - for (int i = 0; i < 3; ++i) { - position_offset[i] = 0.0f; - } - } - int vertex_index; - float position_offset[3]; - void Read(std::istream *stream, PmxSetting *setting); //override; - }; - - class PmxMorphUVOffset : public PmxMorphOffset - { - public: - PmxMorphUVOffset() - : vertex_index(0) - { - for (int i = 0; i < 4; ++i) { - uv_offset[i] = 0.0f; - } - } - int vertex_index; - float uv_offset[4]; - void Read(std::istream *stream, PmxSetting *setting); //override; - }; - - class PmxMorphBoneOffset : public PmxMorphOffset - { - public: - PmxMorphBoneOffset() - : bone_index(0) - { - for (int i = 0; i < 3; ++i) { - translation[i] = 0.0f; - } - for (int i = 0; i < 4; ++i) { - rotation[i] = 0.0f; - } - } - int bone_index; - float translation[3]; - float rotation[4]; - void Read(std::istream *stream, PmxSetting *setting); //override; - }; - - class PmxMorphMaterialOffset : public PmxMorphOffset - { - public: - PmxMorphMaterialOffset() - : specularity(0.0f) - , edge_size(0.0f) - { - for (int i = 0; i < 3; ++i) { - specular[i] = 0.0f; - ambient[i] = 0.0f; - } - for (int i = 0; i < 4; ++i) { - diffuse[i] = 0.0f; - edge_color[i] = 0.0f; - texture_argb[i] = 0.0f; - sphere_texture_argb[i] = 0.0f; - toon_texture_argb[i] = 0.0f; - } - } - int material_index; - uint8_t offset_operation; - float diffuse[4]; - float specular[3]; - float specularity; - float ambient[3]; - float edge_color[4]; - float edge_size; - float texture_argb[4]; - float sphere_texture_argb[4]; - float toon_texture_argb[4]; - void Read(std::istream *stream, PmxSetting *setting); //override; - }; - - class PmxMorphGroupOffset : public PmxMorphOffset - { - public: - PmxMorphGroupOffset() - : morph_index(0) - , morph_weight(0.0f) - {} - int morph_index; - float morph_weight; - void Read(std::istream *stream, PmxSetting *setting); //override; - }; - - class PmxMorphFlipOffset : public PmxMorphOffset - { - public: - PmxMorphFlipOffset() - : morph_index(0) - , morph_value(0.0f) - {} - int morph_index; - float morph_value; - void Read(std::istream *stream, PmxSetting *setting); //override; - }; - - class PmxMorphImplusOffset : public PmxMorphOffset - { - public: - PmxMorphImplusOffset() - : rigid_body_index(0) - , is_local(0) - { - for (int i = 0; i < 3; ++i) { - velocity[i] = 0.0f; - angular_torque[i] = 0.0f; - } - } - int rigid_body_index; - uint8_t is_local; - float velocity[3]; - float angular_torque[3]; - void Read(std::istream *stream, PmxSetting *setting); //override; - }; - - class PmxMorph - { - public: - PmxMorph() - : offset_count(0) - { - } - std::string morph_name; - std::string morph_english_name; - MorphCategory category; - MorphType morph_type; - int offset_count; - std::unique_ptr<PmxMorphVertexOffset []> vertex_offsets; - std::unique_ptr<PmxMorphUVOffset []> uv_offsets; - std::unique_ptr<PmxMorphBoneOffset []> bone_offsets; - std::unique_ptr<PmxMorphMaterialOffset []> material_offsets; - std::unique_ptr<PmxMorphGroupOffset []> group_offsets; - std::unique_ptr<PmxMorphFlipOffset []> flip_offsets; - std::unique_ptr<PmxMorphImplusOffset []> implus_offsets; - void Read(std::istream *stream, PmxSetting *setting); - }; - - class PmxFrameElement - { - public: - PmxFrameElement() - : element_target(0) - , index(0) - { - } - uint8_t element_target; - int index; - void Read(std::istream *stream, PmxSetting *setting); - }; - - class PmxFrame - { - public: - PmxFrame() - : frame_flag(0) - , element_count(0) - { - } - std::string frame_name; - std::string frame_english_name; - uint8_t frame_flag; - int element_count; - std::unique_ptr<PmxFrameElement []> elements; - void Read(std::istream *stream, PmxSetting *setting); - }; - - class PmxRigidBody - { - public: - PmxRigidBody() - : target_bone(0) - , group(0) - , mask(0) - , shape(0) - , mass(0.0f) - , move_attenuation(0.0f) - , rotation_attenuation(0.0f) - , repulsion(0.0f) - , friction(0.0f) - , physics_calc_type(0) - { - for (int i = 0; i < 3; ++i) { - size[i] = 0.0f; - position[i] = 0.0f; - orientation[i] = 0.0f; - } - } - std::string girid_body_name; - std::string girid_body_english_name; - int target_bone; - uint8_t group; - uint16_t mask; - uint8_t shape; - float size[3]; - float position[3]; - float orientation[3]; - float mass; - float move_attenuation; - float rotation_attenuation; - float repulsion; - float friction; - uint8_t physics_calc_type; - void Read(std::istream *stream, PmxSetting *setting); - }; - - enum class PmxJointType : uint8_t - { - Generic6DofSpring = 0, - Generic6Dof = 1, - Point2Point = 2, - ConeTwist = 3, - Slider = 5, - Hinge = 6 - }; - - class PmxJointParam - { - public: - PmxJointParam() - : rigid_body1(0) - , rigid_body2(0) - { - for (int i = 0; i < 3; ++i) { - position[i] = 0.0f; - orientaiton[i] = 0.0f; - move_limitation_min[i] = 0.0f; - move_limitation_max[i] = 0.0f; - rotation_limitation_min[i] = 0.0f; - rotation_limitation_max[i] = 0.0f; - spring_move_coefficient[i] = 0.0f; - spring_rotation_coefficient[i] = 0.0f; - } - } - int rigid_body1; - int rigid_body2; - float position[3]; - float orientaiton[3]; - float move_limitation_min[3]; - float move_limitation_max[3]; - float rotation_limitation_min[3]; - float rotation_limitation_max[3]; - float spring_move_coefficient[3]; - float spring_rotation_coefficient[3]; - void Read(std::istream *stream, PmxSetting *setting); - }; - - class PmxJoint - { - public: - std::string joint_name; - std::string joint_english_name; - PmxJointType joint_type; - PmxJointParam param; - void Read(std::istream *stream, PmxSetting *setting); - }; - - enum PmxSoftBodyFlag : uint8_t - { - BLink = 0x01, - Cluster = 0x02, - Link = 0x04 - }; - - class PmxAncherRigidBody - { - public: - PmxAncherRigidBody() - : related_rigid_body(0) - , related_vertex(0) - , is_near(false) - {} - int related_rigid_body; - int related_vertex; - bool is_near; - void Read(std::istream *stream, PmxSetting *setting); - }; - - class PmxSoftBody - { - public: - PmxSoftBody() - : shape(0) - , target_material(0) - , group(0) - , mask(0) - , blink_distance(0) - , cluster_count(0) - , mass(0.0) - , collisioni_margin(0.0) - , aero_model(0) - , VCF(0.0f) - , DP(0.0f) - , DG(0.0f) - , LF(0.0f) - , PR(0.0f) - , VC(0.0f) - , DF(0.0f) - , MT(0.0f) - , CHR(0.0f) - , KHR(0.0f) - , SHR(0.0f) - , AHR(0.0f) - , SRHR_CL(0.0f) - , SKHR_CL(0.0f) - , SSHR_CL(0.0f) - , SR_SPLT_CL(0.0f) - , SK_SPLT_CL(0.0f) - , SS_SPLT_CL(0.0f) - , V_IT(0) - , P_IT(0) - , D_IT(0) - , C_IT(0) - , LST(0.0f) - , AST(0.0f) - , VST(0.0f) - , anchor_count(0) - , pin_vertex_count(0) - {} - std::string soft_body_name; - std::string soft_body_english_name; - uint8_t shape; - int target_material; - uint8_t group; - uint16_t mask; - PmxSoftBodyFlag flag; - int blink_distance; - int cluster_count; - float mass; - float collisioni_margin; - int aero_model; - float VCF; - float DP; - float DG; - float LF; - float PR; - float VC; - float DF; - float MT; - float CHR; - float KHR; - float SHR; - float AHR; - float SRHR_CL; - float SKHR_CL; - float SSHR_CL; - float SR_SPLT_CL; - float SK_SPLT_CL; - float SS_SPLT_CL; - int V_IT; - int P_IT; - int D_IT; - int C_IT; - float LST; - float AST; - float VST; - int anchor_count; - std::unique_ptr<PmxAncherRigidBody []> anchers; - int pin_vertex_count; - std::unique_ptr<int []> pin_vertices; - void Read(std::istream *stream, PmxSetting *setting); - }; - - class PmxModel - { - public: - PmxModel() - : version(0.0f) - , vertex_count(0) - , index_count(0) - , texture_count(0) - , material_count(0) - , bone_count(0) - , morph_count(0) - , frame_count(0) - , rigid_body_count(0) - , joint_count(0) - , soft_body_count(0) - {} - - float version; - PmxSetting setting; - std::string model_name; - std::string model_english_name; - std::string model_comment; - std::string model_english_comment; - int vertex_count; - std::unique_ptr<PmxVertex []> vertices; - int index_count; - std::unique_ptr<int []> indices; - int texture_count; - std::unique_ptr< std::string []> textures; - int material_count; - std::unique_ptr<PmxMaterial []> materials; - int bone_count; - std::unique_ptr<PmxBone []> bones; - int morph_count; - std::unique_ptr<PmxMorph []> morphs; - int frame_count; - std::unique_ptr<PmxFrame [] > frames; - int rigid_body_count; - std::unique_ptr<PmxRigidBody []> rigid_bodies; - int joint_count; - std::unique_ptr<PmxJoint []> joints; - int soft_body_count; - std::unique_ptr<PmxSoftBody []> soft_bodies; - void Init(); - void Read(std::istream *stream); - //static std::unique_ptr<PmxModel> ReadFromFile(const char *filename); - //static std::unique_ptr<PmxModel> ReadFromStream(std::istream *stream); - }; -} diff --git a/thirdparty/assimp/code/MMD/MMDVmdParser.h b/thirdparty/assimp/code/MMD/MMDVmdParser.h deleted file mode 100644 index 947c3a2422..0000000000 --- a/thirdparty/assimp/code/MMD/MMDVmdParser.h +++ /dev/null @@ -1,376 +0,0 @@ -/* -Open Asset Import Library (assimp) ----------------------------------------------------------------------- - -Copyright (c) 2006-2019, assimp team - - -All rights reserved. - -Redistribution and use of this software in source and binary forms, -with or without modification, are permitted provided that the -following conditions are met: - -* Redistributions of source code must retain the above -copyright notice, this list of conditions and the -following disclaimer. - -* Redistributions in binary form must reproduce the above -copyright notice, this list of conditions and the -following disclaimer in the documentation and/or other -materials provided with the distribution. - -* Neither the name of the assimp team, nor the names of its -contributors may be used to endorse or promote products -derived from this software without specific prior -written permission of the assimp team. - -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS -"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT -LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR -A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT -OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, -SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT -LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, -DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY -THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE -OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - ----------------------------------------------------------------------- -*/ -#pragma once - -#include <vector> -#include <string> -#include <memory> -#include <iostream> -#include <fstream> -#include <ostream> -#include "MMDCpp14.h" - -namespace vmd -{ - class VmdBoneFrame - { - public: - std::string name; - int frame; - float position[3]; - float orientation[4]; - char interpolation[4][4][4]; - - void Read(std::istream* stream) - { - char buffer[15]; - stream->read((char*) buffer, sizeof(char)*15); - name = std::string(buffer); - stream->read((char*) &frame, sizeof(int)); - stream->read((char*) position, sizeof(float)*3); - stream->read((char*) orientation, sizeof(float)*4); - stream->read((char*) interpolation, sizeof(char) * 4 * 4 * 4); - } - - void Write(std::ostream* stream) - { - stream->write((char*)name.c_str(), sizeof(char) * 15); - stream->write((char*)&frame, sizeof(int)); - stream->write((char*)position, sizeof(float) * 3); - stream->write((char*)orientation, sizeof(float) * 4); - stream->write((char*)interpolation, sizeof(char) * 4 * 4 * 4); - } - }; - - class VmdFaceFrame - { - public: - std::string face_name; - float weight; - uint32_t frame; - - void Read(std::istream* stream) - { - char buffer[15]; - stream->read((char*) &buffer, sizeof(char) * 15); - face_name = std::string(buffer); - stream->read((char*) &frame, sizeof(int)); - stream->read((char*) &weight, sizeof(float)); - } - - void Write(std::ostream* stream) - { - stream->write((char*)face_name.c_str(), sizeof(char) * 15); - stream->write((char*)&frame, sizeof(int)); - stream->write((char*)&weight, sizeof(float)); - } - }; - - class VmdCameraFrame - { - public: - int frame; - float distance; - float position[3]; - float orientation[3]; - char interpolation[6][4]; - float angle; - char unknown[3]; - - void Read(std::istream *stream) - { - stream->read((char*) &frame, sizeof(int)); - stream->read((char*) &distance, sizeof(float)); - stream->read((char*) position, sizeof(float) * 3); - stream->read((char*) orientation, sizeof(float) * 3); - stream->read((char*) interpolation, sizeof(char) * 24); - stream->read((char*) &angle, sizeof(float)); - stream->read((char*) unknown, sizeof(char) * 3); - } - - void Write(std::ostream *stream) - { - stream->write((char*)&frame, sizeof(int)); - stream->write((char*)&distance, sizeof(float)); - stream->write((char*)position, sizeof(float) * 3); - stream->write((char*)orientation, sizeof(float) * 3); - stream->write((char*)interpolation, sizeof(char) * 24); - stream->write((char*)&angle, sizeof(float)); - stream->write((char*)unknown, sizeof(char) * 3); - } - }; - - class VmdLightFrame - { - public: - int frame; - float color[3]; - float position[3]; - - void Read(std::istream* stream) - { - stream->read((char*) &frame, sizeof(int)); - stream->read((char*) color, sizeof(float) * 3); - stream->read((char*) position, sizeof(float) * 3); - } - - void Write(std::ostream* stream) - { - stream->write((char*)&frame, sizeof(int)); - stream->write((char*)color, sizeof(float) * 3); - stream->write((char*)position, sizeof(float) * 3); - } - }; - - class VmdIkEnable - { - public: - std::string ik_name; - bool enable; - }; - - class VmdIkFrame - { - public: - int frame; - bool display; - std::vector<VmdIkEnable> ik_enable; - - void Read(std::istream *stream) - { - char buffer[20]; - stream->read((char*) &frame, sizeof(int)); - stream->read((char*) &display, sizeof(uint8_t)); - int ik_count; - stream->read((char*) &ik_count, sizeof(int)); - ik_enable.resize(ik_count); - for (int i = 0; i < ik_count; i++) - { - stream->read(buffer, 20); - ik_enable[i].ik_name = std::string(buffer); - stream->read((char*) &ik_enable[i].enable, sizeof(uint8_t)); - } - } - - void Write(std::ostream *stream) - { - stream->write((char*)&frame, sizeof(int)); - stream->write((char*)&display, sizeof(uint8_t)); - int ik_count = static_cast<int>(ik_enable.size()); - stream->write((char*)&ik_count, sizeof(int)); - for (int i = 0; i < ik_count; i++) - { - const VmdIkEnable& ik_enable = this->ik_enable.at(i); - stream->write(ik_enable.ik_name.c_str(), 20); - stream->write((char*)&ik_enable.enable, sizeof(uint8_t)); - } - } - }; - - class VmdMotion - { - public: - std::string model_name; - int version; - std::vector<VmdBoneFrame> bone_frames; - std::vector<VmdFaceFrame> face_frames; - std::vector<VmdCameraFrame> camera_frames; - std::vector<VmdLightFrame> light_frames; - std::vector<VmdIkFrame> ik_frames; - - static std::unique_ptr<VmdMotion> LoadFromFile(char const *filename) - { - std::ifstream stream(filename, std::ios::binary); - auto result = LoadFromStream(&stream); - stream.close(); - return result; - } - - static std::unique_ptr<VmdMotion> LoadFromStream(std::ifstream *stream) - { - - char buffer[30]; - auto result = mmd::make_unique<VmdMotion>(); - - // magic and version - stream->read((char*) buffer, 30); - if (strncmp(buffer, "Vocaloid Motion Data", 20)) - { - std::cerr << "invalid vmd file." << std::endl; - return nullptr; - } - result->version = std::atoi(buffer + 20); - - // name - stream->read(buffer, 20); - result->model_name = std::string(buffer); - - // bone frames - int bone_frame_num; - stream->read((char*) &bone_frame_num, sizeof(int)); - result->bone_frames.resize(bone_frame_num); - for (int i = 0; i < bone_frame_num; i++) - { - result->bone_frames[i].Read(stream); - } - - // face frames - int face_frame_num; - stream->read((char*) &face_frame_num, sizeof(int)); - result->face_frames.resize(face_frame_num); - for (int i = 0; i < face_frame_num; i++) - { - result->face_frames[i].Read(stream); - } - - // camera frames - int camera_frame_num; - stream->read((char*) &camera_frame_num, sizeof(int)); - result->camera_frames.resize(camera_frame_num); - for (int i = 0; i < camera_frame_num; i++) - { - result->camera_frames[i].Read(stream); - } - - // light frames - int light_frame_num; - stream->read((char*) &light_frame_num, sizeof(int)); - result->light_frames.resize(light_frame_num); - for (int i = 0; i < light_frame_num; i++) - { - result->light_frames[i].Read(stream); - } - - // unknown2 - stream->read(buffer, 4); - - // ik frames - if (stream->peek() != std::ios::traits_type::eof()) - { - int ik_num; - stream->read((char*) &ik_num, sizeof(int)); - result->ik_frames.resize(ik_num); - for (int i = 0; i < ik_num; i++) - { - result->ik_frames[i].Read(stream); - } - } - - if (stream->peek() != std::ios::traits_type::eof()) - { - std::cerr << "vmd stream has unknown data." << std::endl; - } - - return result; - } - - bool SaveToFile(const std::u16string& /*filename*/) - { - // TODO: How to adapt u16string to string? - /* - std::ofstream stream(filename.c_str(), std::ios::binary); - auto result = SaveToStream(&stream); - stream.close(); - return result; - */ - return false; - } - - bool SaveToStream(std::ofstream *stream) - { - std::string magic = "Vocaloid Motion Data 0002\0"; - magic.resize(30); - - // magic and version - stream->write(magic.c_str(), 30); - - // name - stream->write(model_name.c_str(), 20); - - // bone frames - const int bone_frame_num = static_cast<int>(bone_frames.size()); - stream->write(reinterpret_cast<const char*>(&bone_frame_num), sizeof(int)); - for (int i = 0; i < bone_frame_num; i++) - { - bone_frames[i].Write(stream); - } - - // face frames - const int face_frame_num = static_cast<int>(face_frames.size()); - stream->write(reinterpret_cast<const char*>(&face_frame_num), sizeof(int)); - for (int i = 0; i < face_frame_num; i++) - { - face_frames[i].Write(stream); - } - - // camera frames - const int camera_frame_num = static_cast<int>(camera_frames.size()); - stream->write(reinterpret_cast<const char*>(&camera_frame_num), sizeof(int)); - for (int i = 0; i < camera_frame_num; i++) - { - camera_frames[i].Write(stream); - } - - // light frames - const int light_frame_num = static_cast<int>(light_frames.size()); - stream->write(reinterpret_cast<const char*>(&light_frame_num), sizeof(int)); - for (int i = 0; i < light_frame_num; i++) - { - light_frames[i].Write(stream); - } - - // self shadow datas - const int self_shadow_num = 0; - stream->write(reinterpret_cast<const char*>(&self_shadow_num), sizeof(int)); - - // ik frames - const int ik_num = static_cast<int>(ik_frames.size()); - stream->write(reinterpret_cast<const char*>(&ik_num), sizeof(int)); - for (int i = 0; i < ik_num; i++) - { - ik_frames[i].Write(stream); - } - - return true; - } - }; -} diff --git a/thirdparty/assimp/code/Material/MaterialSystem.cpp b/thirdparty/assimp/code/Material/MaterialSystem.cpp index d0b39093b6..0be6e9f7bb 100644 --- a/thirdparty/assimp/code/Material/MaterialSystem.cpp +++ b/thirdparty/assimp/code/Material/MaterialSystem.cpp @@ -51,7 +51,6 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #include <assimp/types.h> #include <assimp/material.h> #include <assimp/DefaultLogger.hpp> -#include <assimp/Macros.h> using namespace Assimp; @@ -545,23 +544,7 @@ aiReturn aiMaterial::AddProperty (const aiString* pInput, unsigned int type, unsigned int index) { - // We don't want to add the whole buffer .. write a 32 bit length - // prefix followed by the zero-terminated UTF8 string. - // (HACK) I don't want to break the ABI now, but we definitely - // ought to change aiString::mLength to uint32_t one day. - if (sizeof(size_t) == 8) { - aiString copy = *pInput; - uint32_t* s = reinterpret_cast<uint32_t*>(©.length); - s[1] = static_cast<uint32_t>(pInput->length); - - return AddBinaryProperty(s+1, - static_cast<unsigned int>(pInput->length+1+4), - pKey, - type, - index, - aiPTI_String); - } - ai_assert(sizeof(size_t)==4); + ai_assert(sizeof(ai_uint32)==4); return AddBinaryProperty(pInput, static_cast<unsigned int>(pInput->length+1+4), pKey, diff --git a/thirdparty/assimp/code/PostProcessing/ArmaturePopulate.cpp b/thirdparty/assimp/code/PostProcessing/ArmaturePopulate.cpp new file mode 100644 index 0000000000..75daeb6b59 --- /dev/null +++ b/thirdparty/assimp/code/PostProcessing/ArmaturePopulate.cpp @@ -0,0 +1,268 @@ +/* +Open Asset Import Library (assimp) +---------------------------------------------------------------------- + +Copyright (c) 2006-2019, assimp team + + +All rights reserved. + +Redistribution and use of this software in source and binary forms, +with or without modification, are permitted provided that the +following conditions are met: + +* Redistributions of source code must retain the above +copyright notice, this list of conditions and the +following disclaimer. + +* Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the +following disclaimer in the documentation and/or other +materials provided with the distribution. + +* Neither the name of the assimp team, nor the names of its +contributors may be used to endorse or promote products +derived from this software without specific prior +written permission of the assimp team. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +---------------------------------------------------------------------- +*/ +#include "ArmaturePopulate.h" + +#include <assimp/BaseImporter.h> +#include <assimp/DefaultLogger.hpp> +#include <assimp/postprocess.h> +#include <assimp/scene.h> +#include <iostream> + +namespace Assimp { + +/// The default class constructor. +ArmaturePopulate::ArmaturePopulate() : BaseProcess() +{} + +/// The class destructor. +ArmaturePopulate::~ArmaturePopulate() +{} + +bool ArmaturePopulate::IsActive(unsigned int pFlags) const { + return (pFlags & aiProcess_PopulateArmatureData) != 0; +} + +void ArmaturePopulate::SetupProperties(const Importer *pImp) { + // do nothing +} + +void ArmaturePopulate::Execute(aiScene *out) { + + // Now convert all bone positions to the correct mOffsetMatrix + std::vector<aiBone *> bones; + std::vector<aiNode *> nodes; + std::map<aiBone *, aiNode *> bone_stack; + BuildBoneList(out->mRootNode, out->mRootNode, out, bones); + BuildNodeList(out->mRootNode, nodes); + + BuildBoneStack(out->mRootNode, out->mRootNode, out, bones, bone_stack, nodes); + + ASSIMP_LOG_DEBUG_F("Bone stack size: ", bone_stack.size()); + + for (std::pair<aiBone *, aiNode *> kvp : bone_stack) { + aiBone *bone = kvp.first; + aiNode *bone_node = kvp.second; + ASSIMP_LOG_DEBUG_F("active node lookup: ", bone->mName.C_Str()); + // lcl transform grab - done in generate_nodes :) + + // bone->mOffsetMatrix = bone_node->mTransformation; + aiNode *armature = GetArmatureRoot(bone_node, bones); + + ai_assert(armature); + + // set up bone armature id + bone->mArmature = armature; + + // set this bone node to be referenced properly + ai_assert(bone_node); + bone->mNode = bone_node; + } +} + + +/* Reprocess all nodes to calculate bone transforms properly based on the REAL + * mOffsetMatrix not the local. */ +/* Before this would use mesh transforms which is wrong for bone transforms */ +/* Before this would work for simple character skeletons but not complex meshes + * with multiple origins */ +/* Source: sketch fab log cutter fbx */ +void ArmaturePopulate::BuildBoneList(aiNode *current_node, + const aiNode *root_node, + const aiScene *scene, + std::vector<aiBone *> &bones) { + ai_assert(scene); + for (unsigned int nodeId = 0; nodeId < current_node->mNumChildren; ++nodeId) { + aiNode *child = current_node->mChildren[nodeId]; + ai_assert(child); + + // check for bones + for (unsigned int meshId = 0; meshId < child->mNumMeshes; ++meshId) { + ai_assert(child->mMeshes); + unsigned int mesh_index = child->mMeshes[meshId]; + aiMesh *mesh = scene->mMeshes[mesh_index]; + ai_assert(mesh); + + for (unsigned int boneId = 0; boneId < mesh->mNumBones; ++boneId) { + aiBone *bone = mesh->mBones[boneId]; + ai_assert(bone); + + // duplicate meshes exist with the same bones sometimes :) + // so this must be detected + if (std::find(bones.begin(), bones.end(), bone) == bones.end()) { + // add the element once + bones.push_back(bone); + } + } + + // find mesh and get bones + // then do recursive lookup for bones in root node hierarchy + } + + BuildBoneList(child, root_node, scene, bones); + } +} + +/* Prepare flat node list which can be used for non recursive lookups later */ +void ArmaturePopulate::BuildNodeList(const aiNode *current_node, + std::vector<aiNode *> &nodes) { + ai_assert(current_node); + + for (unsigned int nodeId = 0; nodeId < current_node->mNumChildren; ++nodeId) { + aiNode *child = current_node->mChildren[nodeId]; + ai_assert(child); + + nodes.push_back(child); + + BuildNodeList(child, nodes); + } +} + +/* A bone stack allows us to have multiple armatures, with the same bone names + * A bone stack allows us also to retrieve bones true transform even with + * duplicate names :) + */ +void ArmaturePopulate::BuildBoneStack(aiNode *current_node, + const aiNode *root_node, + const aiScene *scene, + const std::vector<aiBone *> &bones, + std::map<aiBone *, aiNode *> &bone_stack, + std::vector<aiNode *> &node_stack) { + ai_assert(scene); + ai_assert(root_node); + ai_assert(!node_stack.empty()); + + for (aiBone *bone : bones) { + ai_assert(bone); + aiNode *node = GetNodeFromStack(bone->mName, node_stack); + if (node == nullptr) { + node_stack.clear(); + BuildNodeList(root_node, node_stack); + ASSIMP_LOG_DEBUG_F("Resetting bone stack: nullptr element ", bone->mName.C_Str()); + + node = GetNodeFromStack(bone->mName, node_stack); + + if (!node) { + ASSIMP_LOG_ERROR("serious import issue node for bone was not detected"); + continue; + } + } + + ASSIMP_LOG_DEBUG_F("Successfully added bone[", bone->mName.C_Str(), "] to stack and bone node is: ", node->mName.C_Str()); + + bone_stack.insert(std::pair<aiBone *, aiNode *>(bone, node)); + } +} + + +/* Returns the armature root node */ +/* This is required to be detected for a bone initially, it will recurse up + * until it cannot find another bone and return the node No known failure + * points. (yet) + */ +aiNode *ArmaturePopulate::GetArmatureRoot(aiNode *bone_node, + std::vector<aiBone *> &bone_list) { + while (bone_node) { + if (!IsBoneNode(bone_node->mName, bone_list)) { + ASSIMP_LOG_DEBUG_F("GetArmatureRoot() Found valid armature: ", bone_node->mName.C_Str()); + return bone_node; + } + + bone_node = bone_node->mParent; + } + + ASSIMP_LOG_ERROR("GetArmatureRoot() can't find armature!"); + + return nullptr; +} + + + +/* Simple IsBoneNode check if this could be a bone */ +bool ArmaturePopulate::IsBoneNode(const aiString &bone_name, + std::vector<aiBone *> &bones) { + for (aiBone *bone : bones) { + if (bone->mName == bone_name) { + return true; + } + } + + return false; +} + +/* Pop this node by name from the stack if found */ +/* Used in multiple armature situations with duplicate node / bone names */ +/* Known flaw: cannot have nodes with bone names, will be fixed in later release + */ +/* (serious to be fixed) Known flaw: nodes which have more than one bone could + * be prematurely dropped from stack */ +aiNode *ArmaturePopulate::GetNodeFromStack(const aiString &node_name, + std::vector<aiNode *> &nodes) { + std::vector<aiNode *>::iterator iter; + aiNode *found = nullptr; + for (iter = nodes.begin(); iter < nodes.end(); ++iter) { + aiNode *element = *iter; + ai_assert(element); + // node valid and node name matches + if (element->mName == node_name) { + found = element; + break; + } + } + + if (found != nullptr) { + ASSIMP_LOG_INFO_F("Removed node from stack: ", found->mName.C_Str()); + // now pop the element from the node list + nodes.erase(iter); + + return found; + } + + // unique names can cause this problem + ASSIMP_LOG_ERROR("[Serious] GetNodeFromStack() can't find node from stack!"); + + return nullptr; +} + + + + +} // Namespace Assimp diff --git a/thirdparty/assimp/code/PostProcessing/ArmaturePopulate.h b/thirdparty/assimp/code/PostProcessing/ArmaturePopulate.h new file mode 100644 index 0000000000..aa1ad7c80c --- /dev/null +++ b/thirdparty/assimp/code/PostProcessing/ArmaturePopulate.h @@ -0,0 +1,112 @@ +/* +Open Asset Import Library (assimp) +---------------------------------------------------------------------- + +Copyright (c) 2006-2019, assimp team + + +All rights reserved. + +Redistribution and use of this software in source and binary forms, +with or without modification, are permitted provided that the +following conditions are met: + +* Redistributions of source code must retain the above +copyright notice, this list of conditions and the +following disclaimer. + +* Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the +following disclaimer in the documentation and/or other +materials provided with the distribution. + +* Neither the name of the assimp team, nor the names of its +contributors may be used to endorse or promote products +derived from this software without specific prior +written permission of the assimp team. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +---------------------------------------------------------------------- +*/ +#ifndef ARMATURE_POPULATE_H_ +#define ARMATURE_POPULATE_H_ + +#include "Common/BaseProcess.h" +#include <assimp/BaseImporter.h> +#include <vector> +#include <map> + + +struct aiNode; +struct aiBone; + +namespace Assimp { + +// --------------------------------------------------------------------------- +/** Armature Populate: This is a post process designed + * To save you time when importing models into your game engines + * This was originally designed only for fbx but will work with other formats + * it is intended to auto populate aiBone data with armature and the aiNode + * This is very useful when dealing with skinned meshes + * or when dealing with many different skeletons + * It's off by default but recommend that you try it and use it + * It should reduce down any glue code you have in your + * importers + * You can contact RevoluPowered <gordon@gordonite.tech> + * For more info about this +*/ +class ASSIMP_API ArmaturePopulate : public BaseProcess { +public: + /// The default class constructor. + ArmaturePopulate(); + + /// The class destructor. + virtual ~ArmaturePopulate(); + + /// Overwritten, @see BaseProcess + virtual bool IsActive( unsigned int pFlags ) const; + + /// Overwritten, @see BaseProcess + virtual void SetupProperties( const Importer* pImp ); + + /// Overwritten, @see BaseProcess + virtual void Execute( aiScene* pScene ); + + static aiNode *GetArmatureRoot(aiNode *bone_node, + std::vector<aiBone *> &bone_list); + + static bool IsBoneNode(const aiString &bone_name, + std::vector<aiBone *> &bones); + + static aiNode *GetNodeFromStack(const aiString &node_name, + std::vector<aiNode *> &nodes); + + static void BuildNodeList(const aiNode *current_node, + std::vector<aiNode *> &nodes); + + static void BuildBoneList(aiNode *current_node, const aiNode *root_node, + const aiScene *scene, + std::vector<aiBone *> &bones); + + static void BuildBoneStack(aiNode *current_node, const aiNode *root_node, + const aiScene *scene, + const std::vector<aiBone *> &bones, + std::map<aiBone *, aiNode *> &bone_stack, + std::vector<aiNode *> &node_stack); +}; + +} // Namespace Assimp + + +#endif // SCALE_PROCESS_H_
\ No newline at end of file diff --git a/thirdparty/assimp/code/PostProcessing/ComputeUVMappingProcess.cpp b/thirdparty/assimp/code/PostProcessing/ComputeUVMappingProcess.cpp index bb571a551b..df4d44337d 100644 --- a/thirdparty/assimp/code/PostProcessing/ComputeUVMappingProcess.cpp +++ b/thirdparty/assimp/code/PostProcessing/ComputeUVMappingProcess.cpp @@ -354,12 +354,12 @@ void ComputeUVMappingProcess::ComputePlaneMapping(aiMesh* mesh,const aiVector3D& } else if (axis * base_axis_z >= angle_epsilon) { FindMeshCenter(mesh, center, min, max); - diffu = max.y - min.y; - diffv = max.z - min.z; + diffu = max.x - min.x; + diffv = max.y - min.y; for (unsigned int pnt = 0; pnt < mesh->mNumVertices;++pnt) { const aiVector3D& pos = mesh->mVertices[pnt]; - out[pnt].Set((pos.y - min.y) / diffu,(pos.x - min.x) / diffv,0.0); + out[pnt].Set((pos.x - min.x) / diffu,(pos.y - min.y) / diffv,0.0); } } // slower code path in case the mapping axis is not one of the coordinate system axes diff --git a/thirdparty/assimp/code/PostProcessing/FindInvalidDataProcess.cpp b/thirdparty/assimp/code/PostProcessing/FindInvalidDataProcess.cpp index 433f042448..016884c6e7 100644 --- a/thirdparty/assimp/code/PostProcessing/FindInvalidDataProcess.cpp +++ b/thirdparty/assimp/code/PostProcessing/FindInvalidDataProcess.cpp @@ -52,7 +52,6 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #include "FindInvalidDataProcess.h" #include "ProcessHelper.h" -#include <assimp/Macros.h> #include <assimp/Exceptional.h> #include <assimp/qnan.h> diff --git a/thirdparty/assimp/code/PostProcessing/JoinVerticesProcess.cpp b/thirdparty/assimp/code/PostProcessing/JoinVerticesProcess.cpp index 914ec05b46..f121fc60d3 100644 --- a/thirdparty/assimp/code/PostProcessing/JoinVerticesProcess.cpp +++ b/thirdparty/assimp/code/PostProcessing/JoinVerticesProcess.cpp @@ -431,31 +431,6 @@ int JoinVerticesProcess::ProcessMesh( aiMesh* pMesh, unsigned int meshIndex) bone->mWeights = new aiVertexWeight[bone->mNumWeights]; memcpy( bone->mWeights, &newWeights[0], bone->mNumWeights * sizeof( aiVertexWeight)); } - else { - - /* NOTE: - * - * In the algorithm above we're assuming that there are no vertices - * with a different bone weight setup at the same position. That wouldn't - * make sense, but it is not absolutely impossible. SkeletonMeshBuilder - * for example generates such input data if two skeleton points - * share the same position. Again this doesn't make sense but is - * reality for some model formats (MD5 for example uses these special - * nodes as attachment tags for its weapons). - * - * Then it is possible that a bone has no weights anymore .... as a quick - * workaround, we're just removing these bones. If they're animated, - * model geometry might be modified but at least there's no risk of a crash. - */ - delete bone; - --pMesh->mNumBones; - for (unsigned int n = a; n < pMesh->mNumBones; ++n) { - pMesh->mBones[n] = pMesh->mBones[n+1]; - } - - --a; - ASSIMP_LOG_WARN("Removing bone -> no weights remaining"); - } } return pMesh->mNumVertices; } diff --git a/thirdparty/assimp/code/PostProcessing/MakeVerboseFormat.cpp b/thirdparty/assimp/code/PostProcessing/MakeVerboseFormat.cpp index 50ff5ed93d..41f50a5ba5 100644 --- a/thirdparty/assimp/code/PostProcessing/MakeVerboseFormat.cpp +++ b/thirdparty/assimp/code/PostProcessing/MakeVerboseFormat.cpp @@ -224,3 +224,32 @@ bool MakeVerboseFormatProcess::MakeVerboseFormat(aiMesh* pcMesh) } return (pcMesh->mNumVertices != iOldNumVertices); } + + +// ------------------------------------------------------------------------------------------------ +bool IsMeshInVerboseFormat(const aiMesh* mesh) { + // avoid slow vector<bool> specialization + std::vector<unsigned int> seen(mesh->mNumVertices,0); + for(unsigned int i = 0; i < mesh->mNumFaces; ++i) { + const aiFace& f = mesh->mFaces[i]; + for(unsigned int j = 0; j < f.mNumIndices; ++j) { + if(++seen[f.mIndices[j]] == 2) { + // found a duplicate index + return false; + } + } + } + + return true; +} + +// ------------------------------------------------------------------------------------------------ +bool MakeVerboseFormatProcess::IsVerboseFormat(const aiScene* pScene) { + for(unsigned int i = 0; i < pScene->mNumMeshes; ++i) { + if(!IsMeshInVerboseFormat(pScene->mMeshes[i])) { + return false; + } + } + + return true; +} diff --git a/thirdparty/assimp/code/PostProcessing/MakeVerboseFormat.h b/thirdparty/assimp/code/PostProcessing/MakeVerboseFormat.h index 1adf8e2f69..8565d5933a 100644 --- a/thirdparty/assimp/code/PostProcessing/MakeVerboseFormat.h +++ b/thirdparty/assimp/code/PostProcessing/MakeVerboseFormat.h @@ -94,6 +94,13 @@ public: * @param pScene The imported data to work at. */ void Execute( aiScene* pScene); +public: + + // ------------------------------------------------------------------- + /** Checks whether the scene is already in verbose format. + * @param pScene The data to check. + * @return true if the scene is already in verbose format. */ + static bool IsVerboseFormat(const aiScene* pScene); private: diff --git a/thirdparty/assimp/code/PostProcessing/ValidateDataStructure.cpp b/thirdparty/assimp/code/PostProcessing/ValidateDataStructure.cpp index 712fd6943d..75d1b6ef78 100644 --- a/thirdparty/assimp/code/PostProcessing/ValidateDataStructure.cpp +++ b/thirdparty/assimp/code/PostProcessing/ValidateDataStructure.cpp @@ -538,13 +538,17 @@ void ValidateDSProcess::Validate( const aiAnimation* pAnimation) { Validate(&pAnimation->mName); - // validate all materials - if (pAnimation->mNumChannels) + // validate all animations + if (pAnimation->mNumChannels || pAnimation->mNumMorphMeshChannels) { - if (!pAnimation->mChannels) { + if (!pAnimation->mChannels && pAnimation->mNumChannels) { ReportError("aiAnimation::mChannels is NULL (aiAnimation::mNumChannels is %i)", pAnimation->mNumChannels); } + if (!pAnimation->mMorphMeshChannels && pAnimation->mNumMorphMeshChannels) { + ReportError("aiAnimation::mMorphMeshChannels is NULL (aiAnimation::mNumMorphMeshChannels is %i)", + pAnimation->mNumMorphMeshChannels); + } for (unsigned int i = 0; i < pAnimation->mNumChannels;++i) { if (!pAnimation->mChannels[i]) @@ -554,6 +558,15 @@ void ValidateDSProcess::Validate( const aiAnimation* pAnimation) } Validate(pAnimation, pAnimation->mChannels[i]); } + for (unsigned int i = 0; i < pAnimation->mNumMorphMeshChannels;++i) + { + if (!pAnimation->mMorphMeshChannels[i]) + { + ReportError("aiAnimation::mMorphMeshChannels[%i] is NULL (aiAnimation::mNumMorphMeshChannels is %i)", + i, pAnimation->mNumMorphMeshChannels); + } + Validate(pAnimation, pAnimation->mMorphMeshChannels[i]); + } } else { ReportError("aiAnimation::mNumChannels is 0. At least one node animation channel must be there."); @@ -903,6 +916,48 @@ void ValidateDSProcess::Validate( const aiAnimation* pAnimation, } } +void ValidateDSProcess::Validate( const aiAnimation* pAnimation, + const aiMeshMorphAnim* pMeshMorphAnim) +{ + Validate(&pMeshMorphAnim->mName); + + if (!pMeshMorphAnim->mNumKeys) { + ReportError("Empty mesh morph animation channel"); + } + + // otherwise check whether one of the keys exceeds the total duration of the animation + if (pMeshMorphAnim->mNumKeys) + { + if (!pMeshMorphAnim->mKeys) + { + ReportError("aiMeshMorphAnim::mKeys is NULL (aiMeshMorphAnim::mNumKeys is %i)", + pMeshMorphAnim->mNumKeys); + } + double dLast = -10e10; + for (unsigned int i = 0; i < pMeshMorphAnim->mNumKeys;++i) + { + // ScenePreprocessor will compute the duration if still the default value + // (Aramis) Add small epsilon, comparison tended to fail if max_time == duration, + // seems to be due the compilers register usage/width. + if (pAnimation->mDuration > 0. && pMeshMorphAnim->mKeys[i].mTime > pAnimation->mDuration+0.001) + { + ReportError("aiMeshMorphAnim::mKeys[%i].mTime (%.5f) is larger " + "than aiAnimation::mDuration (which is %.5f)",i, + (float)pMeshMorphAnim->mKeys[i].mTime, + (float)pAnimation->mDuration); + } + if (i && pMeshMorphAnim->mKeys[i].mTime <= dLast) + { + ReportWarning("aiMeshMorphAnim::mKeys[%i].mTime (%.5f) is smaller " + "than aiMeshMorphAnim::mKeys[%i] (which is %.5f)",i, + (float)pMeshMorphAnim->mKeys[i].mTime, + i-1, (float)dLast); + } + dLast = pMeshMorphAnim->mKeys[i].mTime; + } + } +} + // ------------------------------------------------------------------------------------------------ void ValidateDSProcess::Validate( const aiNode* pNode) { @@ -958,7 +1013,7 @@ void ValidateDSProcess::Validate( const aiString* pString) { if (pString->length > MAXLEN) { - ReportError("aiString::length is too large (%lu, maximum is %lu)", + ReportError("aiString::length is too large (%u, maximum is %lu)", pString->length,MAXLEN); } const char* sz = pString->data; diff --git a/thirdparty/assimp/code/PostProcessing/ValidateDataStructure.h b/thirdparty/assimp/code/PostProcessing/ValidateDataStructure.h index 0b891ef414..7b309c9251 100644 --- a/thirdparty/assimp/code/PostProcessing/ValidateDataStructure.h +++ b/thirdparty/assimp/code/PostProcessing/ValidateDataStructure.h @@ -55,6 +55,7 @@ struct aiBone; struct aiMesh; struct aiAnimation; struct aiNodeAnim; +struct aiMeshMorphAnim; struct aiTexture; struct aiMaterial; struct aiNode; @@ -150,6 +151,13 @@ protected: void Validate( const aiAnimation* pAnimation, const aiNodeAnim* pBoneAnim); + /** Validates a mesh morph animation channel. + * @param pAnimation Input animation. + * @param pMeshMorphAnim Mesh morph animation channel. + * */ + void Validate( const aiAnimation* pAnimation, + const aiMeshMorphAnim* pMeshMorphAnim); + // ------------------------------------------------------------------- /** Validates a node and all of its subnodes * @param Node Input node*/ diff --git a/thirdparty/assimp/code/revision.h b/thirdparty/assimp/code/revision.h index 88872aef22..66eb875303 100644 --- a/thirdparty/assimp/code/revision.h +++ b/thirdparty/assimp/code/revision.h @@ -1,7 +1,28 @@ #ifndef ASSIMP_REVISION_H_INC #define ASSIMP_REVISION_H_INC -#define GitVersion 0x00000000 +#define GitVersion 0x308db73d #define GitBranch "master" +#define VER_MAJOR 5 +#define VER_MINOR 0 +#define VER_PATCH 0 +#define VER_BUILD 0 + +#define STR_HELP(x) #x +#define STR(x) STR_HELP(x) + +#define VER_FILEVERSION VER_MAJOR,VER_MINOR,VER_PATCH,VER_BUILD +#if (GitVersion == 0) +#define VER_FILEVERSION_STR STR(VER_MAJOR) "." STR(VER_MINOR) "." STR(VER_PATCH) "." STR(VER_BUILD) +#else +#define VER_FILEVERSION_STR STR(VER_MAJOR) "." STR(VER_MINOR) "." STR(VER_PATCH) "." STR(VER_BUILD) " (Commit 308db73d)" +#endif + +#ifdef NDEBUG +#define VER_ORIGINAL_FILENAME_STR "assimp.dll" +#else +#define VER_ORIGINAL_FILENAME_STR "assimp.dll" +#endif // NDEBUG + #endif // ASSIMP_REVISION_H_INC diff --git a/thirdparty/assimp/include/assimp/.editorconfig b/thirdparty/assimp/include/assimp/.editorconfig new file mode 100644 index 0000000000..9ea66423ad --- /dev/null +++ b/thirdparty/assimp/include/assimp/.editorconfig @@ -0,0 +1,8 @@ +# See <http://EditorConfig.org> for details + +[*.{h,hpp,inl}] +end_of_line = lf +insert_final_newline = true +trim_trailing_whitespace = true +indent_size = 4 +indent_style = space diff --git a/thirdparty/assimp/include/assimp/BaseImporter.h b/thirdparty/assimp/include/assimp/BaseImporter.h index 55f7fe3754..ad8a3dafd8 100644 --- a/thirdparty/assimp/include/assimp/BaseImporter.h +++ b/thirdparty/assimp/include/assimp/BaseImporter.h @@ -41,9 +41,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /** @file Definition of the base class for all importer worker classes. */ +#pragma once #ifndef INCLUDED_AI_BASEIMPORTER_H #define INCLUDED_AI_BASEIMPORTER_H +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include "Exceptional.h" #include <vector> @@ -191,16 +196,13 @@ public: /** * Assimp Importer * unit conversions available - * if you need another measurment unit add it below. - * it's currently defined in assimp that we prefer meters. + * NOTE: Valid options are initialised in the + * constructor in the implementation file to + * work around a VS2013 compiler bug if support + * for that compiler is dropped in the future + * initialisation can be moved back here * */ - std::map<ImporterUnits, double> importerUnits = { - {ImporterUnits::M, 1}, - {ImporterUnits::CM, 0.01}, - {ImporterUnits::MM, 0.001}, - {ImporterUnits::INCHES, 0.0254}, - {ImporterUnits::FEET, 0.3048} - }; + std::map<ImporterUnits, double> importerUnits; virtual void SetApplicationUnits( const ImporterUnits& unit ) { diff --git a/thirdparty/assimp/include/assimp/Bitmap.h b/thirdparty/assimp/include/assimp/Bitmap.h index e6b5fb1327..4c3f5a437b 100644 --- a/thirdparty/assimp/include/assimp/Bitmap.h +++ b/thirdparty/assimp/include/assimp/Bitmap.h @@ -46,10 +46,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Used for file formats which embed their textures into the model file. */ - +#pragma once #ifndef AI_BITMAP_H_INC #define AI_BITMAP_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include "defs.h" #include <stdint.h> #include <cstddef> diff --git a/thirdparty/assimp/include/assimp/ByteSwapper.h b/thirdparty/assimp/include/assimp/ByteSwapper.h index 20a2463fb8..3f14c471a8 100644 --- a/thirdparty/assimp/include/assimp/ByteSwapper.h +++ b/thirdparty/assimp/include/assimp/ByteSwapper.h @@ -42,9 +42,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. /** @file Helper class tp perform various byte oder swappings (e.g. little to big endian) */ +#pragma once #ifndef AI_BYTESWAPPER_H_INC #define AI_BYTESWAPPER_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/ai_assert.h> #include <assimp/types.h> #include <stdint.h> diff --git a/thirdparty/assimp/include/assimp/CreateAnimMesh.h b/thirdparty/assimp/include/assimp/CreateAnimMesh.h index a60173588f..1266d1de11 100644 --- a/thirdparty/assimp/include/assimp/CreateAnimMesh.h +++ b/thirdparty/assimp/include/assimp/CreateAnimMesh.h @@ -43,16 +43,26 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. /** @file CreateAnimMesh.h * Create AnimMesh from Mesh */ +#pragma once #ifndef INCLUDED_AI_CREATE_ANIM_MESH_H #define INCLUDED_AI_CREATE_ANIM_MESH_H +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/mesh.h> -namespace Assimp { +namespace Assimp { -/** Create aiAnimMesh from aiMesh. */ +/** + * Create aiAnimMesh from aiMesh. + * @param mesh The input mesh to create an animated mesh from. + * @return The new created animated mesh. + */ ASSIMP_API aiAnimMesh *aiCreateAnimMesh(const aiMesh *mesh); } // end of namespace Assimp + #endif // INCLUDED_AI_CREATE_ANIM_MESH_H diff --git a/thirdparty/assimp/include/assimp/DefaultIOStream.h b/thirdparty/assimp/include/assimp/DefaultIOStream.h index 994d728ff5..c6d382c1b5 100644 --- a/thirdparty/assimp/include/assimp/DefaultIOStream.h +++ b/thirdparty/assimp/include/assimp/DefaultIOStream.h @@ -41,15 +41,20 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /** @file Default file I/O using fXXX()-family of functions */ +#pragma once #ifndef AI_DEFAULTIOSTREAM_H_INC #define AI_DEFAULTIOSTREAM_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <stdio.h> #include <assimp/IOStream.hpp> #include <assimp/importerdesc.h> #include <assimp/Defines.h> -namespace Assimp { +namespace Assimp { // ---------------------------------------------------------------------------------- //! @class DefaultIOStream @@ -57,8 +62,7 @@ namespace Assimp { //! @note An instance of this class can exist without a valid file handle //! attached to it. All calls fail, but the instance can nevertheless be //! used with no restrictions. -class ASSIMP_API DefaultIOStream : public IOStream -{ +class ASSIMP_API DefaultIOStream : public IOStream { friend class DefaultIOSystem; #if __ANDROID__ # if __ANDROID_API__ > 9 @@ -82,7 +86,6 @@ public: size_t pSize, size_t pCount); - // ------------------------------------------------------------------- /// Write to stream size_t Write(const void* pvBuffer, @@ -107,16 +110,13 @@ public: void Flush(); private: - // File data-structure, using clib FILE* mFile; - // Filename std::string mFilename; - // Cached file size mutable size_t mCachedSize; }; // ---------------------------------------------------------------------------------- -inline +AI_FORCE_INLINE DefaultIOStream::DefaultIOStream() AI_NO_EXCEPT : mFile(nullptr) , mFilename("") @@ -125,7 +125,7 @@ DefaultIOStream::DefaultIOStream() AI_NO_EXCEPT } // ---------------------------------------------------------------------------------- -inline +AI_FORCE_INLINE DefaultIOStream::DefaultIOStream (FILE* pFile, const std::string &strFilename) : mFile(pFile) , mFilename(strFilename) @@ -137,4 +137,3 @@ DefaultIOStream::DefaultIOStream (FILE* pFile, const std::string &strFilename) } // ns assimp #endif //!!AI_DEFAULTIOSTREAM_H_INC - diff --git a/thirdparty/assimp/include/assimp/DefaultIOSystem.h b/thirdparty/assimp/include/assimp/DefaultIOSystem.h index 2dd5c801b5..46f6d447c5 100644 --- a/thirdparty/assimp/include/assimp/DefaultIOSystem.h +++ b/thirdparty/assimp/include/assimp/DefaultIOSystem.h @@ -41,9 +41,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /** @file Default implementation of IOSystem using the standard C file functions */ +#pragma once #ifndef AI_DEFAULTIOSYSTEM_H_INC #define AI_DEFAULTIOSYSTEM_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/IOSystem.hpp> namespace Assimp { diff --git a/thirdparty/assimp/include/assimp/Defines.h b/thirdparty/assimp/include/assimp/Defines.h index 15e1d83c26..be3e2fafd6 100644 --- a/thirdparty/assimp/include/assimp/Defines.h +++ b/thirdparty/assimp/include/assimp/Defines.h @@ -38,6 +38,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ---------------------------------------------------------------------- */ +#pragma once +#ifndef AI_DEFINES_H_INC +#define AI_DEFINES_H_INC + +#ifdef __GNUC__ +# pragma GCC system_header +#endif + // We need those constants, workaround for any platforms where nobody defined them yet #if (!defined SIZE_MAX) # define SIZE_MAX (~((size_t)0)) @@ -47,3 +55,4 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # define UINT_MAX (~((unsigned int)0)) #endif +#endif // AI_DEINES_H_INC diff --git a/thirdparty/assimp/include/assimp/Exceptional.h b/thirdparty/assimp/include/assimp/Exceptional.h index 5109b8f077..6bb6ce1e39 100644 --- a/thirdparty/assimp/include/assimp/Exceptional.h +++ b/thirdparty/assimp/include/assimp/Exceptional.h @@ -38,11 +38,17 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ---------------------------------------------------------------------- */ -#ifndef INCLUDED_EXCEPTIONAL_H -#define INCLUDED_EXCEPTIONAL_H +#pragma once +#ifndef AI_INCLUDED_EXCEPTIONAL_H +#define AI_INCLUDED_EXCEPTIONAL_H + +#ifdef __GNUC__ +# pragma GCC system_header +#endif #include <stdexcept> #include <assimp/DefaultIOStream.h> + using std::runtime_error; #ifdef _MSC_VER @@ -53,17 +59,14 @@ using std::runtime_error; /** FOR IMPORTER PLUGINS ONLY: Simple exception class to be thrown if an * unrecoverable error occurs while importing. Loading APIs return * NULL instead of a valid aiScene then. */ -class DeadlyImportError - : public runtime_error -{ +class DeadlyImportError : public runtime_error { public: /** Constructor with arguments */ explicit DeadlyImportError( const std::string& errorText) - : runtime_error(errorText) - { + : runtime_error(errorText) { + // empty } -private: }; typedef DeadlyImportError DeadlyExportError; @@ -84,7 +87,7 @@ struct ExceptionSwallower { template <typename T> struct ExceptionSwallower<T*> { T* operator ()() const { - return NULL; + return nullptr; } }; @@ -122,4 +125,4 @@ struct ExceptionSwallower<void> { }\ } -#endif // INCLUDED_EXCEPTIONAL_H +#endif // AI_INCLUDED_EXCEPTIONAL_H diff --git a/thirdparty/assimp/include/assimp/Exporter.hpp b/thirdparty/assimp/include/assimp/Exporter.hpp index ea0303e804..2612e1f9d2 100644 --- a/thirdparty/assimp/include/assimp/Exporter.hpp +++ b/thirdparty/assimp/include/assimp/Exporter.hpp @@ -48,6 +48,10 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_EXPORT_HPP_INC #define AI_EXPORT_HPP_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #ifndef ASSIMP_BUILD_NO_EXPORT #include "cexport.h" diff --git a/thirdparty/assimp/include/assimp/GenericProperty.h b/thirdparty/assimp/include/assimp/GenericProperty.h index 183ecd5197..7796d595b8 100644 --- a/thirdparty/assimp/include/assimp/GenericProperty.h +++ b/thirdparty/assimp/include/assimp/GenericProperty.h @@ -40,12 +40,17 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ---------------------------------------------------------------------- */ +#pragma once #ifndef AI_GENERIC_PROPERTY_H_INCLUDED #define AI_GENERIC_PROPERTY_H_INCLUDED +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/Importer.hpp> #include <assimp/ai_assert.h> -#include "Hash.h" +#include <assimp/Hash.h> #include <map> diff --git a/thirdparty/assimp/include/assimp/Hash.h b/thirdparty/assimp/include/assimp/Hash.h index 30657be198..9056440789 100644 --- a/thirdparty/assimp/include/assimp/Hash.h +++ b/thirdparty/assimp/include/assimp/Hash.h @@ -39,10 +39,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ---------------------------------------------------------------------- */ - +#pragma once #ifndef AI_HASH_H_INCLUDED #define AI_HASH_H_INCLUDED +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <stdint.h> #include <string.h> diff --git a/thirdparty/assimp/include/assimp/IOStream.hpp b/thirdparty/assimp/include/assimp/IOStream.hpp index 0623d0f70b..39932cd949 100644 --- a/thirdparty/assimp/include/assimp/IOStream.hpp +++ b/thirdparty/assimp/include/assimp/IOStream.hpp @@ -48,14 +48,18 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_IOSTREAM_H_INC #define AI_IOSTREAM_H_INC -#include "types.h" +#ifdef __GNUC__ +# pragma GCC system_header +#endif + +#include <assimp/types.h> #ifndef __cplusplus # error This header requires C++ to be used. aiFileIO.h is the \ corresponding C interface. #endif -namespace Assimp { +namespace Assimp { // ---------------------------------------------------------------------------------- /** @brief CPP-API: Class to handle file I/O for C++ @@ -125,13 +129,13 @@ public: }; //! class IOStream // ---------------------------------------------------------------------------------- -inline +AI_FORCE_INLINE IOStream::IOStream() AI_NO_EXCEPT { // empty } // ---------------------------------------------------------------------------------- -inline +AI_FORCE_INLINE IOStream::~IOStream() { // empty } diff --git a/thirdparty/assimp/include/assimp/IOStreamBuffer.h b/thirdparty/assimp/include/assimp/IOStreamBuffer.h index 58abd97a02..97c84b23e2 100644 --- a/thirdparty/assimp/include/assimp/IOStreamBuffer.h +++ b/thirdparty/assimp/include/assimp/IOStreamBuffer.h @@ -1,5 +1,3 @@ -#pragma once - /* Open Asset Import Library (assimp) ---------------------------------------------------------------------- @@ -42,10 +40,17 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ---------------------------------------------------------------------- */ +#pragma once +#ifndef AI_IOSTREAMBUFFER_H_INC +#define AI_IOSTREAMBUFFER_H_INC + +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/types.h> #include <assimp/IOStream.hpp> - -#include "ParsingUtils.h" +#include <assimp/ParsingUtils.h> #include <vector> @@ -124,7 +129,7 @@ private: }; template<class T> -inline +AI_FORCE_INLINE IOStreamBuffer<T>::IOStreamBuffer( size_t cache ) : m_stream( nullptr ) , m_filesize( 0 ) @@ -138,13 +143,13 @@ IOStreamBuffer<T>::IOStreamBuffer( size_t cache ) } template<class T> -inline +AI_FORCE_INLINE IOStreamBuffer<T>::~IOStreamBuffer() { // empty } template<class T> -inline +AI_FORCE_INLINE bool IOStreamBuffer<T>::open( IOStream *stream ) { // file still opened! if ( nullptr != m_stream ) { @@ -174,7 +179,7 @@ bool IOStreamBuffer<T>::open( IOStream *stream ) { } template<class T> -inline +AI_FORCE_INLINE bool IOStreamBuffer<T>::close() { if ( nullptr == m_stream ) { return false; @@ -192,19 +197,19 @@ bool IOStreamBuffer<T>::close() { } template<class T> -inline +AI_FORCE_INLINE size_t IOStreamBuffer<T>::size() const { return m_filesize; } template<class T> -inline +AI_FORCE_INLINE size_t IOStreamBuffer<T>::cacheSize() const { return m_cacheSize; } template<class T> -inline +AI_FORCE_INLINE bool IOStreamBuffer<T>::readNextBlock() { m_stream->Seek( m_filePos, aiOrigin_SET ); size_t readLen = m_stream->Read( &m_cache[ 0 ], sizeof( T ), m_cacheSize ); @@ -222,25 +227,25 @@ bool IOStreamBuffer<T>::readNextBlock() { } template<class T> -inline +AI_FORCE_INLINE size_t IOStreamBuffer<T>::getNumBlocks() const { return m_numBlocks; } template<class T> -inline +AI_FORCE_INLINE size_t IOStreamBuffer<T>::getCurrentBlockIndex() const { return m_blockIdx; } template<class T> -inline +AI_FORCE_INLINE size_t IOStreamBuffer<T>::getFilePos() const { return m_filePos; } template<class T> -inline +AI_FORCE_INLINE bool IOStreamBuffer<T>::getNextDataLine( std::vector<T> &buffer, T continuationToken ) { buffer.resize( m_cacheSize ); if ( m_cachePos >= m_cacheSize || 0 == m_filePos ) { @@ -289,13 +294,13 @@ bool IOStreamBuffer<T>::getNextDataLine( std::vector<T> &buffer, T continuationT return true; } -static inline +static AI_FORCE_INLINE bool isEndOfCache( size_t pos, size_t cacheSize ) { return ( pos == cacheSize ); } template<class T> -inline +AI_FORCE_INLINE bool IOStreamBuffer<T>::getNextLine(std::vector<T> &buffer) { buffer.resize(m_cacheSize); if ( isEndOfCache( m_cachePos, m_cacheSize ) || 0 == m_filePos) { @@ -335,7 +340,7 @@ bool IOStreamBuffer<T>::getNextLine(std::vector<T> &buffer) { } template<class T> -inline +AI_FORCE_INLINE bool IOStreamBuffer<T>::getNextBlock( std::vector<T> &buffer) { // Return the last block-value if getNextLine was used before if ( 0 != m_cachePos ) { @@ -353,3 +358,5 @@ bool IOStreamBuffer<T>::getNextBlock( std::vector<T> &buffer) { } } // !ns Assimp + +#endif // AI_IOSTREAMBUFFER_H_INC diff --git a/thirdparty/assimp/include/assimp/IOSystem.hpp b/thirdparty/assimp/include/assimp/IOSystem.hpp index 78139c2839..f1fb3b0c27 100644 --- a/thirdparty/assimp/include/assimp/IOSystem.hpp +++ b/thirdparty/assimp/include/assimp/IOSystem.hpp @@ -50,6 +50,10 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_IOSYSTEM_H_INC #define AI_IOSYSTEM_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #ifndef __cplusplus # error This header requires C++ to be used. aiFileIO.h is the \ corresponding C interface. diff --git a/thirdparty/assimp/include/assimp/Importer.hpp b/thirdparty/assimp/include/assimp/Importer.hpp index 4941df4122..bf449a9a25 100644 --- a/thirdparty/assimp/include/assimp/Importer.hpp +++ b/thirdparty/assimp/include/assimp/Importer.hpp @@ -48,6 +48,10 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_ASSIMP_HPP_INC #define AI_ASSIMP_HPP_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #ifndef __cplusplus # error This header requires C++ to be used. Use assimp.h for plain C. #endif // __cplusplus diff --git a/thirdparty/assimp/include/assimp/LineSplitter.h b/thirdparty/assimp/include/assimp/LineSplitter.h index 4afe45b92a..6c1097bb6d 100644 --- a/thirdparty/assimp/include/assimp/LineSplitter.h +++ b/thirdparty/assimp/include/assimp/LineSplitter.h @@ -48,9 +48,13 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef INCLUDED_LINE_SPLITTER_H #define INCLUDED_LINE_SPLITTER_H +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <stdexcept> -#include "StreamReader.h" -#include "ParsingUtils.h" +#include <assimp/StreamReader.h> +#include <assimp/ParsingUtils.h> namespace Assimp { @@ -140,7 +144,7 @@ private: bool mSwallow, mSkip_empty_lines, mTrim; }; -inline +AI_FORCE_INLINE LineSplitter::LineSplitter(StreamReaderLE& stream, bool skip_empty_lines, bool trim ) : mIdx(0) , mCur() @@ -153,12 +157,12 @@ LineSplitter::LineSplitter(StreamReaderLE& stream, bool skip_empty_lines, bool t mIdx = 0; } -inline +AI_FORCE_INLINE LineSplitter::~LineSplitter() { // empty } -inline +AI_FORCE_INLINE LineSplitter& LineSplitter::operator++() { if (mSwallow) { mSwallow = false; @@ -199,12 +203,12 @@ LineSplitter& LineSplitter::operator++() { return *this; } -inline +AI_FORCE_INLINE LineSplitter &LineSplitter::operator++(int) { return ++(*this); } -inline +AI_FORCE_INLINE const char *LineSplitter::operator[] (size_t idx) const { const char* s = operator->()->c_str(); @@ -222,7 +226,7 @@ const char *LineSplitter::operator[] (size_t idx) const { } template <size_t N> -inline +AI_FORCE_INLINE void LineSplitter::get_tokens(const char* (&tokens)[N]) const { const char* s = operator->()->c_str(); @@ -238,44 +242,44 @@ void LineSplitter::get_tokens(const char* (&tokens)[N]) const { } } -inline +AI_FORCE_INLINE const std::string* LineSplitter::operator -> () const { return &mCur; } -inline +AI_FORCE_INLINE std::string LineSplitter::operator* () const { return mCur; } -inline +AI_FORCE_INLINE LineSplitter::operator bool() const { return mStream.GetRemainingSize() > 0; } -inline +AI_FORCE_INLINE LineSplitter::operator line_idx() const { return mIdx; } -inline +AI_FORCE_INLINE LineSplitter::line_idx LineSplitter::get_index() const { return mIdx; } -inline +AI_FORCE_INLINE StreamReaderLE &LineSplitter::get_stream() { return mStream; } -inline +AI_FORCE_INLINE bool LineSplitter::match_start(const char* check) { const size_t len = ::strlen(check); return len <= mCur.length() && std::equal(check, check + len, mCur.begin()); } -inline +AI_FORCE_INLINE void LineSplitter::swallow_next_increment() { mSwallow = true; } diff --git a/thirdparty/assimp/include/assimp/LogAux.h b/thirdparty/assimp/include/assimp/LogAux.h index 558485272e..bcead78dd3 100644 --- a/thirdparty/assimp/include/assimp/LogAux.h +++ b/thirdparty/assimp/include/assimp/LogAux.h @@ -43,9 +43,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. /** @file LogAux.h * @brief Common logging usage patterns for importer implementations */ +#pragma once #ifndef INCLUDED_AI_LOGAUX_H #define INCLUDED_AI_LOGAUX_H +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/TinyFormatter.h> #include <assimp/Exceptional.h> #include <assimp/DefaultLogger.hpp> diff --git a/thirdparty/assimp/include/assimp/Macros.h b/thirdparty/assimp/include/assimp/Macros.h deleted file mode 100644 index 6515303372..0000000000 --- a/thirdparty/assimp/include/assimp/Macros.h +++ /dev/null @@ -1,49 +0,0 @@ -/* ---------------------------------------------------------------------------- -Open Asset Import Library (assimp) ---------------------------------------------------------------------------- - -Copyright (c) 2006-2019, assimp team - -All rights reserved. - -Redistribution and use of this software in source and binary forms, -with or without modification, are permitted provided that the following -conditions are met: - -* Redistributions of source code must retain the above - copyright notice, this list of conditions and the - following disclaimer. - -* Redistributions in binary form must reproduce the above - copyright notice, this list of conditions and the - following disclaimer in the documentation and/or other - materials provided with the distribution. - -* Neither the name of the assimp team, nor the names of its - contributors may be used to endorse or promote products - derived from this software without specific prior - written permission of the assimp team. - -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS -"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT -LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR -A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT -OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, -SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT -LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, -DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY -THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE -OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ---------------------------------------------------------------------------- -*/ - -/* Helper macro to set a pointer to NULL in debug builds - */ -#if (defined ASSIMP_BUILD_DEBUG) -# define AI_DEBUG_INVALIDATE_PTR(x) x = NULL; -#else -# define AI_DEBUG_INVALIDATE_PTR(x) -#endif - diff --git a/thirdparty/assimp/include/assimp/MathFunctions.h b/thirdparty/assimp/include/assimp/MathFunctions.h index cb3b696076..b6c5872a72 100644 --- a/thirdparty/assimp/include/assimp/MathFunctions.h +++ b/thirdparty/assimp/include/assimp/MathFunctions.h @@ -39,22 +39,28 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. --------------------------------------------------------------------------- */ +#pragma once + +#ifdef __GNUC__ +# pragma GCC system_header +#endif + /** @file MathFunctions.h - * @brief Implementation of the math functions (gcd and lcm) +* @brief Implementation of math utility functions. * - * Copied from BoostWorkaround/math - */ +*/ + +#include <limits> namespace Assimp { namespace Math { // TODO: use binary GCD for unsigned integers .... template < typename IntegerType > -IntegerType gcd( IntegerType a, IntegerType b ) -{ +inline +IntegerType gcd( IntegerType a, IntegerType b ) { const IntegerType zero = (IntegerType)0; - while ( true ) - { + while ( true ) { if ( a == zero ) return b; b %= a; @@ -66,12 +72,19 @@ IntegerType gcd( IntegerType a, IntegerType b ) } template < typename IntegerType > -IntegerType lcm( IntegerType a, IntegerType b ) -{ +inline +IntegerType lcm( IntegerType a, IntegerType b ) { const IntegerType t = gcd (a,b); - if (!t)return t; + if (!t) + return t; return a / t * b; } +template<class T> +inline +T getEpsilon() { + return std::numeric_limits<T>::epsilon(); +} + } } diff --git a/thirdparty/assimp/include/assimp/MemoryIOWrapper.h b/thirdparty/assimp/include/assimp/MemoryIOWrapper.h index c522787184..5598d4fc5f 100644 --- a/thirdparty/assimp/include/assimp/MemoryIOWrapper.h +++ b/thirdparty/assimp/include/assimp/MemoryIOWrapper.h @@ -42,12 +42,18 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. /** @file MemoryIOWrapper.h * Handy IOStream/IOSystem implemetation to read directly from a memory buffer */ +#pragma once #ifndef AI_MEMORYIOSTREAM_H_INC #define AI_MEMORYIOSTREAM_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/IOStream.hpp> #include <assimp/IOSystem.hpp> #include <assimp/ai_assert.h> + #include <stdint.h> namespace Assimp { diff --git a/thirdparty/assimp/include/assimp/ParsingUtils.h b/thirdparty/assimp/include/assimp/ParsingUtils.h index 6b9574fc67..3025601246 100644 --- a/thirdparty/assimp/include/assimp/ParsingUtils.h +++ b/thirdparty/assimp/include/assimp/ParsingUtils.h @@ -44,11 +44,16 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. /** @file ParsingUtils.h * @brief Defines helper functions for text parsing */ +#pragma once #ifndef AI_PARSING_UTILS_H_INC #define AI_PARSING_UTILS_H_INC -#include "StringComparison.h" -#include "StringUtils.h" +#ifdef __GNUC__ +# pragma GCC system_header +#endif + +#include <assimp/StringComparison.h> +#include <assimp/StringUtils.h> #include <assimp/defs.h> namespace Assimp { diff --git a/thirdparty/assimp/include/assimp/Profiler.h b/thirdparty/assimp/include/assimp/Profiler.h index 6ff9d41c0a..624029be99 100644 --- a/thirdparty/assimp/include/assimp/Profiler.h +++ b/thirdparty/assimp/include/assimp/Profiler.h @@ -43,12 +43,17 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. /** @file Profiler.h * @brief Utility to measure the respective runtime of each import step */ -#ifndef INCLUDED_PROFILER_H -#define INCLUDED_PROFILER_H +#pragma once +#ifndef AI_INCLUDED_PROFILER_H +#define AI_INCLUDED_PROFILER_H + +#ifdef __GNUC__ +# pragma GCC system_header +#endif #include <chrono> #include <assimp/DefaultLogger.hpp> -#include "TinyFormatter.h" +#include <assimp/TinyFormatter.h> #include <map> @@ -67,7 +72,6 @@ public: // empty } -public: /** Start a named timer */ void BeginRegion(const std::string& region) { @@ -95,5 +99,5 @@ private: } } -#endif +#endif // AI_INCLUDED_PROFILER_H diff --git a/thirdparty/assimp/include/assimp/ProgressHandler.hpp b/thirdparty/assimp/include/assimp/ProgressHandler.hpp index 4e47f1d0a6..8991a64618 100644 --- a/thirdparty/assimp/include/assimp/ProgressHandler.hpp +++ b/thirdparty/assimp/include/assimp/ProgressHandler.hpp @@ -47,9 +47,13 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_PROGRESSHANDLER_H_INC #define AI_PROGRESSHANDLER_H_INC -#include "types.h" +#ifdef __GNUC__ +# pragma GCC system_header +#endif + +#include <assimp/types.h> -namespace Assimp { +namespace Assimp { // ------------------------------------------------------------------------------------ /** @brief CPP-API: Abstract interface for custom progress report receivers. diff --git a/thirdparty/assimp/include/assimp/RemoveComments.h b/thirdparty/assimp/include/assimp/RemoveComments.h index 404b496719..f129420535 100644 --- a/thirdparty/assimp/include/assimp/RemoveComments.h +++ b/thirdparty/assimp/include/assimp/RemoveComments.h @@ -4,7 +4,6 @@ Open Asset Import Library (assimp) Copyright (c) 2006-2019, assimp team - All rights reserved. Redistribution and use of this software in source and binary forms, @@ -43,9 +42,13 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. /** @file Declares a helper class, "CommentRemover", which can be * used to remove comments (single and multi line) from a text file. */ +#pragma once #ifndef AI_REMOVE_COMMENTS_H_INC #define AI_REMOVE_COMMENTS_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif #include <assimp/defs.h> @@ -58,8 +61,7 @@ namespace Assimp { * to those in C or C++ so this code has been moved to a separate * module. */ -class ASSIMP_API CommentRemover -{ +class ASSIMP_API CommentRemover { // class cannot be instanced CommentRemover() {} diff --git a/thirdparty/assimp/include/assimp/SGSpatialSort.h b/thirdparty/assimp/include/assimp/SGSpatialSort.h index 5b4f3f41f2..fdb5ce8174 100644 --- a/thirdparty/assimp/include/assimp/SGSpatialSort.h +++ b/thirdparty/assimp/include/assimp/SGSpatialSort.h @@ -42,9 +42,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. /** Small helper classes to optimize finding vertices close to a given location */ +#pragma once #ifndef AI_D3DSSPATIALSORT_H_INC #define AI_D3DSSPATIALSORT_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/types.h> #include <vector> #include <stdint.h> diff --git a/thirdparty/assimp/include/assimp/SceneCombiner.h b/thirdparty/assimp/include/assimp/SceneCombiner.h index 679a2acea4..0683c1e052 100644 --- a/thirdparty/assimp/include/assimp/SceneCombiner.h +++ b/thirdparty/assimp/include/assimp/SceneCombiner.h @@ -43,17 +43,22 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. /** @file Declares a helper class, "SceneCombiner" providing various * utilities to merge scenes. */ +#pragma once #ifndef AI_SCENE_COMBINER_H_INC #define AI_SCENE_COMBINER_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/ai_assert.h> #include <assimp/types.h> #include <assimp/Defines.h> + #include <stddef.h> #include <set> #include <list> #include <stdint.h> - #include <vector> struct aiScene; @@ -65,8 +70,10 @@ struct aiLight; struct aiMetadata; struct aiBone; struct aiMesh; +struct aiAnimMesh; struct aiAnimation; struct aiNodeAnim; +struct aiMeshMorphAnim; namespace Assimp { @@ -363,6 +370,7 @@ public: static void Copy (aiMesh** dest, const aiMesh* src); // similar to Copy(): + static void Copy (aiAnimMesh** dest, const aiAnimMesh* src); static void Copy (aiMaterial** dest, const aiMaterial* src); static void Copy (aiTexture** dest, const aiTexture* src); static void Copy (aiAnimation** dest, const aiAnimation* src); @@ -370,6 +378,7 @@ public: static void Copy (aiBone** dest, const aiBone* src); static void Copy (aiLight** dest, const aiLight* src); static void Copy (aiNodeAnim** dest, const aiNodeAnim* src); + static void Copy (aiMeshMorphAnim** dest, const aiMeshMorphAnim* src); static void Copy (aiMetadata** dest, const aiMetadata* src); // recursive, of course diff --git a/thirdparty/assimp/include/assimp/SkeletonMeshBuilder.h b/thirdparty/assimp/include/assimp/SkeletonMeshBuilder.h index f9b8d9f55c..ad979a33fa 100644 --- a/thirdparty/assimp/include/assimp/SkeletonMeshBuilder.h +++ b/thirdparty/assimp/include/assimp/SkeletonMeshBuilder.h @@ -47,9 +47,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * for animation skeletons. */ +#pragma once #ifndef AI_SKELETONMESHBUILDER_H_INC #define AI_SKELETONMESHBUILDER_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <vector> #include <assimp/mesh.h> diff --git a/thirdparty/assimp/include/assimp/SmoothingGroups.h b/thirdparty/assimp/include/assimp/SmoothingGroups.h index 92d65cea02..c1a93947f1 100644 --- a/thirdparty/assimp/include/assimp/SmoothingGroups.h +++ b/thirdparty/assimp/include/assimp/SmoothingGroups.h @@ -43,10 +43,16 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. /** @file Defines the helper data structures for importing 3DS files. http://www.jalix.org/ressources/graphics/3DS/_unofficials/3ds-unofficial.txt */ +#pragma once #ifndef AI_SMOOTHINGGROUPS_H_INC #define AI_SMOOTHINGGROUPS_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/vector3.h> + #include <stdint.h> #include <vector> diff --git a/thirdparty/assimp/include/assimp/SmoothingGroups.inl b/thirdparty/assimp/include/assimp/SmoothingGroups.inl index 84ea4a1b00..37ea083dbe 100644 --- a/thirdparty/assimp/include/assimp/SmoothingGroups.inl +++ b/thirdparty/assimp/include/assimp/SmoothingGroups.inl @@ -41,13 +41,16 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. /** @file Generation of normal vectors basing on smoothing groups */ +#pragma once #ifndef AI_SMOOTHINGGROUPS_INL_INCLUDED #define AI_SMOOTHINGGROUPS_INL_INCLUDED -// internal headers +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/SGSpatialSort.h> -// CRT header #include <algorithm> using namespace Assimp; diff --git a/thirdparty/assimp/include/assimp/SpatialSort.h b/thirdparty/assimp/include/assimp/SpatialSort.h index 61b345bcbf..9f93543150 100644 --- a/thirdparty/assimp/include/assimp/SpatialSort.h +++ b/thirdparty/assimp/include/assimp/SpatialSort.h @@ -41,9 +41,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /** Small helper classes to optimise finding vertizes close to a given location */ +#pragma once #ifndef AI_SPATIALSORT_H_INC #define AI_SPATIALSORT_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <vector> #include <assimp/types.h> diff --git a/thirdparty/assimp/include/assimp/StandardShapes.h b/thirdparty/assimp/include/assimp/StandardShapes.h index 3791569b83..c594cb63f4 100644 --- a/thirdparty/assimp/include/assimp/StandardShapes.h +++ b/thirdparty/assimp/include/assimp/StandardShapes.h @@ -41,11 +41,16 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /** @file Declares a helper class, "StandardShapes" which generates - * vertices for standard shapes, such as cylnders, cones, spheres .. + * vertices for standard shapes, such as cylinders, cones, spheres .. */ +#pragma once #ifndef AI_STANDARD_SHAPES_H_INC #define AI_STANDARD_SHAPES_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/vector3.h> #include <vector> diff --git a/thirdparty/assimp/include/assimp/StreamReader.h b/thirdparty/assimp/include/assimp/StreamReader.h index 9116c14261..cb24f1595b 100644 --- a/thirdparty/assimp/include/assimp/StreamReader.h +++ b/thirdparty/assimp/include/assimp/StreamReader.h @@ -44,15 +44,19 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. /** @file Defines the StreamReader class which reads data from * a binary stream with a well-defined endianness. */ - +#pragma once #ifndef AI_STREAMREADER_H_INCLUDED #define AI_STREAMREADER_H_INCLUDED +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/IOStream.hpp> #include <assimp/Defines.h> +#include <assimp/ByteSwapper.h> +#include <assimp/Exceptional.h> -#include "ByteSwapper.h" -#include "Exceptional.h" #include <memory> namespace Assimp { diff --git a/thirdparty/assimp/include/assimp/StreamWriter.h b/thirdparty/assimp/include/assimp/StreamWriter.h index c7cf6c0d74..489e8adfe3 100644 --- a/thirdparty/assimp/include/assimp/StreamWriter.h +++ b/thirdparty/assimp/include/assimp/StreamWriter.h @@ -43,11 +43,15 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. /** @file Defines the StreamWriter class which writes data to * a binary stream with a well-defined endianness. */ - +#pragma once #ifndef AI_STREAMWRITER_H_INCLUDED #define AI_STREAMWRITER_H_INCLUDED -#include "ByteSwapper.h" +#ifdef __GNUC__ +# pragma GCC system_header +#endif + +#include <assimp/ByteSwapper.h> #include <assimp/IOStream.hpp> #include <memory> diff --git a/thirdparty/assimp/include/assimp/StringComparison.h b/thirdparty/assimp/include/assimp/StringComparison.h index 8acef277b9..d3ca3e9714 100644 --- a/thirdparty/assimp/include/assimp/StringComparison.h +++ b/thirdparty/assimp/include/assimp/StringComparison.h @@ -49,12 +49,17 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. These functions are not consistently available on all platforms, or the provided implementations behave too differently. */ +#pragma once #ifndef INCLUDED_AI_STRING_WORKERS_H #define INCLUDED_AI_STRING_WORKERS_H +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/ai_assert.h> #include <assimp/defs.h> -#include "StringComparison.h" +#include <assimp/StringComparison.h> #include <string.h> #include <stdint.h> diff --git a/thirdparty/assimp/include/assimp/StringUtils.h b/thirdparty/assimp/include/assimp/StringUtils.h index d68b7fa479..af481f819e 100644 --- a/thirdparty/assimp/include/assimp/StringUtils.h +++ b/thirdparty/assimp/include/assimp/StringUtils.h @@ -39,9 +39,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ---------------------------------------------------------------------- */ +#pragma once #ifndef INCLUDED_AI_STRINGUTILS_H #define INCLUDED_AI_STRINGUTILS_H +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/defs.h> #include <sstream> diff --git a/thirdparty/assimp/include/assimp/Subdivision.h b/thirdparty/assimp/include/assimp/Subdivision.h index 43feb73b30..e9450267ec 100644 --- a/thirdparty/assimp/include/assimp/Subdivision.h +++ b/thirdparty/assimp/include/assimp/Subdivision.h @@ -45,7 +45,10 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_SUBDISIVION_H_INC #define AI_SUBDISIVION_H_INC -#include <cstddef> +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/types.h> struct aiMesh; diff --git a/thirdparty/assimp/include/assimp/TinyFormatter.h b/thirdparty/assimp/include/assimp/TinyFormatter.h index 1226b482e6..6227e42c52 100644 --- a/thirdparty/assimp/include/assimp/TinyFormatter.h +++ b/thirdparty/assimp/include/assimp/TinyFormatter.h @@ -45,9 +45,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * to get rid of the boost::format dependency. Much slinker, * basically just extends stringstream. */ +#pragma once #ifndef INCLUDED_TINY_FORMATTER_H #define INCLUDED_TINY_FORMATTER_H +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <sstream> namespace Assimp { @@ -65,24 +70,15 @@ namespace Formatter { * @endcode */ template < typename T, typename CharTraits = std::char_traits<T>, - typename Allocator = std::allocator<T> -> -class basic_formatter -{ - -public: - - typedef class std::basic_string< - T,CharTraits,Allocator - > string; - - typedef class std::basic_ostringstream< - T,CharTraits,Allocator - > stringstream; - + typename Allocator = std::allocator<T> > +class basic_formatter { public: + typedef class std::basic_string<T,CharTraits,Allocator> string; + typedef class std::basic_ostringstream<T,CharTraits,Allocator> stringstream; - basic_formatter() {} + basic_formatter() { + // empty + } /* Allow basic_formatter<T>'s to be used almost interchangeably * with std::(w)string or const (w)char* arguments because the @@ -104,14 +100,10 @@ public: } #endif - -public: - operator string () const { return underlying.str(); } - /* note - this is declared const because binding temporaries does only * work for const references, so many function prototypes will * include const basic_formatter<T>& s but might still want to diff --git a/thirdparty/assimp/include/assimp/Vertex.h b/thirdparty/assimp/include/assimp/Vertex.h index 2a7f0256ad..5e63db5fe4 100644 --- a/thirdparty/assimp/include/assimp/Vertex.h +++ b/thirdparty/assimp/include/assimp/Vertex.h @@ -47,12 +47,18 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. that are not currently well-defined (and would cause compile errors due to missing operators in the math library), are commented. */ +#pragma once #ifndef AI_VERTEX_H_INC #define AI_VERTEX_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/vector3.h> #include <assimp/mesh.h> #include <assimp/ai_assert.h> + #include <functional> namespace Assimp { @@ -91,23 +97,14 @@ namespace Assimp { * to *all* vertex components equally. This is useful for stuff like interpolation * or subdivision, but won't work if special handling is required for some vertex components. */ // ------------------------------------------------------------------------------------------------ -class Vertex -{ +class Vertex { friend Vertex operator + (const Vertex&,const Vertex&); friend Vertex operator - (const Vertex&,const Vertex&); - -// friend Vertex operator + (const Vertex&,ai_real); -// friend Vertex operator - (const Vertex&,ai_real); friend Vertex operator * (const Vertex&,ai_real); friend Vertex operator / (const Vertex&,ai_real); - -// friend Vertex operator + (ai_real, const Vertex&); -// friend Vertex operator - (ai_real, const Vertex&); friend Vertex operator * (ai_real, const Vertex&); -// friend Vertex operator / (ai_real, const Vertex&); public: - Vertex() {} // ---------------------------------------------------------------------------- @@ -158,8 +155,6 @@ public: } } -public: - Vertex& operator += (const Vertex& v) { *this = *this+v; return *this; @@ -170,18 +165,6 @@ public: return *this; } - -/* - Vertex& operator += (ai_real v) { - *this = *this+v; - return *this; - } - - Vertex& operator -= (ai_real v) { - *this = *this-v; - return *this; - } -*/ Vertex& operator *= (ai_real v) { *this = *this*v; return *this; @@ -192,12 +175,9 @@ public: return *this; } -public: - // ---------------------------------------------------------------------------- /** Convert back to non-interleaved storage */ void SortBack(aiMesh* out, unsigned int idx) const { - ai_assert(idx<out->mNumVertices); out->mVertices[idx] = position; @@ -291,8 +271,6 @@ public: aiColor4D colors[AI_MAX_NUMBER_OF_COLOR_SETS]; }; - - // ------------------------------------------------------------------------------------------------ AI_FORCE_INLINE Vertex operator + (const Vertex& v0,const Vertex& v1) { return Vertex::BinaryOp<std::plus>(v0,v1); @@ -302,19 +280,6 @@ AI_FORCE_INLINE Vertex operator - (const Vertex& v0,const Vertex& v1) { return Vertex::BinaryOp<std::minus>(v0,v1); } - -// ------------------------------------------------------------------------------------------------ -/* -AI_FORCE_INLINE Vertex operator + (const Vertex& v0,ai_real f) { - return Vertex::BinaryOp<Intern::plus>(v0,f); -} - -AI_FORCE_INLINE Vertex operator - (const Vertex& v0,ai_real f) { - return Vertex::BinaryOp<Intern::minus>(v0,f); -} - -*/ - AI_FORCE_INLINE Vertex operator * (const Vertex& v0,ai_real f) { return Vertex::BinaryOp<Intern::multiplies>(v0,f); } @@ -323,26 +288,10 @@ AI_FORCE_INLINE Vertex operator / (const Vertex& v0,ai_real f) { return Vertex::BinaryOp<Intern::multiplies>(v0,1.f/f); } -// ------------------------------------------------------------------------------------------------ -/* -AI_FORCE_INLINE Vertex operator + (ai_real f,const Vertex& v0) { - return Vertex::BinaryOp<Intern::plus>(f,v0); -} - -AI_FORCE_INLINE Vertex operator - (ai_real f,const Vertex& v0) { - return Vertex::BinaryOp<Intern::minus>(f,v0); -} -*/ - AI_FORCE_INLINE Vertex operator * (ai_real f,const Vertex& v0) { return Vertex::BinaryOp<Intern::multiplies>(f,v0); } -/* -AI_FORCE_INLINE Vertex operator / (ai_real f,const Vertex& v0) { - return Vertex::BinaryOp<Intern::divides>(f,v0); } -*/ -} -#endif +#endif // AI_VERTEX_H_INC diff --git a/thirdparty/assimp/include/assimp/XMLTools.h b/thirdparty/assimp/include/assimp/XMLTools.h index b0d3276873..95f12cdebf 100644 --- a/thirdparty/assimp/include/assimp/XMLTools.h +++ b/thirdparty/assimp/include/assimp/XMLTools.h @@ -40,9 +40,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ---------------------------------------------------------------------- */ +#pragma once #ifndef INCLUDED_ASSIMP_XML_TOOLS_H #define INCLUDED_ASSIMP_XML_TOOLS_H +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <string> namespace Assimp { diff --git a/thirdparty/assimp/include/assimp/aabb.h b/thirdparty/assimp/include/assimp/aabb.h index a20f317424..83bb62256b 100644 --- a/thirdparty/assimp/include/assimp/aabb.h +++ b/thirdparty/assimp/include/assimp/aabb.h @@ -5,8 +5,6 @@ Open Asset Import Library (assimp) Copyright (c) 2006-2019, assimp team - - All rights reserved. Redistribution and use of this software in source and binary forms, @@ -45,6 +43,10 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_AABB_H_INC #define AI_AABB_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/vector3.h> struct aiAABB { @@ -69,8 +71,9 @@ struct aiAABB { // empty } -#endif +#endif // __cplusplus + }; -#endif +#endif // AI_AABB_H_INC diff --git a/thirdparty/assimp/include/assimp/ai_assert.h b/thirdparty/assimp/include/assimp/ai_assert.h index e5de5d3f36..2b32b01d3e 100644 --- a/thirdparty/assimp/include/assimp/ai_assert.h +++ b/thirdparty/assimp/include/assimp/ai_assert.h @@ -44,6 +44,10 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_ASSERT_H_INC #define AI_ASSERT_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #ifdef ASSIMP_BUILD_DEBUG # include <assert.h> # define ai_assert(expression) assert( expression ) diff --git a/thirdparty/assimp/include/assimp/anim.h b/thirdparty/assimp/include/assimp/anim.h index 02e92739ec..e208b11adb 100644 --- a/thirdparty/assimp/include/assimp/anim.h +++ b/thirdparty/assimp/include/assimp/anim.h @@ -50,6 +50,10 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_ANIM_H_INC #define AI_ANIM_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/types.h> #include <assimp/quaternion.h> diff --git a/thirdparty/assimp/include/assimp/camera.h b/thirdparty/assimp/include/assimp/camera.h index e573eea5d1..adb749ff59 100644 --- a/thirdparty/assimp/include/assimp/camera.h +++ b/thirdparty/assimp/include/assimp/camera.h @@ -47,6 +47,10 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_CAMERA_H_INC #define AI_CAMERA_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include "types.h" #ifdef __cplusplus @@ -113,7 +117,6 @@ struct aiCamera */ C_STRUCT aiVector3D mPosition; - /** 'Up' - vector of the camera coordinate system relative to * the coordinate space defined by the corresponding node. * @@ -134,7 +137,6 @@ struct aiCamera */ C_STRUCT aiVector3D mLookAt; - /** Half horizontal field of view angle, in radians. * * The field of view angle is the angle between the center diff --git a/thirdparty/assimp/include/assimp/cexport.h b/thirdparty/assimp/include/assimp/cexport.h index 1d62dc26b3..cbc0253d50 100644 --- a/thirdparty/assimp/include/assimp/cexport.h +++ b/thirdparty/assimp/include/assimp/cexport.h @@ -3,7 +3,7 @@ Open Asset Import Library (assimp) --------------------------------------------------------------------------- -Copyright (c) 2006-2011, assimp team +Copyright (c) 2006-2019, assimp team All rights reserved. @@ -46,6 +46,10 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_EXPORT_H_INC #define AI_EXPORT_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #ifndef ASSIMP_BUILD_NO_EXPORT // Public ASSIMP data structures diff --git a/thirdparty/assimp/include/assimp/cfileio.h b/thirdparty/assimp/include/assimp/cfileio.h index 8f7ca45469..be90999d87 100644 --- a/thirdparty/assimp/include/assimp/cfileio.h +++ b/thirdparty/assimp/include/assimp/cfileio.h @@ -48,10 +48,16 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_FILEIO_H_INC #define AI_FILEIO_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/types.h> + #ifdef __cplusplus extern "C" { #endif + struct aiFileIO; struct aiFile; diff --git a/thirdparty/assimp/include/assimp/cimport.h b/thirdparty/assimp/include/assimp/cimport.h index dbd10f1370..66b1c9a174 100644 --- a/thirdparty/assimp/include/assimp/cimport.h +++ b/thirdparty/assimp/include/assimp/cimport.h @@ -48,8 +48,12 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_ASSIMP_H_INC #define AI_ASSIMP_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/types.h> -#include "importerdesc.h" +#include <assimp/importerdesc.h> #ifdef __cplusplus extern "C" { diff --git a/thirdparty/assimp/include/assimp/color4.h b/thirdparty/assimp/include/assimp/color4.h index 3c97c8eda2..fa86128f4f 100644 --- a/thirdparty/assimp/include/assimp/color4.h +++ b/thirdparty/assimp/include/assimp/color4.h @@ -47,7 +47,11 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_COLOR4D_H_INC #define AI_COLOR4D_H_INC -#include "defs.h" +#ifdef __GNUC__ +# pragma GCC system_header +#endif + +#include <assimp/defs.h> #ifdef __cplusplus @@ -56,8 +60,7 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * alpha component. Color values range from 0 to 1. */ // ---------------------------------------------------------------------------------- template <typename TReal> -class aiColor4t -{ +class aiColor4t { public: aiColor4t() AI_NO_EXCEPT : r(), g(), b(), a() {} aiColor4t (TReal _r, TReal _g, TReal _b, TReal _a) @@ -65,14 +68,12 @@ public: explicit aiColor4t (TReal _r) : r(_r), g(_r), b(_r), a(_r) {} aiColor4t (const aiColor4t& o) = default; -public: // combined operators const aiColor4t& operator += (const aiColor4t& o); const aiColor4t& operator -= (const aiColor4t& o); const aiColor4t& operator *= (TReal f); const aiColor4t& operator /= (TReal f); -public: // comparison bool operator == (const aiColor4t& other) const; bool operator != (const aiColor4t& other) const; @@ -85,8 +86,6 @@ public: /** check whether a color is (close to) black */ inline bool IsBlack() const; -public: - // Red, green, blue and alpha color values TReal r, g, b, a; }; // !struct aiColor4D diff --git a/thirdparty/assimp/include/assimp/color4.inl b/thirdparty/assimp/include/assimp/color4.inl index afa53dcb5b..d4a2a98109 100644 --- a/thirdparty/assimp/include/assimp/color4.inl +++ b/thirdparty/assimp/include/assimp/color4.inl @@ -48,36 +48,61 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_COLOR4D_INL_INC #define AI_COLOR4D_INL_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #ifdef __cplusplus -#include "color4.h" +#include <assimp/color4.h> // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE const aiColor4t<TReal>& aiColor4t<TReal>::operator += (const aiColor4t<TReal>& o) { - r += o.r; g += o.g; b += o.b; a += o.a; +AI_FORCE_INLINE +const aiColor4t<TReal>& aiColor4t<TReal>::operator += (const aiColor4t<TReal>& o) { + r += o.r; + g += o.g; + b += o.b; + a += o.a; + return *this; } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE const aiColor4t<TReal>& aiColor4t<TReal>::operator -= (const aiColor4t<TReal>& o) { - r -= o.r; g -= o.g; b -= o.b; a -= o.a; +AI_FORCE_INLINE +const aiColor4t<TReal>& aiColor4t<TReal>::operator -= (const aiColor4t<TReal>& o) { + r -= o.r; + g -= o.g; + b -= o.b; + a -= o.a; + return *this; } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE const aiColor4t<TReal>& aiColor4t<TReal>::operator *= (TReal f) { - r *= f; g *= f; b *= f; a *= f; +AI_FORCE_INLINE +const aiColor4t<TReal>& aiColor4t<TReal>::operator *= (TReal f) { + r *= f; + g *= f; + b *= f; + a *= f; + return *this; } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE const aiColor4t<TReal>& aiColor4t<TReal>::operator /= (TReal f) { - r /= f; g /= f; b /= f; a /= f; +AI_FORCE_INLINE +const aiColor4t<TReal>& aiColor4t<TReal>::operator /= (TReal f) { + r /= f; + g /= f; + b /= f; + a /= f; + return *this; } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE TReal aiColor4t<TReal>::operator[](unsigned int i) const { +AI_FORCE_INLINE +TReal aiColor4t<TReal>::operator[](unsigned int i) const { switch ( i ) { case 0: return r; @@ -94,7 +119,8 @@ AI_FORCE_INLINE TReal aiColor4t<TReal>::operator[](unsigned int i) const { } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE TReal& aiColor4t<TReal>::operator[](unsigned int i) { +AI_FORCE_INLINE +TReal& aiColor4t<TReal>::operator[](unsigned int i) { switch ( i ) { case 0: return r; @@ -111,17 +137,20 @@ AI_FORCE_INLINE TReal& aiColor4t<TReal>::operator[](unsigned int i) { } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE bool aiColor4t<TReal>::operator== (const aiColor4t<TReal>& other) const { +AI_FORCE_INLINE +bool aiColor4t<TReal>::operator== (const aiColor4t<TReal>& other) const { return r == other.r && g == other.g && b == other.b && a == other.a; } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE bool aiColor4t<TReal>::operator!= (const aiColor4t<TReal>& other) const { +AI_FORCE_INLINE +bool aiColor4t<TReal>::operator!= (const aiColor4t<TReal>& other) const { return r != other.r || g != other.g || b != other.b || a != other.a; } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE bool aiColor4t<TReal>::operator< (const aiColor4t<TReal>& other) const { +AI_FORCE_INLINE +bool aiColor4t<TReal>::operator< (const aiColor4t<TReal>& other) const { return r < other.r || ( r == other.r && ( g < other.g || ( @@ -136,14 +165,17 @@ AI_FORCE_INLINE bool aiColor4t<TReal>::operator< (const aiColor4t<TReal>& other) ) ); } + // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE aiColor4t<TReal> operator + (const aiColor4t<TReal>& v1, const aiColor4t<TReal>& v2) { +AI_FORCE_INLINE +aiColor4t<TReal> operator + (const aiColor4t<TReal>& v1, const aiColor4t<TReal>& v2) { return aiColor4t<TReal>( v1.r + v2.r, v1.g + v2.g, v1.b + v2.b, v1.a + v2.a); } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE aiColor4t<TReal> operator - (const aiColor4t<TReal>& v1, const aiColor4t<TReal>& v2) { +AI_FORCE_INLINE +aiColor4t<TReal> operator - (const aiColor4t<TReal>& v1, const aiColor4t<TReal>& v2) { return aiColor4t<TReal>( v1.r - v2.r, v1.g - v2.g, v1.b - v2.b, v1.a - v2.a); } // ------------------------------------------------------------------------------------------------ @@ -153,53 +185,63 @@ AI_FORCE_INLINE aiColor4t<TReal> operator * (const aiColor4t<TReal>& v1, const a } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE aiColor4t<TReal> operator / (const aiColor4t<TReal>& v1, const aiColor4t<TReal>& v2) { +AI_FORCE_INLINE +aiColor4t<TReal> operator / (const aiColor4t<TReal>& v1, const aiColor4t<TReal>& v2) { return aiColor4t<TReal>( v1.r / v2.r, v1.g / v2.g, v1.b / v2.b, v1.a / v2.a); } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE aiColor4t<TReal> operator * ( TReal f, const aiColor4t<TReal>& v) { +AI_FORCE_INLINE +aiColor4t<TReal> operator * ( TReal f, const aiColor4t<TReal>& v) { return aiColor4t<TReal>( f*v.r, f*v.g, f*v.b, f*v.a); } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE aiColor4t<TReal> operator * ( const aiColor4t<TReal>& v, TReal f) { +AI_FORCE_INLINE +aiColor4t<TReal> operator * ( const aiColor4t<TReal>& v, TReal f) { return aiColor4t<TReal>( f*v.r, f*v.g, f*v.b, f*v.a); } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE aiColor4t<TReal> operator / ( const aiColor4t<TReal>& v, TReal f) { +AI_FORCE_INLINE +aiColor4t<TReal> operator / ( const aiColor4t<TReal>& v, TReal f) { return v * (1/f); } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE aiColor4t<TReal> operator / ( TReal f,const aiColor4t<TReal>& v) { +AI_FORCE_INLINE +aiColor4t<TReal> operator / ( TReal f,const aiColor4t<TReal>& v) { return aiColor4t<TReal>(f,f,f,f)/v; } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE aiColor4t<TReal> operator + ( const aiColor4t<TReal>& v, TReal f) { +AI_FORCE_INLINE +aiColor4t<TReal> operator + ( const aiColor4t<TReal>& v, TReal f) { return aiColor4t<TReal>( f+v.r, f+v.g, f+v.b, f+v.a); } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE aiColor4t<TReal> operator - ( const aiColor4t<TReal>& v, TReal f) { +AI_FORCE_INLINE +aiColor4t<TReal> operator - ( const aiColor4t<TReal>& v, TReal f) { return aiColor4t<TReal>( v.r-f, v.g-f, v.b-f, v.a-f); } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE aiColor4t<TReal> operator + ( TReal f, const aiColor4t<TReal>& v) { +AI_FORCE_INLINE +aiColor4t<TReal> operator + ( TReal f, const aiColor4t<TReal>& v) { return aiColor4t<TReal>( f+v.r, f+v.g, f+v.b, f+v.a); } // ------------------------------------------------------------------------------------------------ template <typename TReal> -AI_FORCE_INLINE aiColor4t<TReal> operator - ( TReal f, const aiColor4t<TReal>& v) { +AI_FORCE_INLINE +aiColor4t<TReal> operator - ( TReal f, const aiColor4t<TReal>& v) { return aiColor4t<TReal>( f-v.r, f-v.g, f-v.b, f-v.a); } // ------------------------------------------------------------------------------------------------ template <typename TReal> -inline bool aiColor4t<TReal> :: IsBlack() const { +AI_FORCE_INLINE +bool aiColor4t<TReal>::IsBlack() const { // The alpha component doesn't care here. black is black. static const TReal epsilon = 10e-3f; return std::fabs( r ) < epsilon && std::fabs( g ) < epsilon && std::fabs( b ) < epsilon; diff --git a/thirdparty/assimp/include/assimp/defs.h b/thirdparty/assimp/include/assimp/defs.h index 05a5e3fd4b..6f2f8ae88b 100644 --- a/thirdparty/assimp/include/assimp/defs.h +++ b/thirdparty/assimp/include/assimp/defs.h @@ -5,8 +5,6 @@ Open Asset Import Library (assimp) Copyright (c) 2006-2019, assimp team - - All rights reserved. Redistribution and use of this software in source and binary forms, @@ -50,6 +48,10 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_DEFINES_H_INC #define AI_DEFINES_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/config.h> ////////////////////////////////////////////////////////////////////////// @@ -126,16 +128,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * GENBOUNDINGBOXES */ ////////////////////////////////////////////////////////////////////////// -#ifdef _MSC_VER +#ifdef _WIN32 # undef ASSIMP_API - ////////////////////////////////////////////////////////////////////////// /* Define 'ASSIMP_BUILD_DLL_EXPORT' to build a DLL of the library */ ////////////////////////////////////////////////////////////////////////// # ifdef ASSIMP_BUILD_DLL_EXPORT # define ASSIMP_API __declspec(dllexport) # define ASSIMP_API_WINONLY __declspec(dllexport) -# pragma warning (disable : 4251) ////////////////////////////////////////////////////////////////////////// /* Define 'ASSIMP_DLL' before including Assimp to link to ASSIMP in @@ -148,7 +148,19 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # define ASSIMP_API # define ASSIMP_API_WINONLY # endif +#elif defined(SWIG) + + /* Do nothing, the relevant defines are all in AssimpSwigPort.i */ +#else +# define ASSIMP_API __attribute__ ((visibility("default"))) +# define ASSIMP_API_WINONLY +#endif + +#ifdef _MSC_VER +# ifdef ASSIMP_BUILD_DLL_EXPORT +# pragma warning (disable : 4251) +# endif /* Force the compiler to inline a function, if possible */ # define AI_FORCE_INLINE __forceinline @@ -156,17 +168,12 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. /* Tells the compiler that a function never returns. Used in code analysis * to skip dead paths (e.g. after an assertion evaluated to false). */ # define AI_WONT_RETURN __declspec(noreturn) - #elif defined(SWIG) /* Do nothing, the relevant defines are all in AssimpSwigPort.i */ #else - # define AI_WONT_RETURN - -# define ASSIMP_API __attribute__ ((visibility("default"))) -# define ASSIMP_API_WINONLY # define AI_FORCE_INLINE inline #endif // (defined _MSC_VER) @@ -291,9 +298,10 @@ static const ai_real ai_epsilon = (ai_real) 0.00001; #endif -/* To avoid running out of memory - * This can be adjusted for specific use cases - * It's NOT a total limit, just a limit for individual allocations +/** + * To avoid running out of memory + * This can be adjusted for specific use cases + * It's NOT a total limit, just a limit for individual allocations */ #define AI_MAX_ALLOC(type) ((256U * 1024 * 1024) / sizeof(type)) @@ -307,4 +315,13 @@ static const ai_real ai_epsilon = (ai_real) 0.00001; # endif #endif // _MSC_VER +/** + * Helper macro to set a pointer to NULL in debug builds + */ +#if (defined ASSIMP_BUILD_DEBUG) +# define AI_DEBUG_INVALIDATE_PTR(x) x = NULL; +#else +# define AI_DEBUG_INVALIDATE_PTR(x) +#endif + #endif // !! AI_DEFINES_H_INC diff --git a/thirdparty/assimp/include/assimp/fast_atof.h b/thirdparty/assimp/include/assimp/fast_atof.h index 62ea969e97..6e9a1bba7a 100644 --- a/thirdparty/assimp/include/assimp/fast_atof.h +++ b/thirdparty/assimp/include/assimp/fast_atof.h @@ -13,10 +13,14 @@ // to ensure long numbers are handled correctly // ------------------------------------------------------------------------------------ - +#pragma once #ifndef FAST_A_TO_F_H_INCLUDED #define FAST_A_TO_F_H_INCLUDED +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <cmath> #include <limits> #include <stdint.h> diff --git a/thirdparty/assimp/include/assimp/importerdesc.h b/thirdparty/assimp/include/assimp/importerdesc.h index 36e387f011..0a6919c1ae 100644 --- a/thirdparty/assimp/include/assimp/importerdesc.h +++ b/thirdparty/assimp/include/assimp/importerdesc.h @@ -48,11 +48,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_IMPORTER_DESC_H_INC #define AI_IMPORTER_DESC_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + /** Mixed set of flags for #aiImporterDesc, indicating some features * common to many importers*/ -enum aiImporterFlags -{ +enum aiImporterFlags { /** Indicates that there is a textual encoding of the * file format; and that it is supported.*/ aiImporterFlags_SupportTextFlavour = 0x1, @@ -87,8 +90,7 @@ enum aiImporterFlags * as importers/exporters are added to Assimp, so it might be useful * to have a common mechanism to query some rough importer * characteristics. */ -struct aiImporterDesc -{ +struct aiImporterDesc { /** Full name of the importer (i.e. Blender3D importer)*/ const char* mName; diff --git a/thirdparty/assimp/include/assimp/light.h b/thirdparty/assimp/include/assimp/light.h index 1667cfb8c1..bdb2368c4f 100644 --- a/thirdparty/assimp/include/assimp/light.h +++ b/thirdparty/assimp/include/assimp/light.h @@ -49,7 +49,11 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_LIGHT_H_INC #define AI_LIGHT_H_INC -#include "types.h" +#ifdef __GNUC__ +# pragma GCC system_header +#endif + +#include <assimp/types.h> #ifdef __cplusplus extern "C" { diff --git a/thirdparty/assimp/include/assimp/material.h b/thirdparty/assimp/include/assimp/material.h index 206eb2a2b0..19a7c69709 100644 --- a/thirdparty/assimp/include/assimp/material.h +++ b/thirdparty/assimp/include/assimp/material.h @@ -48,7 +48,11 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_MATERIAL_H_INC #define AI_MATERIAL_H_INC -#include "types.h" +#ifdef __GNUC__ +# pragma GCC system_header +#endif + +#include <assimp/types.h> #ifdef __cplusplus extern "C" { diff --git a/thirdparty/assimp/include/assimp/material.inl b/thirdparty/assimp/include/assimp/material.inl index b05d6af6c3..8ae6b88d3e 100644 --- a/thirdparty/assimp/include/assimp/material.inl +++ b/thirdparty/assimp/include/assimp/material.inl @@ -49,14 +49,18 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_MATERIAL_INL_INC #define AI_MATERIAL_INL_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + // --------------------------------------------------------------------------- -inline aiPropertyTypeInfo ai_real_to_property_type_info(float) -{ +AI_FORCE_INLINE +aiPropertyTypeInfo ai_real_to_property_type_info(float) { return aiPTI_Float; } -inline aiPropertyTypeInfo ai_real_to_property_type_info(double) -{ +AI_FORCE_INLINE +aiPropertyTypeInfo ai_real_to_property_type_info(double) { return aiPTI_Double; } // --------------------------------------------------------------------------- @@ -64,30 +68,30 @@ inline aiPropertyTypeInfo ai_real_to_property_type_info(double) //! @cond never // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::GetTexture( aiTextureType type, - unsigned int index, - C_STRUCT aiString* path, - aiTextureMapping* mapping /*= NULL*/, - unsigned int* uvindex /*= NULL*/, - ai_real* blend /*= NULL*/, - aiTextureOp* op /*= NULL*/, - aiTextureMapMode* mapmode /*= NULL*/) const -{ +AI_FORCE_INLINE +aiReturn aiMaterial::GetTexture( aiTextureType type, + unsigned int index, + C_STRUCT aiString* path, + aiTextureMapping* mapping /*= NULL*/, + unsigned int* uvindex /*= NULL*/, + ai_real* blend /*= NULL*/, + aiTextureOp* op /*= NULL*/, + aiTextureMapMode* mapmode /*= NULL*/) const { return ::aiGetMaterialTexture(this,type,index,path,mapping,uvindex,blend,op,mapmode); } // --------------------------------------------------------------------------- -inline unsigned int aiMaterial::GetTextureCount(aiTextureType type) const -{ +AI_FORCE_INLINE +unsigned int aiMaterial::GetTextureCount(aiTextureType type) const { return ::aiGetMaterialTextureCount(this,type); } // --------------------------------------------------------------------------- template <typename Type> -inline aiReturn aiMaterial::Get(const char* pKey,unsigned int type, - unsigned int idx, Type* pOut, - unsigned int* pMax) const -{ +AI_FORCE_INLINE +aiReturn aiMaterial::Get(const char* pKey,unsigned int type, + unsigned int idx, Type* pOut, + unsigned int* pMax) const { unsigned int iNum = pMax ? *pMax : 1; const aiMaterialProperty* prop; @@ -114,9 +118,9 @@ inline aiReturn aiMaterial::Get(const char* pKey,unsigned int type, // --------------------------------------------------------------------------- template <typename Type> -inline aiReturn aiMaterial::Get(const char* pKey,unsigned int type, - unsigned int idx,Type& pOut) const -{ +AI_FORCE_INLINE +aiReturn aiMaterial::Get(const char* pKey,unsigned int type, + unsigned int idx,Type& pOut) const { const aiMaterialProperty* prop; const aiReturn ret = ::aiGetMaterialProperty(this,pKey,type,idx, (const aiMaterialProperty**)&prop); @@ -136,60 +140,56 @@ inline aiReturn aiMaterial::Get(const char* pKey,unsigned int type, } // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::Get(const char* pKey,unsigned int type, - unsigned int idx,ai_real* pOut, - unsigned int* pMax) const -{ +AI_FORCE_INLINE +aiReturn aiMaterial::Get(const char* pKey,unsigned int type, + unsigned int idx,ai_real* pOut, + unsigned int* pMax) const { return ::aiGetMaterialFloatArray(this,pKey,type,idx,pOut,pMax); } // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::Get(const char* pKey,unsigned int type, - unsigned int idx,int* pOut, - unsigned int* pMax) const -{ +AI_FORCE_INLINE +aiReturn aiMaterial::Get(const char* pKey,unsigned int type, + unsigned int idx,int* pOut, + unsigned int* pMax) const { return ::aiGetMaterialIntegerArray(this,pKey,type,idx,pOut,pMax); } // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::Get(const char* pKey,unsigned int type, - unsigned int idx,ai_real& pOut) const -{ +AI_FORCE_INLINE +aiReturn aiMaterial::Get(const char* pKey,unsigned int type, + unsigned int idx,ai_real& pOut) const { return aiGetMaterialFloat(this,pKey,type,idx,&pOut); } // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::Get(const char* pKey,unsigned int type, - unsigned int idx,int& pOut) const -{ +AI_FORCE_INLINE +aiReturn aiMaterial::Get(const char* pKey,unsigned int type, + unsigned int idx,int& pOut) const { return aiGetMaterialInteger(this,pKey,type,idx,&pOut); } // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::Get(const char* pKey,unsigned int type, - unsigned int idx,aiColor4D& pOut) const -{ +AI_FORCE_INLINE +aiReturn aiMaterial::Get(const char* pKey,unsigned int type, + unsigned int idx,aiColor4D& pOut) const { return aiGetMaterialColor(this,pKey,type,idx,&pOut); } // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::Get(const char* pKey,unsigned int type, - unsigned int idx,aiColor3D& pOut) const -{ +AI_FORCE_INLINE aiReturn aiMaterial::Get(const char* pKey,unsigned int type, + unsigned int idx,aiColor3D& pOut) const { aiColor4D c; const aiReturn ret = aiGetMaterialColor(this,pKey,type,idx,&c); pOut = aiColor3D(c.r,c.g,c.b); return ret; } // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::Get(const char* pKey,unsigned int type, - unsigned int idx,aiString& pOut) const -{ +AI_FORCE_INLINE aiReturn aiMaterial::Get(const char* pKey,unsigned int type, + unsigned int idx,aiString& pOut) const { return aiGetMaterialString(this,pKey,type,idx,&pOut); } // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::Get(const char* pKey,unsigned int type, - unsigned int idx,aiUVTransform& pOut) const -{ +AI_FORCE_INLINE aiReturn aiMaterial::Get(const char* pKey,unsigned int type, + unsigned int idx,aiUVTransform& pOut) const { return aiGetMaterialUVTransform(this,pKey,type,idx,&pOut); } - // --------------------------------------------------------------------------- template<class TYPE> aiReturn aiMaterial::AddProperty (const TYPE* pInput, @@ -204,84 +204,83 @@ aiReturn aiMaterial::AddProperty (const TYPE* pInput, } // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::AddProperty(const float* pInput, - const unsigned int pNumValues, - const char* pKey, - unsigned int type, - unsigned int index) -{ +AI_FORCE_INLINE aiReturn aiMaterial::AddProperty(const float* pInput, + const unsigned int pNumValues, + const char* pKey, + unsigned int type, + unsigned int index) { return AddBinaryProperty((const void*)pInput, pNumValues * sizeof(float), pKey,type,index,aiPTI_Float); } // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::AddProperty(const double* pInput, - const unsigned int pNumValues, - const char* pKey, - unsigned int type, - unsigned int index) -{ +AI_FORCE_INLINE +aiReturn aiMaterial::AddProperty(const double* pInput, + const unsigned int pNumValues, + const char* pKey, + unsigned int type, + unsigned int index) { return AddBinaryProperty((const void*)pInput, pNumValues * sizeof(double), pKey,type,index,aiPTI_Double); } // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::AddProperty(const aiUVTransform* pInput, - const unsigned int pNumValues, - const char* pKey, - unsigned int type, - unsigned int index) -{ +AI_FORCE_INLINE +aiReturn aiMaterial::AddProperty(const aiUVTransform* pInput, + const unsigned int pNumValues, + const char* pKey, + unsigned int type, + unsigned int index) { return AddBinaryProperty((const void*)pInput, pNumValues * sizeof(aiUVTransform), pKey,type,index,ai_real_to_property_type_info(pInput->mRotation)); } // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::AddProperty(const aiColor4D* pInput, - const unsigned int pNumValues, - const char* pKey, - unsigned int type, - unsigned int index) -{ +AI_FORCE_INLINE +aiReturn aiMaterial::AddProperty(const aiColor4D* pInput, + const unsigned int pNumValues, + const char* pKey, + unsigned int type, + unsigned int index) { return AddBinaryProperty((const void*)pInput, pNumValues * sizeof(aiColor4D), pKey,type,index,ai_real_to_property_type_info(pInput->a)); } // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::AddProperty(const aiColor3D* pInput, - const unsigned int pNumValues, - const char* pKey, - unsigned int type, - unsigned int index) -{ +AI_FORCE_INLINE +aiReturn aiMaterial::AddProperty(const aiColor3D* pInput, + const unsigned int pNumValues, + const char* pKey, + unsigned int type, + unsigned int index) { return AddBinaryProperty((const void*)pInput, pNumValues * sizeof(aiColor3D), pKey,type,index,ai_real_to_property_type_info(pInput->b)); } // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::AddProperty(const aiVector3D* pInput, - const unsigned int pNumValues, - const char* pKey, - unsigned int type, - unsigned int index) -{ +AI_FORCE_INLINE +aiReturn aiMaterial::AddProperty(const aiVector3D* pInput, + const unsigned int pNumValues, + const char* pKey, + unsigned int type, + unsigned int index) { return AddBinaryProperty((const void*)pInput, pNumValues * sizeof(aiVector3D), pKey,type,index,ai_real_to_property_type_info(pInput->x)); } // --------------------------------------------------------------------------- -inline aiReturn aiMaterial::AddProperty(const int* pInput, - const unsigned int pNumValues, - const char* pKey, - unsigned int type, - unsigned int index) -{ +AI_FORCE_INLINE +aiReturn aiMaterial::AddProperty(const int* pInput, + const unsigned int pNumValues, + const char* pKey, + unsigned int type, + unsigned int index) { return AddBinaryProperty((const void*)pInput, pNumValues * sizeof(int), pKey,type,index,aiPTI_Integer); @@ -296,12 +295,12 @@ inline aiReturn aiMaterial::AddProperty(const int* pInput, // --------------------------------------------------------------------------- template<> -inline aiReturn aiMaterial::AddProperty<float>(const float* pInput, - const unsigned int pNumValues, - const char* pKey, - unsigned int type, - unsigned int index) -{ +AI_FORCE_INLINE +aiReturn aiMaterial::AddProperty<float>(const float* pInput, + const unsigned int pNumValues, + const char* pKey, + unsigned int type, + unsigned int index) { return AddBinaryProperty((const void*)pInput, pNumValues * sizeof(float), pKey,type,index,aiPTI_Float); @@ -309,12 +308,12 @@ inline aiReturn aiMaterial::AddProperty<float>(const float* pInput, // --------------------------------------------------------------------------- template<> -inline aiReturn aiMaterial::AddProperty<double>(const double* pInput, - const unsigned int pNumValues, - const char* pKey, - unsigned int type, - unsigned int index) -{ +AI_FORCE_INLINE +aiReturn aiMaterial::AddProperty<double>(const double* pInput, + const unsigned int pNumValues, + const char* pKey, + unsigned int type, + unsigned int index) { return AddBinaryProperty((const void*)pInput, pNumValues * sizeof(double), pKey,type,index,aiPTI_Double); @@ -322,12 +321,12 @@ inline aiReturn aiMaterial::AddProperty<double>(const double* pInput, // --------------------------------------------------------------------------- template<> -inline aiReturn aiMaterial::AddProperty<aiUVTransform>(const aiUVTransform* pInput, - const unsigned int pNumValues, - const char* pKey, - unsigned int type, - unsigned int index) -{ +AI_FORCE_INLINE +aiReturn aiMaterial::AddProperty<aiUVTransform>(const aiUVTransform* pInput, + const unsigned int pNumValues, + const char* pKey, + unsigned int type, + unsigned int index) { return AddBinaryProperty((const void*)pInput, pNumValues * sizeof(aiUVTransform), pKey,type,index,aiPTI_Float); @@ -335,12 +334,12 @@ inline aiReturn aiMaterial::AddProperty<aiUVTransform>(const aiUVTransform* pInp // --------------------------------------------------------------------------- template<> -inline aiReturn aiMaterial::AddProperty<aiColor4D>(const aiColor4D* pInput, - const unsigned int pNumValues, - const char* pKey, - unsigned int type, - unsigned int index) -{ +AI_FORCE_INLINE +aiReturn aiMaterial::AddProperty<aiColor4D>(const aiColor4D* pInput, + const unsigned int pNumValues, + const char* pKey, + unsigned int type, + unsigned int index) { return AddBinaryProperty((const void*)pInput, pNumValues * sizeof(aiColor4D), pKey,type,index,aiPTI_Float); @@ -348,12 +347,12 @@ inline aiReturn aiMaterial::AddProperty<aiColor4D>(const aiColor4D* pInput, // --------------------------------------------------------------------------- template<> -inline aiReturn aiMaterial::AddProperty<aiColor3D>(const aiColor3D* pInput, - const unsigned int pNumValues, - const char* pKey, - unsigned int type, - unsigned int index) -{ +AI_FORCE_INLINE +aiReturn aiMaterial::AddProperty<aiColor3D>(const aiColor3D* pInput, + const unsigned int pNumValues, + const char* pKey, + unsigned int type, + unsigned int index) { return AddBinaryProperty((const void*)pInput, pNumValues * sizeof(aiColor3D), pKey,type,index,aiPTI_Float); @@ -361,12 +360,12 @@ inline aiReturn aiMaterial::AddProperty<aiColor3D>(const aiColor3D* pInput, // --------------------------------------------------------------------------- template<> -inline aiReturn aiMaterial::AddProperty<aiVector3D>(const aiVector3D* pInput, - const unsigned int pNumValues, - const char* pKey, - unsigned int type, - unsigned int index) -{ +AI_FORCE_INLINE +aiReturn aiMaterial::AddProperty<aiVector3D>(const aiVector3D* pInput, + const unsigned int pNumValues, + const char* pKey, + unsigned int type, + unsigned int index) { return AddBinaryProperty((const void*)pInput, pNumValues * sizeof(aiVector3D), pKey,type,index,aiPTI_Float); @@ -374,12 +373,12 @@ inline aiReturn aiMaterial::AddProperty<aiVector3D>(const aiVector3D* pInput, // --------------------------------------------------------------------------- template<> -inline aiReturn aiMaterial::AddProperty<int>(const int* pInput, - const unsigned int pNumValues, - const char* pKey, - unsigned int type, - unsigned int index) -{ +AI_FORCE_INLINE +aiReturn aiMaterial::AddProperty<int>(const int* pInput, + const unsigned int pNumValues, + const char* pKey, + unsigned int type, + unsigned int index) { return AddBinaryProperty((const void*)pInput, pNumValues * sizeof(int), pKey,type,index,aiPTI_Integer); diff --git a/thirdparty/assimp/include/assimp/matrix3x3.h b/thirdparty/assimp/include/assimp/matrix3x3.h index 22b69561ff..2c26cf92bb 100644 --- a/thirdparty/assimp/include/assimp/matrix3x3.h +++ b/thirdparty/assimp/include/assimp/matrix3x3.h @@ -48,7 +48,11 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_MATRIX3X3_H_INC #define AI_MATRIX3X3_H_INC -#include "defs.h" +#ifdef __GNUC__ +# pragma GCC system_header +#endif + +#include <assimp/defs.h> #ifdef __cplusplus @@ -65,10 +69,8 @@ template <typename T> class aiVector2t; * defined thereby. */ template <typename TReal> -class aiMatrix3x3t -{ +class aiMatrix3x3t { public: - aiMatrix3x3t() AI_NO_EXCEPT : a1(static_cast<TReal>(1.0f)), a2(), a3(), b1(), b2(static_cast<TReal>(1.0f)), b3(), @@ -82,8 +84,6 @@ public: c1(_c1), c2(_c2), c3(_c3) {} -public: - // matrix multiplication. aiMatrix3x3t& operator *= (const aiMatrix3x3t& m); aiMatrix3x3t operator * (const aiMatrix3x3t& m) const; @@ -101,8 +101,6 @@ public: template <typename TOther> operator aiMatrix3x3t<TOther> () const; -public: - // ------------------------------------------------------------------- /** @brief Construction from a 4x4 matrix. The remaining parts * of the matrix are ignored. @@ -122,7 +120,6 @@ public: aiMatrix3x3t& Inverse(); TReal Determinant() const; -public: // ------------------------------------------------------------------- /** @brief Returns a rotation matrix for a rotation around z * @param a Rotation angle, in radians diff --git a/thirdparty/assimp/include/assimp/matrix3x3.inl b/thirdparty/assimp/include/assimp/matrix3x3.inl index d9d45a3e92..1ce8c9691c 100644 --- a/thirdparty/assimp/include/assimp/matrix3x3.inl +++ b/thirdparty/assimp/include/assimp/matrix3x3.inl @@ -48,10 +48,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_MATRIX3X3_INL_INC #define AI_MATRIX3X3_INL_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #ifdef __cplusplus -#include "matrix3x3.h" +#include <assimp/matrix3x3.h> +#include <assimp/matrix4x4.h> -#include "matrix4x4.h" #include <algorithm> #include <cmath> #include <limits> @@ -59,8 +63,8 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // ------------------------------------------------------------------------------------------------ // Construction from a 4x4 matrix. The remaining parts of the matrix are ignored. template <typename TReal> -inline aiMatrix3x3t<TReal>::aiMatrix3x3t( const aiMatrix4x4t<TReal>& pMatrix) -{ +AI_FORCE_INLINE +aiMatrix3x3t<TReal>::aiMatrix3x3t( const aiMatrix4x4t<TReal>& pMatrix) { a1 = pMatrix.a1; a2 = pMatrix.a2; a3 = pMatrix.a3; b1 = pMatrix.b1; b2 = pMatrix.b2; b3 = pMatrix.b3; c1 = pMatrix.c1; c2 = pMatrix.c2; c3 = pMatrix.c3; @@ -68,8 +72,8 @@ inline aiMatrix3x3t<TReal>::aiMatrix3x3t( const aiMatrix4x4t<TReal>& pMatrix) // ------------------------------------------------------------------------------------------------ template <typename TReal> -inline aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::operator *= (const aiMatrix3x3t<TReal>& m) -{ +AI_FORCE_INLINE +aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::operator *= (const aiMatrix3x3t<TReal>& m) { *this = aiMatrix3x3t<TReal>(m.a1 * a1 + m.b1 * a2 + m.c1 * a3, m.a2 * a1 + m.b2 * a2 + m.c2 * a3, m.a3 * a1 + m.b3 * a2 + m.c3 * a3, @@ -85,8 +89,7 @@ inline aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::operator *= (const aiMatrix3x3t // ------------------------------------------------------------------------------------------------ template <typename TReal> template <typename TOther> -aiMatrix3x3t<TReal>::operator aiMatrix3x3t<TOther> () const -{ +aiMatrix3x3t<TReal>::operator aiMatrix3x3t<TOther> () const { return aiMatrix3x3t<TOther>(static_cast<TOther>(a1),static_cast<TOther>(a2),static_cast<TOther>(a3), static_cast<TOther>(b1),static_cast<TOther>(b2),static_cast<TOther>(b3), static_cast<TOther>(c1),static_cast<TOther>(c2),static_cast<TOther>(c3)); @@ -94,8 +97,8 @@ aiMatrix3x3t<TReal>::operator aiMatrix3x3t<TOther> () const // ------------------------------------------------------------------------------------------------ template <typename TReal> -inline aiMatrix3x3t<TReal> aiMatrix3x3t<TReal>::operator* (const aiMatrix3x3t<TReal>& m) const -{ +AI_FORCE_INLINE +aiMatrix3x3t<TReal> aiMatrix3x3t<TReal>::operator* (const aiMatrix3x3t<TReal>& m) const { aiMatrix3x3t<TReal> temp( *this); temp *= m; return temp; @@ -103,7 +106,8 @@ inline aiMatrix3x3t<TReal> aiMatrix3x3t<TReal>::operator* (const aiMatrix3x3t<TR // ------------------------------------------------------------------------------------------------ template <typename TReal> -inline TReal* aiMatrix3x3t<TReal>::operator[] (unsigned int p_iIndex) { +AI_FORCE_INLINE +TReal* aiMatrix3x3t<TReal>::operator[] (unsigned int p_iIndex) { switch ( p_iIndex ) { case 0: return &a1; @@ -119,7 +123,8 @@ inline TReal* aiMatrix3x3t<TReal>::operator[] (unsigned int p_iIndex) { // ------------------------------------------------------------------------------------------------ template <typename TReal> -inline const TReal* aiMatrix3x3t<TReal>::operator[] (unsigned int p_iIndex) const { +AI_FORCE_INLINE +const TReal* aiMatrix3x3t<TReal>::operator[] (unsigned int p_iIndex) const { switch ( p_iIndex ) { case 0: return &a1; @@ -135,8 +140,8 @@ inline const TReal* aiMatrix3x3t<TReal>::operator[] (unsigned int p_iIndex) cons // ------------------------------------------------------------------------------------------------ template <typename TReal> -inline bool aiMatrix3x3t<TReal>::operator== (const aiMatrix4x4t<TReal>& m) const -{ +AI_FORCE_INLINE +bool aiMatrix3x3t<TReal>::operator== (const aiMatrix4x4t<TReal>& m) const { return a1 == m.a1 && a2 == m.a2 && a3 == m.a3 && b1 == m.b1 && b2 == m.b2 && b3 == m.b3 && c1 == m.c1 && c2 == m.c2 && c3 == m.c3; @@ -144,14 +149,15 @@ inline bool aiMatrix3x3t<TReal>::operator== (const aiMatrix4x4t<TReal>& m) const // ------------------------------------------------------------------------------------------------ template <typename TReal> -inline bool aiMatrix3x3t<TReal>::operator!= (const aiMatrix4x4t<TReal>& m) const -{ +AI_FORCE_INLINE +bool aiMatrix3x3t<TReal>::operator!= (const aiMatrix4x4t<TReal>& m) const { return !(*this == m); } // --------------------------------------------------------------------------- template<typename TReal> -inline bool aiMatrix3x3t<TReal>::Equal(const aiMatrix4x4t<TReal>& m, TReal epsilon) const { +AI_FORCE_INLINE +bool aiMatrix3x3t<TReal>::Equal(const aiMatrix4x4t<TReal>& m, TReal epsilon) const { return std::abs(a1 - m.a1) <= epsilon && std::abs(a2 - m.a2) <= epsilon && @@ -166,8 +172,8 @@ inline bool aiMatrix3x3t<TReal>::Equal(const aiMatrix4x4t<TReal>& m, TReal epsil // ------------------------------------------------------------------------------------------------ template <typename TReal> -inline aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::Transpose() -{ +AI_FORCE_INLINE +aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::Transpose() { // (TReal&) don't remove, GCC complains cause of packed fields std::swap( (TReal&)a2, (TReal&)b1); std::swap( (TReal&)a3, (TReal&)c1); @@ -177,15 +183,15 @@ inline aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::Transpose() // ---------------------------------------------------------------------------------------- template <typename TReal> -inline TReal aiMatrix3x3t<TReal>::Determinant() const -{ +AI_FORCE_INLINE +TReal aiMatrix3x3t<TReal>::Determinant() const { return a1*b2*c3 - a1*b3*c2 + a2*b3*c1 - a2*b1*c3 + a3*b1*c2 - a3*b2*c1; } // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::Inverse() -{ +AI_FORCE_INLINE +aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::Inverse() { // Compute the reciprocal determinant TReal det = Determinant(); if(det == static_cast<TReal>(0.0)) @@ -219,8 +225,8 @@ inline aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::Inverse() // ------------------------------------------------------------------------------------------------ template <typename TReal> -inline aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::RotationZ(TReal a, aiMatrix3x3t<TReal>& out) -{ +AI_FORCE_INLINE +aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::RotationZ(TReal a, aiMatrix3x3t<TReal>& out) { out.a1 = out.b2 = std::cos(a); out.b1 = std::sin(a); out.a2 = - out.b1; @@ -234,8 +240,8 @@ inline aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::RotationZ(TReal a, aiMatrix3x3t // ------------------------------------------------------------------------------------------------ // Returns a rotation matrix for a rotation around an arbitrary axis. template <typename TReal> -inline aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::Rotation( TReal a, const aiVector3t<TReal>& axis, aiMatrix3x3t<TReal>& out) -{ +AI_FORCE_INLINE +aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::Rotation( TReal a, const aiVector3t<TReal>& axis, aiMatrix3x3t<TReal>& out) { TReal c = std::cos( a), s = std::sin( a), t = 1 - c; TReal x = axis.x, y = axis.y, z = axis.z; @@ -249,8 +255,8 @@ inline aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::Rotation( TReal a, const aiVect // ------------------------------------------------------------------------------------------------ template <typename TReal> -inline aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::Translation( const aiVector2t<TReal>& v, aiMatrix3x3t<TReal>& out) -{ +AI_FORCE_INLINE +aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::Translation( const aiVector2t<TReal>& v, aiMatrix3x3t<TReal>& out) { out = aiMatrix3x3t<TReal>(); out.a3 = v.x; out.b3 = v.y; @@ -268,9 +274,8 @@ inline aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::Translation( const aiVector2t<T */ // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::FromToMatrix(const aiVector3t<TReal>& from, - const aiVector3t<TReal>& to, aiMatrix3x3t<TReal>& mtx) -{ +AI_FORCE_INLINE aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::FromToMatrix(const aiVector3t<TReal>& from, + const aiVector3t<TReal>& to, aiMatrix3x3t<TReal>& mtx) { const TReal e = from * to; const TReal f = (e < 0)? -e:e; @@ -352,6 +357,5 @@ inline aiMatrix3x3t<TReal>& aiMatrix3x3t<TReal>::FromToMatrix(const aiVector3t<T return mtx; } - #endif // __cplusplus #endif // AI_MATRIX3X3_INL_INC diff --git a/thirdparty/assimp/include/assimp/matrix4x4.h b/thirdparty/assimp/include/assimp/matrix4x4.h index 046bb535f2..8fc216f669 100644 --- a/thirdparty/assimp/include/assimp/matrix4x4.h +++ b/thirdparty/assimp/include/assimp/matrix4x4.h @@ -47,8 +47,12 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_MATRIX4X4_H_INC #define AI_MATRIX4X4_H_INC -#include "vector3.h" -#include "defs.h" +#ifdef __GNUC__ +# pragma GCC system_header +#endif + +#include <assimp/vector3.h> +#include <assimp/defs.h> #ifdef __cplusplus @@ -66,8 +70,7 @@ template<typename TReal> class aiQuaterniont; * defined thereby. */ template<typename TReal> -class aiMatrix4x4t -{ +class aiMatrix4x4t { public: /** set to identity */ @@ -91,8 +94,6 @@ public: aiMatrix4x4t(const aiVector3t<TReal>& scaling, const aiQuaterniont<TReal>& rotation, const aiVector3t<TReal>& position); -public: - // array access operators /** @fn TReal* operator[] (unsigned int p_iIndex) * @param [in] p_iIndex - index of the row. @@ -120,8 +121,6 @@ public: template <typename TOther> operator aiMatrix4x4t<TOther> () const; -public: - // ------------------------------------------------------------------- /** @brief Transpose the matrix */ aiMatrix4x4t& Transpose(); @@ -182,7 +181,6 @@ public: void DecomposeNoScaling (aiQuaterniont<TReal>& rotation, aiVector3t<TReal>& position) const; - // ------------------------------------------------------------------- /** @brief Creates a trafo matrix from a set of euler angles * @param x Rotation angle for the x-axis, in radians @@ -192,7 +190,6 @@ public: aiMatrix4x4t& FromEulerAnglesXYZ(TReal x, TReal y, TReal z); aiMatrix4x4t& FromEulerAnglesXYZ(const aiVector3t<TReal>& blubb); -public: // ------------------------------------------------------------------- /** @brief Returns a rotation matrix for a rotation around the x axis * @param a Rotation angle, in radians @@ -256,7 +253,6 @@ public: static aiMatrix4x4t& FromToMatrix(const aiVector3t<TReal>& from, const aiVector3t<TReal>& to, aiMatrix4x4t& out); -public: TReal a1, a2, a3, a4; TReal b1, b2, b3, b4; TReal c1, c2, c3, c4; diff --git a/thirdparty/assimp/include/assimp/matrix4x4.inl b/thirdparty/assimp/include/assimp/matrix4x4.inl index ebc67a06ec..84079974f7 100644 --- a/thirdparty/assimp/include/assimp/matrix4x4.inl +++ b/thirdparty/assimp/include/assimp/matrix4x4.inl @@ -5,8 +5,6 @@ Open Asset Import Library (assimp) Copyright (c) 2006-2019, assimp team - - All rights reserved. Redistribution and use of this software in source and binary forms, @@ -53,6 +51,7 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #include "matrix4x4.h" #include "matrix3x3.h" #include "quaternion.h" +#include "MathFunctions.h" #include <algorithm> #include <limits> @@ -61,12 +60,11 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // ---------------------------------------------------------------------------------------- template <typename TReal> aiMatrix4x4t<TReal>::aiMatrix4x4t() AI_NO_EXCEPT : - a1(1.0f), a2(), a3(), a4(), - b1(), b2(1.0f), b3(), b4(), - c1(), c2(), c3(1.0f), c4(), - d1(), d2(), d3(), d4(1.0f) -{ - + a1(1.0f), a2(), a3(), a4(), + b1(), b2(1.0f), b3(), b4(), + c1(), c2(), c3(1.0f), c4(), + d1(), d2(), d3(), d4(1.0f) { + // empty } // ---------------------------------------------------------------------------------------- @@ -75,19 +73,17 @@ aiMatrix4x4t<TReal>::aiMatrix4x4t (TReal _a1, TReal _a2, TReal _a3, TReal _a4, TReal _b1, TReal _b2, TReal _b3, TReal _b4, TReal _c1, TReal _c2, TReal _c3, TReal _c4, TReal _d1, TReal _d2, TReal _d3, TReal _d4) : - a1(_a1), a2(_a2), a3(_a3), a4(_a4), - b1(_b1), b2(_b2), b3(_b3), b4(_b4), - c1(_c1), c2(_c2), c3(_c3), c4(_c4), - d1(_d1), d2(_d2), d3(_d3), d4(_d4) -{ - + a1(_a1), a2(_a2), a3(_a3), a4(_a4), + b1(_b1), b2(_b2), b3(_b3), b4(_b4), + c1(_c1), c2(_c2), c3(_c3), c4(_c4), + d1(_d1), d2(_d2), d3(_d3), d4(_d4) { + // empty } // ------------------------------------------------------------------------------------------------ template <typename TReal> template <typename TOther> -aiMatrix4x4t<TReal>::operator aiMatrix4x4t<TOther> () const -{ +aiMatrix4x4t<TReal>::operator aiMatrix4x4t<TOther> () const { return aiMatrix4x4t<TOther>(static_cast<TOther>(a1),static_cast<TOther>(a2),static_cast<TOther>(a3),static_cast<TOther>(a4), static_cast<TOther>(b1),static_cast<TOther>(b2),static_cast<TOther>(b3),static_cast<TOther>(b4), static_cast<TOther>(c1),static_cast<TOther>(c2),static_cast<TOther>(c3),static_cast<TOther>(c4), @@ -97,8 +93,8 @@ aiMatrix4x4t<TReal>::operator aiMatrix4x4t<TOther> () const // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal>::aiMatrix4x4t (const aiMatrix3x3t<TReal>& m) -{ +AI_FORCE_INLINE +aiMatrix4x4t<TReal>::aiMatrix4x4t (const aiMatrix3x3t<TReal>& m) { a1 = m.a1; a2 = m.a2; a3 = m.a3; a4 = static_cast<TReal>(0.0); b1 = m.b1; b2 = m.b2; b3 = m.b3; b4 = static_cast<TReal>(0.0); c1 = m.c1; c2 = m.c2; c3 = m.c3; c4 = static_cast<TReal>(0.0); @@ -107,8 +103,8 @@ inline aiMatrix4x4t<TReal>::aiMatrix4x4t (const aiMatrix3x3t<TReal>& m) // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal>::aiMatrix4x4t (const aiVector3t<TReal>& scaling, const aiQuaterniont<TReal>& rotation, const aiVector3t<TReal>& position) -{ +AI_FORCE_INLINE +aiMatrix4x4t<TReal>::aiMatrix4x4t (const aiVector3t<TReal>& scaling, const aiQuaterniont<TReal>& rotation, const aiVector3t<TReal>& position) { // build a 3x3 rotation matrix aiMatrix3x3t<TReal> m = rotation.GetMatrix(); @@ -135,8 +131,8 @@ inline aiMatrix4x4t<TReal>::aiMatrix4x4t (const aiVector3t<TReal>& scaling, cons // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::operator *= (const aiMatrix4x4t<TReal>& m) -{ +AI_FORCE_INLINE +aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::operator *= (const aiMatrix4x4t<TReal>& m) { *this = aiMatrix4x4t<TReal>( m.a1 * a1 + m.b1 * a2 + m.c1 * a3 + m.d1 * a4, m.a2 * a1 + m.b2 * a2 + m.c2 * a3 + m.d2 * a4, @@ -159,8 +155,7 @@ inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::operator *= (const aiMatrix4x4t // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal> aiMatrix4x4t<TReal>::operator* (const TReal& aFloat) const -{ +AI_FORCE_INLINE aiMatrix4x4t<TReal> aiMatrix4x4t<TReal>::operator* (const TReal& aFloat) const { aiMatrix4x4t<TReal> temp( a1 * aFloat, a2 * aFloat, @@ -183,8 +178,8 @@ inline aiMatrix4x4t<TReal> aiMatrix4x4t<TReal>::operator* (const TReal& aFloat) // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal> aiMatrix4x4t<TReal>::operator+ (const aiMatrix4x4t<TReal>& m) const -{ +AI_FORCE_INLINE +aiMatrix4x4t<TReal> aiMatrix4x4t<TReal>::operator+ (const aiMatrix4x4t<TReal>& m) const { aiMatrix4x4t<TReal> temp( m.a1 + a1, m.a2 + a2, @@ -207,18 +202,16 @@ inline aiMatrix4x4t<TReal> aiMatrix4x4t<TReal>::operator+ (const aiMatrix4x4t<TR // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal> aiMatrix4x4t<TReal>::operator* (const aiMatrix4x4t<TReal>& m) const -{ +AI_FORCE_INLINE +aiMatrix4x4t<TReal> aiMatrix4x4t<TReal>::operator* (const aiMatrix4x4t<TReal>& m) const { aiMatrix4x4t<TReal> temp( *this); temp *= m; return temp; } - // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::Transpose() -{ +AI_FORCE_INLINE aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::Transpose() { // (TReal&) don't remove, GCC complains cause of packed fields std::swap( (TReal&)b1, (TReal&)a2); std::swap( (TReal&)c1, (TReal&)a3); @@ -229,11 +222,10 @@ inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::Transpose() return *this; } - // ---------------------------------------------------------------------------------------- template <typename TReal> -inline TReal aiMatrix4x4t<TReal>::Determinant() const -{ +AI_FORCE_INLINE +TReal aiMatrix4x4t<TReal>::Determinant() const { return a1*b2*c3*d4 - a1*b2*c4*d3 + a1*b3*c4*d2 - a1*b3*c2*d4 + a1*b4*c2*d3 - a1*b4*c3*d2 - a2*b3*c4*d1 + a2*b3*c1*d4 - a2*b4*c1*d3 + a2*b4*c3*d1 - a2*b1*c3*d4 + a2*b1*c4*d3 @@ -244,8 +236,8 @@ inline TReal aiMatrix4x4t<TReal>::Determinant() const // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::Inverse() -{ +AI_FORCE_INLINE +aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::Inverse() { // Compute the reciprocal determinant const TReal det = Determinant(); if(det == static_cast<TReal>(0.0)) @@ -289,9 +281,10 @@ inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::Inverse() // ---------------------------------------------------------------------------------------- template <typename TReal> -inline TReal* aiMatrix4x4t<TReal>::operator[](unsigned int p_iIndex) { +AI_FORCE_INLINE +TReal* aiMatrix4x4t<TReal>::operator[](unsigned int p_iIndex) { if (p_iIndex > 3) { - return NULL; + return nullptr; } switch ( p_iIndex ) { case 0: @@ -310,9 +303,10 @@ inline TReal* aiMatrix4x4t<TReal>::operator[](unsigned int p_iIndex) { // ---------------------------------------------------------------------------------------- template <typename TReal> -inline const TReal* aiMatrix4x4t<TReal>::operator[](unsigned int p_iIndex) const { +AI_FORCE_INLINE +const TReal* aiMatrix4x4t<TReal>::operator[](unsigned int p_iIndex) const { if (p_iIndex > 3) { - return NULL; + return nullptr; } switch ( p_iIndex ) { @@ -332,8 +326,8 @@ inline const TReal* aiMatrix4x4t<TReal>::operator[](unsigned int p_iIndex) const // ---------------------------------------------------------------------------------------- template <typename TReal> -inline bool aiMatrix4x4t<TReal>::operator== (const aiMatrix4x4t<TReal>& m) const -{ +AI_FORCE_INLINE +bool aiMatrix4x4t<TReal>::operator== (const aiMatrix4x4t<TReal>& m) const { return (a1 == m.a1 && a2 == m.a2 && a3 == m.a3 && a4 == m.a4 && b1 == m.b1 && b2 == m.b2 && b3 == m.b3 && b4 == m.b4 && c1 == m.c1 && c2 == m.c2 && c3 == m.c3 && c4 == m.c4 && @@ -342,14 +336,15 @@ inline bool aiMatrix4x4t<TReal>::operator== (const aiMatrix4x4t<TReal>& m) const // ---------------------------------------------------------------------------------------- template <typename TReal> -inline bool aiMatrix4x4t<TReal>::operator!= (const aiMatrix4x4t<TReal>& m) const -{ +AI_FORCE_INLINE +bool aiMatrix4x4t<TReal>::operator!= (const aiMatrix4x4t<TReal>& m) const { return !(*this == m); } // --------------------------------------------------------------------------- template<typename TReal> -inline bool aiMatrix4x4t<TReal>::Equal(const aiMatrix4x4t<TReal>& m, TReal epsilon) const { +AI_FORCE_INLINE +bool aiMatrix4x4t<TReal>::Equal(const aiMatrix4x4t<TReal>& m, TReal epsilon) const { return std::abs(a1 - m.a1) <= epsilon && std::abs(a2 - m.a2) <= epsilon && @@ -401,13 +396,10 @@ inline bool aiMatrix4x4t<TReal>::Equal(const aiMatrix4x4t<TReal>& m, TReal epsil \ do {} while(false) - - - template <typename TReal> -inline void aiMatrix4x4t<TReal>::Decompose (aiVector3t<TReal>& pScaling, aiQuaterniont<TReal>& pRotation, - aiVector3t<TReal>& pPosition) const -{ +AI_FORCE_INLINE +void aiMatrix4x4t<TReal>::Decompose (aiVector3t<TReal>& pScaling, aiQuaterniont<TReal>& pRotation, + aiVector3t<TReal>& pPosition) const { ASSIMP_MATRIX4_4_DECOMPOSE_PART; // build a 3x3 rotation matrix @@ -420,8 +412,8 @@ inline void aiMatrix4x4t<TReal>::Decompose (aiVector3t<TReal>& pScaling, aiQuate } template <typename TReal> -inline void aiMatrix4x4t<TReal>::Decompose(aiVector3t<TReal>& pScaling, aiVector3t<TReal>& pRotation, aiVector3t<TReal>& pPosition) const -{ +AI_FORCE_INLINE +void aiMatrix4x4t<TReal>::Decompose(aiVector3t<TReal>& pScaling, aiVector3t<TReal>& pRotation, aiVector3t<TReal>& pPosition) const { ASSIMP_MATRIX4_4_DECOMPOSE_PART; /* @@ -442,7 +434,7 @@ inline void aiMatrix4x4t<TReal>::Decompose(aiVector3t<TReal>& pScaling, aiVector */ // Use a small epsilon to solve floating-point inaccuracies - const TReal epsilon = 10e-3f; + const TReal epsilon = Assimp::Math::getEpsilon<TReal>(); pRotation.y = std::asin(-vCols[0].z);// D. Angle around oY. @@ -475,10 +467,10 @@ inline void aiMatrix4x4t<TReal>::Decompose(aiVector3t<TReal>& pScaling, aiVector #undef ASSIMP_MATRIX4_4_DECOMPOSE_PART template <typename TReal> -inline void aiMatrix4x4t<TReal>::Decompose(aiVector3t<TReal>& pScaling, aiVector3t<TReal>& pRotationAxis, TReal& pRotationAngle, - aiVector3t<TReal>& pPosition) const -{ -aiQuaterniont<TReal> pRotation; +AI_FORCE_INLINE +void aiMatrix4x4t<TReal>::Decompose(aiVector3t<TReal>& pScaling, aiVector3t<TReal>& pRotationAxis, TReal& pRotationAngle, + aiVector3t<TReal>& pPosition) const { + aiQuaterniont<TReal> pRotation; Decompose(pScaling, pRotation, pPosition); pRotation.Normalize(); @@ -500,9 +492,9 @@ aiQuaterniont<TReal> pRotation; // ---------------------------------------------------------------------------------------- template <typename TReal> -inline void aiMatrix4x4t<TReal>::DecomposeNoScaling (aiQuaterniont<TReal>& rotation, - aiVector3t<TReal>& position) const -{ +AI_FORCE_INLINE +void aiMatrix4x4t<TReal>::DecomposeNoScaling (aiQuaterniont<TReal>& rotation, + aiVector3t<TReal>& position) const { const aiMatrix4x4t<TReal>& _this = *this; // extract translation @@ -516,15 +508,15 @@ inline void aiMatrix4x4t<TReal>::DecomposeNoScaling (aiQuaterniont<TReal>& rotat // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::FromEulerAnglesXYZ(const aiVector3t<TReal>& blubb) -{ +AI_FORCE_INLINE +aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::FromEulerAnglesXYZ(const aiVector3t<TReal>& blubb) { return FromEulerAnglesXYZ(blubb.x,blubb.y,blubb.z); } // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::FromEulerAnglesXYZ(TReal x, TReal y, TReal z) -{ +AI_FORCE_INLINE +aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::FromEulerAnglesXYZ(TReal x, TReal y, TReal z) { aiMatrix4x4t<TReal>& _this = *this; TReal cx = std::cos(x); @@ -552,8 +544,8 @@ inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::FromEulerAnglesXYZ(TReal x, TRe // ---------------------------------------------------------------------------------------- template <typename TReal> -inline bool aiMatrix4x4t<TReal>::IsIdentity() const -{ +AI_FORCE_INLINE +bool aiMatrix4x4t<TReal>::IsIdentity() const { // Use a small epsilon to solve floating-point inaccuracies const static TReal epsilon = 10e-3f; @@ -577,8 +569,8 @@ inline bool aiMatrix4x4t<TReal>::IsIdentity() const // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::RotationX(TReal a, aiMatrix4x4t<TReal>& out) -{ +AI_FORCE_INLINE +aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::RotationX(TReal a, aiMatrix4x4t<TReal>& out) { /* | 1 0 0 0 | M = | 0 cos(A) -sin(A) 0 | @@ -592,8 +584,8 @@ inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::RotationX(TReal a, aiMatrix4x4t // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::RotationY(TReal a, aiMatrix4x4t<TReal>& out) -{ +AI_FORCE_INLINE +aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::RotationY(TReal a, aiMatrix4x4t<TReal>& out) { /* | cos(A) 0 sin(A) 0 | M = | 0 1 0 0 | @@ -608,8 +600,8 @@ inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::RotationY(TReal a, aiMatrix4x4t // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::RotationZ(TReal a, aiMatrix4x4t<TReal>& out) -{ +AI_FORCE_INLINE +aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::RotationZ(TReal a, aiMatrix4x4t<TReal>& out) { /* | cos(A) -sin(A) 0 0 | M = | sin(A) cos(A) 0 0 | @@ -624,26 +616,25 @@ inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::RotationZ(TReal a, aiMatrix4x4t // ---------------------------------------------------------------------------------------- // Returns a rotation matrix for a rotation around an arbitrary axis. template <typename TReal> -inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::Rotation( TReal a, const aiVector3t<TReal>& axis, aiMatrix4x4t<TReal>& out) -{ - TReal c = std::cos( a), s = std::sin( a), t = 1 - c; - TReal x = axis.x, y = axis.y, z = axis.z; - - // Many thanks to MathWorld and Wikipedia - out.a1 = t*x*x + c; out.a2 = t*x*y - s*z; out.a3 = t*x*z + s*y; - out.b1 = t*x*y + s*z; out.b2 = t*y*y + c; out.b3 = t*y*z - s*x; - out.c1 = t*x*z - s*y; out.c2 = t*y*z + s*x; out.c3 = t*z*z + c; - out.a4 = out.b4 = out.c4 = static_cast<TReal>(0.0); - out.d1 = out.d2 = out.d3 = static_cast<TReal>(0.0); - out.d4 = static_cast<TReal>(1.0); - - return out; +AI_FORCE_INLINE +aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::Rotation( TReal a, const aiVector3t<TReal>& axis, aiMatrix4x4t<TReal>& out) { + TReal c = std::cos( a), s = std::sin( a), t = 1 - c; + TReal x = axis.x, y = axis.y, z = axis.z; + + // Many thanks to MathWorld and Wikipedia + out.a1 = t*x*x + c; out.a2 = t*x*y - s*z; out.a3 = t*x*z + s*y; + out.b1 = t*x*y + s*z; out.b2 = t*y*y + c; out.b3 = t*y*z - s*x; + out.c1 = t*x*z - s*y; out.c2 = t*y*z + s*x; out.c3 = t*z*z + c; + out.a4 = out.b4 = out.c4 = static_cast<TReal>(0.0); + out.d1 = out.d2 = out.d3 = static_cast<TReal>(0.0); + out.d4 = static_cast<TReal>(1.0); + + return out; } // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::Translation( const aiVector3t<TReal>& v, aiMatrix4x4t<TReal>& out) -{ +AI_FORCE_INLINE aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::Translation( const aiVector3t<TReal>& v, aiMatrix4x4t<TReal>& out) { out = aiMatrix4x4t<TReal>(); out.a4 = v.x; out.b4 = v.y; @@ -653,8 +644,8 @@ inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::Translation( const aiVector3t<T // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::Scaling( const aiVector3t<TReal>& v, aiMatrix4x4t<TReal>& out) -{ +AI_FORCE_INLINE +aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::Scaling( const aiVector3t<TReal>& v, aiMatrix4x4t<TReal>& out) { out = aiMatrix4x4t<TReal>(); out.a1 = v.x; out.b2 = v.y; @@ -673,9 +664,9 @@ inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::Scaling( const aiVector3t<TReal */ // ---------------------------------------------------------------------------------------- template <typename TReal> -inline aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::FromToMatrix(const aiVector3t<TReal>& from, - const aiVector3t<TReal>& to, aiMatrix4x4t<TReal>& mtx) -{ +AI_FORCE_INLINE +aiMatrix4x4t<TReal>& aiMatrix4x4t<TReal>::FromToMatrix(const aiVector3t<TReal>& from, + const aiVector3t<TReal>& to, aiMatrix4x4t<TReal>& mtx) { aiMatrix3x3t<TReal> m3; aiMatrix3x3t<TReal>::FromToMatrix(from,to,m3); mtx = aiMatrix4x4t<TReal>(m3); diff --git a/thirdparty/assimp/include/assimp/mesh.h b/thirdparty/assimp/include/assimp/mesh.h index f1628f1f54..fbf2a857ad 100644 --- a/thirdparty/assimp/include/assimp/mesh.h +++ b/thirdparty/assimp/include/assimp/mesh.h @@ -48,6 +48,10 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_MESH_H_INC #define AI_MESH_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/types.h> #include <assimp/aabb.h> @@ -248,6 +252,9 @@ struct aiVertexWeight { }; +// Forward declare aiNode (pointer use only) +struct aiNode; + // --------------------------------------------------------------------------- /** @brief A single bone of a mesh. * @@ -264,6 +271,16 @@ struct aiBone { //! The maximum value for this member is #AI_MAX_BONE_WEIGHTS. unsigned int mNumWeights; +#ifndef ASSIMP_BUILD_NO_ARMATUREPOPULATE_PROCESS + // The bone armature node - used for skeleton conversion + // you must enable aiProcess_PopulateArmatureData to populate this + C_STRUCT aiNode* mArmature; + + // The bone node in the scene - used for skeleton conversion + // you must enable aiProcess_PopulateArmatureData to populate this + C_STRUCT aiNode* mNode; + +#endif //! The influence weights of this bone, by vertex index. C_STRUCT aiVertexWeight* mWeights; @@ -418,11 +435,11 @@ struct aiAnimMesh /**Anim Mesh name */ C_STRUCT aiString mName; - /** Replacement for aiMesh::mVertices. If this array is non-NULL, + /** Replacement for aiMesh::mVertices. If this array is non-nullptr, * it *must* contain mNumVertices entries. The corresponding - * array in the host mesh must be non-NULL as well - animation + * array in the host mesh must be non-nullptr as well - animation * meshes may neither add or nor remove vertex components (if - * a replacement array is NULL and the corresponding source + * a replacement array is nullptr and the corresponding source * array is not, the source data is taken instead)*/ C_STRUCT aiVector3D* mVertices; @@ -596,7 +613,7 @@ struct aiMesh C_STRUCT aiVector3D* mVertices; /** Vertex normals. - * The array contains normalized vectors, NULL if not present. + * The array contains normalized vectors, nullptr if not present. * The array is mNumVertices in size. Normals are undefined for * point and line primitives. A mesh consisting of points and * lines only may not have normal vectors. Meshes with mixed @@ -619,7 +636,7 @@ struct aiMesh /** Vertex tangents. * The tangent of a vertex points in the direction of the positive - * X texture axis. The array contains normalized vectors, NULL if + * X texture axis. The array contains normalized vectors, nullptr if * not present. The array is mNumVertices in size. A mesh consisting * of points and lines only may not have normal vectors. Meshes with * mixed primitive types (i.e. lines and triangles) may have @@ -633,7 +650,7 @@ struct aiMesh /** Vertex bitangents. * The bitangent of a vertex points in the direction of the positive - * Y texture axis. The array contains normalized vectors, NULL if not + * Y texture axis. The array contains normalized vectors, nullptr if not * present. The array is mNumVertices in size. * @note If the mesh contains tangents, it automatically also contains * bitangents. @@ -642,14 +659,14 @@ struct aiMesh /** Vertex color sets. * A mesh may contain 0 to #AI_MAX_NUMBER_OF_COLOR_SETS vertex - * colors per vertex. NULL if not present. Each array is + * colors per vertex. nullptr if not present. Each array is * mNumVertices in size if present. */ C_STRUCT aiColor4D* mColors[AI_MAX_NUMBER_OF_COLOR_SETS]; /** Vertex texture coords, also known as UV channels. * A mesh may contain 0 to AI_MAX_NUMBER_OF_TEXTURECOORDS per - * vertex. NULL if not present. The array is mNumVertices in size. + * vertex. nullptr if not present. The array is mNumVertices in size. */ C_STRUCT aiVector3D* mTextureCoords[AI_MAX_NUMBER_OF_TEXTURECOORDS]; @@ -671,7 +688,7 @@ struct aiMesh C_STRUCT aiFace* mFaces; /** The number of bones this mesh contains. - * Can be 0, in which case the mBones array is NULL. + * Can be 0, in which case the mBones array is nullptr. */ unsigned int mNumBones; @@ -769,7 +786,10 @@ struct aiMesh // DO NOT REMOVE THIS ADDITIONAL CHECK if (mNumBones && mBones) { for( unsigned int a = 0; a < mNumBones; a++) { - delete mBones[a]; + if(mBones[a]) + { + delete mBones[a]; + } } delete [] mBones; } diff --git a/thirdparty/assimp/include/assimp/metadata.h b/thirdparty/assimp/include/assimp/metadata.h index 3a1dd1442a..849d90f485 100644 --- a/thirdparty/assimp/include/assimp/metadata.h +++ b/thirdparty/assimp/include/assimp/metadata.h @@ -48,6 +48,10 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_METADATA_H_INC #define AI_METADATA_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #if defined(_MSC_VER) && (_MSC_VER <= 1500) # include "Compiler/pstdint.h" #else diff --git a/thirdparty/assimp/include/assimp/pbrmaterial.h b/thirdparty/assimp/include/assimp/pbrmaterial.h index ce5f822173..892a6347f7 100644 --- a/thirdparty/assimp/include/assimp/pbrmaterial.h +++ b/thirdparty/assimp/include/assimp/pbrmaterial.h @@ -5,8 +5,6 @@ Open Asset Import Library (assimp) Copyright (c) 2006-2019, assimp team - - All rights reserved. Redistribution and use of this software in source and binary forms, @@ -44,9 +42,14 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. /** @file pbrmaterial.h * @brief Defines the material system of the library */ +#pragma once #ifndef AI_PBRMATERIAL_H_INC #define AI_PBRMATERIAL_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #define AI_MATKEY_GLTF_PBRMETALLICROUGHNESS_BASE_COLOR_FACTOR "$mat.gltf.pbrMetallicRoughness.baseColorFactor", 0, 0 #define AI_MATKEY_GLTF_PBRMETALLICROUGHNESS_METALLIC_FACTOR "$mat.gltf.pbrMetallicRoughness.metallicFactor", 0, 0 #define AI_MATKEY_GLTF_PBRMETALLICROUGHNESS_ROUGHNESS_FACTOR "$mat.gltf.pbrMetallicRoughness.roughnessFactor", 0, 0 diff --git a/thirdparty/assimp/include/assimp/postprocess.h b/thirdparty/assimp/include/assimp/postprocess.h index 2a74414216..4b6732e80a 100644 --- a/thirdparty/assimp/include/assimp/postprocess.h +++ b/thirdparty/assimp/include/assimp/postprocess.h @@ -47,7 +47,11 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_POSTPROCESS_H_INC #define AI_POSTPROCESS_H_INC -#include "types.h" +#include <assimp/types.h> + +#ifdef __GNUC__ +# pragma GCC system_header +#endif #ifdef __cplusplus extern "C" { @@ -316,6 +320,19 @@ enum aiPostProcessSteps */ aiProcess_FixInfacingNormals = 0x2000, + + + // ------------------------------------------------------------------------- + /** + * This step generically populates aiBone->mArmature and aiBone->mNode generically + * The point of these is it saves you later having to calculate these elements + * This is useful when handling rest information or skin information + * If you have multiple armatures on your models we strongly recommend enabling this + * Instead of writing your own multi-root, multi-armature lookups we have done the + * hard work for you :) + */ + aiProcess_PopulateArmatureData = 0x4000, + // ------------------------------------------------------------------------- /** <hr>This step splits meshes with more than one primitive type in * homogeneous sub-meshes. @@ -533,6 +550,8 @@ enum aiPostProcessSteps */ aiProcess_Debone = 0x4000000, + + // ------------------------------------------------------------------------- /** <hr>This step will perform a global scale of the model. * diff --git a/thirdparty/assimp/include/assimp/qnan.h b/thirdparty/assimp/include/assimp/qnan.h index 0918bde5e7..06780da5b8 100644 --- a/thirdparty/assimp/include/assimp/qnan.h +++ b/thirdparty/assimp/include/assimp/qnan.h @@ -50,19 +50,23 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * but last time I checked compiler coverage was so bad that I decided * to reinvent the wheel. */ - +#pragma once #ifndef AI_QNAN_H_INCLUDED #define AI_QNAN_H_INCLUDED +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #include <assimp/defs.h> + #include <limits> #include <stdint.h> // --------------------------------------------------------------------------- /** Data structure to represent the bit pattern of a 32 Bit * IEEE 754 floating-point number. */ -union _IEEESingle -{ +union _IEEESingle { float Float; struct { @@ -75,8 +79,7 @@ union _IEEESingle // --------------------------------------------------------------------------- /** Data structure to represent the bit pattern of a 64 Bit * IEEE 754 floating-point number. */ -union _IEEEDouble -{ +union _IEEEDouble { double Double; struct { @@ -89,8 +92,7 @@ union _IEEEDouble // --------------------------------------------------------------------------- /** Check whether a given float is qNaN. * @param in Input value */ -AI_FORCE_INLINE bool is_qnan(float in) -{ +AI_FORCE_INLINE bool is_qnan(float in) { // the straightforward solution does not work: // return (in != in); // compiler generates code like this @@ -107,8 +109,7 @@ AI_FORCE_INLINE bool is_qnan(float in) // --------------------------------------------------------------------------- /** Check whether a given double is qNaN. * @param in Input value */ -AI_FORCE_INLINE bool is_qnan(double in) -{ +AI_FORCE_INLINE bool is_qnan(double in) { // the straightforward solution does not work: // return (in != in); // compiler generates code like this @@ -127,8 +128,7 @@ AI_FORCE_INLINE bool is_qnan(double in) * * Denorms return false, they're treated like normal values. * @param in Input value */ -AI_FORCE_INLINE bool is_special_float(float in) -{ +AI_FORCE_INLINE bool is_special_float(float in) { _IEEESingle temp; memcpy(&temp, &in, sizeof(float)); return (temp.IEEE.Exp == (1u << 8)-1); @@ -139,8 +139,7 @@ AI_FORCE_INLINE bool is_special_float(float in) * * Denorms return false, they're treated like normal values. * @param in Input value */ -AI_FORCE_INLINE bool is_special_float(double in) -{ +AI_FORCE_INLINE bool is_special_float(double in) { _IEEESingle temp; memcpy(&temp, &in, sizeof(float)); return (temp.IEEE.Exp == (1u << 11)-1); @@ -150,15 +149,13 @@ AI_FORCE_INLINE bool is_special_float(double in) /** Check whether a float is NOT qNaN. * @param in Input value */ template<class TReal> -AI_FORCE_INLINE bool is_not_qnan(TReal in) -{ +AI_FORCE_INLINE bool is_not_qnan(TReal in) { return !is_qnan(in); } // --------------------------------------------------------------------------- /** @brief Get a fresh qnan. */ -AI_FORCE_INLINE ai_real get_qnan() -{ +AI_FORCE_INLINE ai_real get_qnan() { return std::numeric_limits<ai_real>::quiet_NaN(); } diff --git a/thirdparty/assimp/include/assimp/quaternion.h b/thirdparty/assimp/include/assimp/quaternion.h index 96574d24b9..ae45959b41 100644 --- a/thirdparty/assimp/include/assimp/quaternion.h +++ b/thirdparty/assimp/include/assimp/quaternion.h @@ -49,7 +49,11 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifdef __cplusplus -#include "defs.h" +#ifdef __GNUC__ +# pragma GCC system_header +#endif + +#include <assimp/defs.h> template <typename TReal> class aiVector3t; template <typename TReal> class aiMatrix3x3t; diff --git a/thirdparty/assimp/include/assimp/quaternion.inl b/thirdparty/assimp/include/assimp/quaternion.inl index c26648215e..3ce514d1bb 100644 --- a/thirdparty/assimp/include/assimp/quaternion.inl +++ b/thirdparty/assimp/include/assimp/quaternion.inl @@ -48,8 +48,12 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_QUATERNION_INL_INC #define AI_QUATERNION_INL_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #ifdef __cplusplus -#include "quaternion.h" +#include <assimp/quaternion.h> #include <cmath> diff --git a/thirdparty/assimp/include/assimp/scene.h b/thirdparty/assimp/include/assimp/scene.h index 2667db85b3..b76709eb15 100644 --- a/thirdparty/assimp/include/assimp/scene.h +++ b/thirdparty/assimp/include/assimp/scene.h @@ -48,14 +48,18 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_SCENE_H_INC #define AI_SCENE_H_INC -#include "types.h" -#include "texture.h" -#include "mesh.h" -#include "light.h" -#include "camera.h" -#include "material.h" -#include "anim.h" -#include "metadata.h" +#ifdef __GNUC__ +# pragma GCC system_header +#endif + +#include <assimp/types.h> +#include <assimp/texture.h> +#include <assimp/mesh.h> +#include <assimp/light.h> +#include <assimp/camera.h> +#include <assimp/material.h> +#include <assimp/anim.h> +#include <assimp/metadata.h> #ifdef __cplusplus # include <cstdlib> @@ -106,13 +110,13 @@ struct ASSIMP_API aiNode /** The transformation relative to the node's parent. */ C_STRUCT aiMatrix4x4 mTransformation; - /** Parent node. NULL if this node is the root node. */ + /** Parent node. nullptr if this node is the root node. */ C_STRUCT aiNode* mParent; /** The number of child nodes of this node. */ unsigned int mNumChildren; - /** The child nodes of this node. NULL if mNumChildren is 0. */ + /** The child nodes of this node. nullptr if mNumChildren is 0. */ C_STRUCT aiNode** mChildren; /** The number of meshes of this node. */ @@ -123,7 +127,7 @@ struct ASSIMP_API aiNode */ unsigned int* mMeshes; - /** Metadata associated with this node or NULL if there is no metadata. + /** Metadata associated with this node or nullptr if there is no metadata. * Whether any metadata is generated depends on the source file format. See the * @link importer_notes @endlink page for more information on every source file * format. Importers that don't document any metadata don't write any. @@ -145,7 +149,7 @@ struct ASSIMP_API aiNode * of the scene. * * @param name Name to search for - * @return NULL or a valid Node if the search was successful. + * @return nullptr or a valid Node if the search was successful. */ inline const aiNode* FindNode(const aiString& name) const { @@ -340,7 +344,7 @@ struct aiScene #ifdef __cplusplus - //! Default constructor - set everything to 0/NULL + //! Default constructor - set everything to 0/nullptr ASSIMP_API aiScene(); //! Destructor @@ -349,33 +353,33 @@ struct aiScene //! Check whether the scene contains meshes //! Unless no special scene flags are set this will always be true. inline bool HasMeshes() const { - return mMeshes != NULL && mNumMeshes > 0; + return mMeshes != nullptr && mNumMeshes > 0; } //! Check whether the scene contains materials //! Unless no special scene flags are set this will always be true. inline bool HasMaterials() const { - return mMaterials != NULL && mNumMaterials > 0; + return mMaterials != nullptr && mNumMaterials > 0; } //! Check whether the scene contains lights inline bool HasLights() const { - return mLights != NULL && mNumLights > 0; + return mLights != nullptr && mNumLights > 0; } //! Check whether the scene contains textures inline bool HasTextures() const { - return mTextures != NULL && mNumTextures > 0; + return mTextures != nullptr && mNumTextures > 0; } //! Check whether the scene contains cameras inline bool HasCameras() const { - return mCameras != NULL && mNumCameras > 0; + return mCameras != nullptr && mNumCameras > 0; } //! Check whether the scene contains animations inline bool HasAnimations() const { - return mAnimations != NULL && mNumAnimations > 0; + return mAnimations != nullptr && mNumAnimations > 0; } //! Returns a short filename from a full path diff --git a/thirdparty/assimp/include/assimp/texture.h b/thirdparty/assimp/include/assimp/texture.h index dc6cbef65c..0867659f46 100644 --- a/thirdparty/assimp/include/assimp/texture.h +++ b/thirdparty/assimp/include/assimp/texture.h @@ -5,8 +5,6 @@ Open Asset Import Library (assimp) Copyright (c) 2006-2019, assimp team - - All rights reserved. Redistribution and use of this software in source and binary forms, @@ -53,13 +51,16 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_TEXTURE_H_INC #define AI_TEXTURE_H_INC -#include "types.h" +#ifdef __GNUC__ +# pragma GCC system_header +#endif + +#include <assimp/types.h> #ifdef __cplusplus extern "C" { #endif - // -------------------------------------------------------------------------------- /** \def AI_EMBEDDED_TEXNAME_PREFIX @@ -79,7 +80,6 @@ extern "C" { # define AI_MAKE_EMBEDDED_TEXNAME(_n_) AI_EMBEDDED_TEXNAME_PREFIX # _n_ #endif - #include "./Compiler/pushpack1.h" // -------------------------------------------------------------------------------- @@ -87,8 +87,7 @@ extern "C" { * * Used by aiTexture. */ -struct aiTexel -{ +struct aiTexel { unsigned char b,g,r,a; #ifdef __cplusplus diff --git a/thirdparty/assimp/include/assimp/types.h b/thirdparty/assimp/include/assimp/types.h index 331b8cd03f..e32cae331c 100644 --- a/thirdparty/assimp/include/assimp/types.h +++ b/thirdparty/assimp/include/assimp/types.h @@ -48,22 +48,30 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_TYPES_H_INC #define AI_TYPES_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + // Some runtime headers #include <sys/types.h> #include <stddef.h> #include <string.h> #include <limits.h> +#include <stdint.h> // Our compile configuration -#include "defs.h" +#include <assimp/defs.h> // Some types moved to separate header due to size of operators -#include "vector3.h" -#include "vector2.h" -#include "color4.h" -#include "matrix3x3.h" -#include "matrix4x4.h" -#include "quaternion.h" +#include <assimp/vector3.h> +#include <assimp/vector2.h> +#include <assimp/color4.h> +#include <assimp/matrix3x3.h> +#include <assimp/matrix4x4.h> +#include <assimp/quaternion.h> + +typedef int32_t ai_int32; +typedef uint32_t ai_uint32 ; #ifdef __cplusplus #include <cstring> @@ -71,7 +79,7 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #include <string> // for aiString::Set(const std::string&) namespace Assimp { - //! @cond never +//! @cond never namespace Intern { // -------------------------------------------------------------------- /** @brief Internal helper class to utilize our internal new/delete @@ -269,8 +277,8 @@ struct aiString } /** Copy constructor */ - aiString(const aiString& rOther) : - length(rOther.length) + aiString(const aiString& rOther) + : length(rOther.length) { // Crop the string to the maximum length length = length>=MAXLEN?MAXLEN-1:length; @@ -280,7 +288,7 @@ struct aiString /** Constructor from std::string */ explicit aiString(const std::string& pString) : - length(pString.length()) + length( (ai_uint32) pString.length()) { length = length>=MAXLEN?MAXLEN-1:length; memcpy( data, pString.c_str(), length); @@ -292,15 +300,15 @@ struct aiString if( pString.length() > MAXLEN - 1) { return; } - length = pString.length(); + length = (ai_uint32)pString.length(); memcpy( data, pString.c_str(), length); data[length] = 0; } /** Copy a const char* to the aiString */ void Set( const char* sz) { - const size_t len = ::strlen(sz); - if( len > MAXLEN - 1) { + const ai_int32 len = (ai_uint32) ::strlen(sz); + if( len > (ai_int32)MAXLEN - 1) { return; } length = len; @@ -346,7 +354,7 @@ struct aiString /** Append a string to the string */ void Append (const char* app) { - const size_t len = ::strlen(app); + const ai_uint32 len = (ai_uint32) ::strlen(app); if (!len) { return; } @@ -379,7 +387,7 @@ struct aiString /** Binary length of the string excluding the terminal 0. This is NOT the * logical length of strings containing UTF-8 multi-byte sequences! It's * the number of bytes from the beginning of the string to its end.*/ - size_t length; + ai_uint32 length; /** String buffer. Size limit is MAXLEN */ char data[MAXLEN]; diff --git a/thirdparty/assimp/include/assimp/vector2.h b/thirdparty/assimp/include/assimp/vector2.h index d5ef001542..c8b1ebbbc2 100644 --- a/thirdparty/assimp/include/assimp/vector2.h +++ b/thirdparty/assimp/include/assimp/vector2.h @@ -47,6 +47,10 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_VECTOR2D_H_INC #define AI_VECTOR2D_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #ifdef __cplusplus # include <cmath> #else diff --git a/thirdparty/assimp/include/assimp/vector2.inl b/thirdparty/assimp/include/assimp/vector2.inl index 3b7a7beabb..4bbf432ff8 100644 --- a/thirdparty/assimp/include/assimp/vector2.inl +++ b/thirdparty/assimp/include/assimp/vector2.inl @@ -48,8 +48,12 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_VECTOR2D_INL_INC #define AI_VECTOR2D_INL_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #ifdef __cplusplus -#include "vector2.h" +#include <assimp/vector2.h> #include <cmath> diff --git a/thirdparty/assimp/include/assimp/vector3.h b/thirdparty/assimp/include/assimp/vector3.h index 7ff25cf0ad..fffeb12ad7 100644 --- a/thirdparty/assimp/include/assimp/vector3.h +++ b/thirdparty/assimp/include/assimp/vector3.h @@ -47,13 +47,17 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_VECTOR3D_H_INC #define AI_VECTOR3D_H_INC +#ifdef __GNUC__ +# pragma GCC system_header +#endif + #ifdef __cplusplus # include <cmath> #else # include <math.h> #endif -#include "defs.h" +#include <assimp/defs.h> #ifdef __cplusplus @@ -63,16 +67,13 @@ template<typename TReal> class aiMatrix4x4t; // --------------------------------------------------------------------------- /** Represents a three-dimensional vector. */ template <typename TReal> -class aiVector3t -{ +class aiVector3t { public: aiVector3t() AI_NO_EXCEPT : x(), y(), z() {} aiVector3t(TReal _x, TReal _y, TReal _z) : x(_x), y(_y), z(_z) {} explicit aiVector3t (TReal _xyz ) : x(_xyz), y(_xyz), z(_xyz) {} aiVector3t( const aiVector3t& o ) = default; -public: - // combined operators const aiVector3t& operator += (const aiVector3t& o); const aiVector3t& operator -= (const aiVector3t& o); @@ -97,7 +98,6 @@ public: template <typename TOther> operator aiVector3t<TOther> () const; -public: /** @brief Set the components of a vector * @param pX X component * @param pY Y component diff --git a/thirdparty/assimp/include/assimp/vector3.inl b/thirdparty/assimp/include/assimp/vector3.inl index 2fce6eddee..6682d3b32c 100644 --- a/thirdparty/assimp/include/assimp/vector3.inl +++ b/thirdparty/assimp/include/assimp/vector3.inl @@ -49,7 +49,7 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #define AI_VECTOR3D_INL_INC #ifdef __cplusplus -#include "vector3.h" +#include <assimp/vector3.h> #include <cmath> diff --git a/thirdparty/assimp/include/assimp/version.h b/thirdparty/assimp/include/assimp/version.h index c62a40e118..2fdd37a43c 100644 --- a/thirdparty/assimp/include/assimp/version.h +++ b/thirdparty/assimp/include/assimp/version.h @@ -49,7 +49,7 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. #ifndef AI_VERSION_H_INC #define AI_VERSION_H_INC -#include "defs.h" +#include <assimp/defs.h> #ifdef __cplusplus extern "C" { diff --git a/thirdparty/glad/glad.c b/thirdparty/glad/glad.c index 08c9c7e228..a1fe4d25ef 100644 --- a/thirdparty/glad/glad.c +++ b/thirdparty/glad/glad.c @@ -1,6 +1,6 @@ /* - OpenGL loader generated by glad 0.1.31 on Thu Jul 11 10:09:18 2019. + OpenGL loader generated by glad 0.1.33 on Tue Nov 12 08:44:22 2019. Language/Generator: C/C++ Specification: gl @@ -31,6 +31,9 @@ static void* get_proc(const char *namez); #if defined(_WIN32) || defined(__CYGWIN__) +#ifndef _WINDOWS_ +#undef APIENTRY +#endif #include <windows.h> static HMODULE libGL; diff --git a/thirdparty/glad/glad/glad.h b/thirdparty/glad/glad/glad.h index acf96d8cd9..fae71a865e 100644 --- a/thirdparty/glad/glad/glad.h +++ b/thirdparty/glad/glad/glad.h @@ -1,6 +1,6 @@ /* - OpenGL loader generated by glad 0.1.31 on Thu Jul 11 10:09:18 2019. + OpenGL loader generated by glad 0.1.33 on Tue Nov 12 08:44:22 2019. Language/Generator: C/C++ Specification: gl @@ -3404,7 +3404,7 @@ GLAPI PFNGLWAITSYNCPROC glad_glWaitSync; typedef void (APIENTRYP PFNGLGETINTEGER64VPROC)(GLenum pname, GLint64 *data); GLAPI PFNGLGETINTEGER64VPROC glad_glGetInteger64v; #define glGetInteger64v glad_glGetInteger64v -typedef void (APIENTRYP PFNGLGETSYNCIVPROC)(GLsync sync, GLenum pname, GLsizei bufSize, GLsizei *length, GLint *values); +typedef void (APIENTRYP PFNGLGETSYNCIVPROC)(GLsync sync, GLenum pname, GLsizei count, GLsizei *length, GLint *values); GLAPI PFNGLGETSYNCIVPROC glad_glGetSynciv; #define glGetSynciv glad_glGetSynciv typedef void (APIENTRYP PFNGLGETINTEGER64I_VPROC)(GLenum target, GLuint index, GLint64 *data); diff --git a/thirdparty/mbedtls/include/mbedtls/bn_mul.h b/thirdparty/mbedtls/include/mbedtls/bn_mul.h index c33bd8d4ab..748975ea51 100644 --- a/thirdparty/mbedtls/include/mbedtls/bn_mul.h +++ b/thirdparty/mbedtls/include/mbedtls/bn_mul.h @@ -642,7 +642,8 @@ "r6", "r7", "r8", "r9", "cc" \ ); -#elif defined (__ARM_FEATURE_DSP) && (__ARM_FEATURE_DSP == 1) +#elif (__ARM_ARCH >= 6) && \ + defined (__ARM_FEATURE_DSP) && (__ARM_FEATURE_DSP == 1) #define MULADDC_INIT \ asm( diff --git a/thirdparty/mbedtls/include/mbedtls/check_config.h b/thirdparty/mbedtls/include/mbedtls/check_config.h index b86e5807e0..6eabcc8748 100644 --- a/thirdparty/mbedtls/include/mbedtls/check_config.h +++ b/thirdparty/mbedtls/include/mbedtls/check_config.h @@ -123,7 +123,7 @@ #error "MBEDTLS_ECDSA_DETERMINISTIC defined, but not all prerequisites" #endif -#if defined(MBEDTLS_ECP_C) && ( !defined(MBEDTLS_BIGNUM_C) || ( \ +#if defined(MBEDTLS_ECP_C) && ( !defined(MBEDTLS_BIGNUM_C) || ( \ !defined(MBEDTLS_ECP_DP_SECP192R1_ENABLED) && \ !defined(MBEDTLS_ECP_DP_SECP224R1_ENABLED) && \ !defined(MBEDTLS_ECP_DP_SECP256R1_ENABLED) && \ @@ -134,7 +134,9 @@ !defined(MBEDTLS_ECP_DP_BP512R1_ENABLED) && \ !defined(MBEDTLS_ECP_DP_SECP192K1_ENABLED) && \ !defined(MBEDTLS_ECP_DP_SECP224K1_ENABLED) && \ - !defined(MBEDTLS_ECP_DP_SECP256K1_ENABLED) ) ) + !defined(MBEDTLS_ECP_DP_SECP256K1_ENABLED) && \ + !defined(MBEDTLS_ECP_DP_CURVE25519_ENABLED) && \ + !defined(MBEDTLS_ECP_DP_CURVE448_ENABLED) ) ) #error "MBEDTLS_ECP_C defined, but not all prerequisites" #endif @@ -691,7 +693,7 @@ /* * Avoid warning from -pedantic. This is a convenient place for this * workaround since this is included by every single file before the - * #if defined(MBEDTLS_xxx_C) that results in emtpy translation units. + * #if defined(MBEDTLS_xxx_C) that results in empty translation units. */ typedef int mbedtls_iso_c_forbids_empty_translation_units; diff --git a/thirdparty/mbedtls/include/mbedtls/config.h b/thirdparty/mbedtls/include/mbedtls/config.h index e16e1e53d3..0cc502cd79 100644 --- a/thirdparty/mbedtls/include/mbedtls/config.h +++ b/thirdparty/mbedtls/include/mbedtls/config.h @@ -139,7 +139,7 @@ * * System has time.h, time(), and an implementation for * mbedtls_platform_gmtime_r() (see below). - * The time needs to be correct (not necesarily very accurate, but at least + * The time needs to be correct (not necessarily very accurate, but at least * the date should be correct). This is used to verify the validity period of * X.509 certificates. * @@ -276,28 +276,52 @@ * For example, when a function accepts as input a pointer to a buffer that may * contain untrusted data, and its documentation mentions that this pointer * must not be NULL: - * - the pointer is checked to be non-NULL only if this option is enabled - * - the content of the buffer is always validated + * - The pointer is checked to be non-NULL only if this option is enabled. + * - The content of the buffer is always validated. * * When this flag is defined, if a library function receives a parameter that - * is invalid, it will: - * - invoke the macro MBEDTLS_PARAM_FAILED() which by default expands to a - * call to the function mbedtls_param_failed() - * - immediately return (with a specific error code unless the function - * returns void and can't communicate an error). - * - * When defining this flag, you also need to: - * - either provide a definition of the function mbedtls_param_failed() in - * your application (see platform_util.h for its prototype) as the library - * calls that function, but does not provide a default definition for it, - * - or provide a different definition of the macro MBEDTLS_PARAM_FAILED() - * below if the above mechanism is not flexible enough to suit your needs. - * See the documentation of this macro later in this file. + * is invalid: + * 1. The function will invoke the macro MBEDTLS_PARAM_FAILED(). + * 2. If MBEDTLS_PARAM_FAILED() did not terminate the program, the function + * will immediately return. If the function returns an Mbed TLS error code, + * the error code in this case is MBEDTLS_ERR_xxx_BAD_INPUT_DATA. + * + * When defining this flag, you also need to arrange a definition for + * MBEDTLS_PARAM_FAILED(). You can do this by any of the following methods: + * - By default, the library defines MBEDTLS_PARAM_FAILED() to call a + * function mbedtls_param_failed(), but the library does not define this + * function. If you do not make any other arrangements, you must provide + * the function mbedtls_param_failed() in your application. + * See `platform_util.h` for its prototype. + * - If you enable the macro #MBEDTLS_CHECK_PARAMS_ASSERT, then the + * library defines #MBEDTLS_PARAM_FAILED(\c cond) to be `assert(cond)`. + * You can still supply an alternative definition of + * MBEDTLS_PARAM_FAILED(), which may call `assert`. + * - If you define a macro MBEDTLS_PARAM_FAILED() before including `config.h` + * or you uncomment the definition of MBEDTLS_PARAM_FAILED() in `config.h`, + * the library will call the macro that you defined and will not supply + * its own version. Note that if MBEDTLS_PARAM_FAILED() calls `assert`, + * you need to enable #MBEDTLS_CHECK_PARAMS_ASSERT so that library source + * files include `<assert.h>`. * * Uncomment to enable validation of application-controlled parameters. */ //#define MBEDTLS_CHECK_PARAMS +/** + * \def MBEDTLS_CHECK_PARAMS_ASSERT + * + * Allow MBEDTLS_PARAM_FAILED() to call `assert`, and make it default to + * `assert`. This macro is only used if #MBEDTLS_CHECK_PARAMS is defined. + * + * If this macro is not defined, then MBEDTLS_PARAM_FAILED() defaults to + * calling a function mbedtls_param_failed(). See the documentation of + * #MBEDTLS_CHECK_PARAMS for details. + * + * Uncomment to allow MBEDTLS_PARAM_FAILED() to call `assert`. + */ +//#define MBEDTLS_CHECK_PARAMS_ASSERT + /* \} name SECTION: System support */ /** @@ -401,7 +425,7 @@ * \note Because of a signature change, the core AES encryption and decryption routines are * currently named mbedtls_aes_internal_encrypt and mbedtls_aes_internal_decrypt, * respectively. When setting up alternative implementations, these functions should - * be overriden, but the wrapper functions mbedtls_aes_decrypt and mbedtls_aes_encrypt + * be overridden, but the wrapper functions mbedtls_aes_decrypt and mbedtls_aes_encrypt * must stay untouched. * * \note If you use the AES_xxx_ALT macros, then is is recommended to also set @@ -416,6 +440,16 @@ * dependencies on them, and considering stronger message digests * and ciphers instead. * + * \warning If both MBEDTLS_ECDSA_SIGN_ALT and MBEDTLS_ECDSA_DETERMINISTIC are + * enabled, then the deterministic ECDH signature functions pass the + * the static HMAC-DRBG as RNG to mbedtls_ecdsa_sign(). Therefore + * alternative implementations should use the RNG only for generating + * the ephemeral key and nothing else. If this is not possible, then + * MBEDTLS_ECDSA_DETERMINISTIC should be disabled and an alternative + * implementation should be provided for mbedtls_ecdsa_sign_det_ext() + * (and for mbedtls_ecdsa_sign_det() too if backward compatibility is + * desirable). + * */ //#define MBEDTLS_MD2_PROCESS_ALT //#define MBEDTLS_MD4_PROCESS_ALT @@ -1558,7 +1592,7 @@ * \def MBEDTLS_SSL_SESSION_TICKETS * * Enable support for RFC 5077 session tickets in SSL. - * Client-side, provides full support for session tickets (maintainance of a + * Client-side, provides full support for session tickets (maintenance of a * session store remains the responsibility of the application, though). * Server-side, you also need to provide callbacks for writing and parsing * tickets, including authenticated encryption and key management. Example @@ -1642,9 +1676,7 @@ * * Uncomment this to enable pthread mutexes. */ -// -- GODOT start -- //#define MBEDTLS_THREADING_PTHREAD -// -- GODOT end -- /** * \def MBEDTLS_VERSION_FEATURES @@ -1726,7 +1758,7 @@ * * \warning TLS-level compression MAY REDUCE SECURITY! See for example the * CRIME attack. Before enabling this option, you should examine with care if - * CRIME or similar exploits may be a applicable to your use case. + * CRIME or similar exploits may be applicable to your use case. * * \note Currently compression can't be used with DTLS. * @@ -2838,9 +2870,7 @@ * * Enable this layer to allow use of mutexes within mbed TLS */ -// -- GODOT start -- //#define MBEDTLS_THREADING_C -// -- GODOT end -- /** * \def MBEDTLS_TIMING_C @@ -3042,7 +3072,7 @@ //#define MBEDTLS_PLATFORM_STD_TIME time /**< Default time to use, can be undefined. MBEDTLS_HAVE_TIME must be enabled */ //#define MBEDTLS_PLATFORM_STD_FPRINTF fprintf /**< Default fprintf to use, can be undefined */ //#define MBEDTLS_PLATFORM_STD_PRINTF printf /**< Default printf to use, can be undefined */ -/* Note: your snprintf must correclty zero-terminate the buffer! */ +/* Note: your snprintf must correctly zero-terminate the buffer! */ //#define MBEDTLS_PLATFORM_STD_SNPRINTF snprintf /**< Default snprintf to use, can be undefined */ //#define MBEDTLS_PLATFORM_STD_EXIT_SUCCESS 0 /**< Default exit value to use, can be undefined */ //#define MBEDTLS_PLATFORM_STD_EXIT_FAILURE 1 /**< Default exit value to use, can be undefined */ @@ -3059,20 +3089,23 @@ //#define MBEDTLS_PLATFORM_TIME_TYPE_MACRO time_t /**< Default time macro to use, can be undefined. MBEDTLS_HAVE_TIME must be enabled */ //#define MBEDTLS_PLATFORM_FPRINTF_MACRO fprintf /**< Default fprintf macro to use, can be undefined */ //#define MBEDTLS_PLATFORM_PRINTF_MACRO printf /**< Default printf macro to use, can be undefined */ -/* Note: your snprintf must correclty zero-terminate the buffer! */ +/* Note: your snprintf must correctly zero-terminate the buffer! */ //#define MBEDTLS_PLATFORM_SNPRINTF_MACRO snprintf /**< Default snprintf macro to use, can be undefined */ //#define MBEDTLS_PLATFORM_NV_SEED_READ_MACRO mbedtls_platform_std_nv_seed_read /**< Default nv_seed_read function to use, can be undefined */ //#define MBEDTLS_PLATFORM_NV_SEED_WRITE_MACRO mbedtls_platform_std_nv_seed_write /**< Default nv_seed_write function to use, can be undefined */ /** * \brief This macro is invoked by the library when an invalid parameter - * is detected that is only checked with MBEDTLS_CHECK_PARAMS + * is detected that is only checked with #MBEDTLS_CHECK_PARAMS * (see the documentation of that option for context). * - * When you leave this undefined here, a default definition is - * provided that invokes the function mbedtls_param_failed(), - * which is declared in platform_util.h for the benefit of the - * library, but that you need to define in your application. + * When you leave this undefined here, the library provides + * a default definition. If the macro #MBEDTLS_CHECK_PARAMS_ASSERT + * is defined, the default definition is `assert(cond)`, + * otherwise the default definition calls a function + * mbedtls_param_failed(). This function is declared in + * `platform_util.h` for the benefit of the library, but + * you need to define in your application. * * When you define this here, this replaces the default * definition in platform_util.h (which no longer declares the @@ -3081,6 +3114,9 @@ * particular, that all the necessary declarations are visible * from within the library - you can ensure that by providing * them in this file next to the macro definition). + * If you define this macro to call `assert`, also define + * #MBEDTLS_CHECK_PARAMS_ASSERT so that library source files + * include `<assert.h>`. * * Note that you may define this macro to expand to nothing, in * which case you don't have to worry about declarations or diff --git a/thirdparty/mbedtls/include/mbedtls/ecdsa.h b/thirdparty/mbedtls/include/mbedtls/ecdsa.h index f8b28507c2..932acc6d14 100644 --- a/thirdparty/mbedtls/include/mbedtls/ecdsa.h +++ b/thirdparty/mbedtls/include/mbedtls/ecdsa.h @@ -175,6 +175,19 @@ int mbedtls_ecdsa_sign( mbedtls_ecp_group *grp, mbedtls_mpi *r, mbedtls_mpi *s, * (SECG): SEC1 Elliptic Curve Cryptography</em>, section * 4.1.3, step 5. * + * \warning Since the output of the internal RNG is always the same for + * the same key and message, this limits the efficiency of + * blinding and leaks information through side channels. For + * secure behavior use mbedtls_ecdsa_sign_det_ext() instead. + * + * (Optimally the blinding is a random value that is different + * on every execution. In this case the blinding is still + * random from the attackers perspective, but is the same on + * each execution. This means that this blinding does not + * prevent attackers from recovering secrets by combining + * several measurement traces, but may prevent some attacks + * that exploit relationships between secret data.) + * * \see ecp.h * * \param grp The context for the elliptic curve to use. @@ -200,6 +213,52 @@ int mbedtls_ecdsa_sign_det( mbedtls_ecp_group *grp, mbedtls_mpi *r, mbedtls_mpi *s, const mbedtls_mpi *d, const unsigned char *buf, size_t blen, mbedtls_md_type_t md_alg ); +/** + * \brief This function computes the ECDSA signature of a + * previously-hashed message, deterministic version. + * + * For more information, see <em>RFC-6979: Deterministic + * Usage of the Digital Signature Algorithm (DSA) and Elliptic + * Curve Digital Signature Algorithm (ECDSA)</em>. + * + * \note If the bitlength of the message hash is larger than the + * bitlength of the group order, then the hash is truncated as + * defined in <em>Standards for Efficient Cryptography Group + * (SECG): SEC1 Elliptic Curve Cryptography</em>, section + * 4.1.3, step 5. + * + * \see ecp.h + * + * \param grp The context for the elliptic curve to use. + * This must be initialized and have group parameters + * set, for example through mbedtls_ecp_group_load(). + * \param r The MPI context in which to store the first part + * the signature. This must be initialized. + * \param s The MPI context in which to store the second part + * the signature. This must be initialized. + * \param d The private signing key. This must be initialized + * and setup, for example through mbedtls_ecp_gen_privkey(). + * \param buf The hashed content to be signed. This must be a readable + * buffer of length \p blen Bytes. It may be \c NULL if + * \p blen is zero. + * \param blen The length of \p buf in Bytes. + * \param md_alg The hash algorithm used to hash the original data. + * \param f_rng_blind The RNG function used for blinding. This must not be + * \c NULL. + * \param p_rng_blind The RNG context to be passed to \p f_rng. This may be + * \c NULL if \p f_rng doesn't need a context parameter. + * + * \return \c 0 on success. + * \return An \c MBEDTLS_ERR_ECP_XXX or \c MBEDTLS_MPI_XXX + * error code on failure. + */ +int mbedtls_ecdsa_sign_det_ext( mbedtls_ecp_group *grp, mbedtls_mpi *r, + mbedtls_mpi *s, const mbedtls_mpi *d, + const unsigned char *buf, size_t blen, + mbedtls_md_type_t md_alg, + int (*f_rng_blind)(void *, unsigned char *, + size_t), + void *p_rng_blind ); #endif /* MBEDTLS_ECDSA_DETERMINISTIC */ /** diff --git a/thirdparty/mbedtls/include/mbedtls/hkdf.h b/thirdparty/mbedtls/include/mbedtls/hkdf.h index 40ee64eb03..bcafe42513 100644 --- a/thirdparty/mbedtls/include/mbedtls/hkdf.h +++ b/thirdparty/mbedtls/include/mbedtls/hkdf.h @@ -7,22 +7,22 @@ * specified by RFC 5869. */ /* - * Copyright (C) 2016-2018, ARM Limited, All Rights Reserved - * SPDX-License-Identifier: Apache-2.0 + * Copyright (C) 2016-2019, ARM Limited, All Rights Reserved + * SPDX-License-Identifier: Apache-2.0 * - * Licensed under the Apache License, Version 2.0 (the "License"); you may - * not use this file except in compliance with the License. - * You may obtain a copy of the License at + * Licensed under the Apache License, Version 2.0 (the "License"); you may + * not use this file except in compliance with the License. + * You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + * http://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT - * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT + * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. * - * This file is part of mbed TLS (https://tls.mbed.org) + * This file is part of mbed TLS (https://tls.mbed.org) */ #ifndef MBEDTLS_HKDF_H #define MBEDTLS_HKDF_H diff --git a/thirdparty/mbedtls/include/mbedtls/hmac_drbg.h b/thirdparty/mbedtls/include/mbedtls/hmac_drbg.h index 7eae32bbd6..f1289cb306 100644 --- a/thirdparty/mbedtls/include/mbedtls/hmac_drbg.h +++ b/thirdparty/mbedtls/include/mbedtls/hmac_drbg.h @@ -82,7 +82,7 @@ extern "C" { */ typedef struct mbedtls_hmac_drbg_context { - /* Working state: the key K is not stored explicitely, + /* Working state: the key K is not stored explicitly, * but is implied by the HMAC context */ mbedtls_md_context_t md_ctx; /*!< HMAC context (inc. K) */ unsigned char V[MBEDTLS_MD_MAX_SIZE]; /*!< V in the spec */ diff --git a/thirdparty/mbedtls/include/mbedtls/pk.h b/thirdparty/mbedtls/include/mbedtls/pk.h index 91950f9407..136427503a 100644 --- a/thirdparty/mbedtls/include/mbedtls/pk.h +++ b/thirdparty/mbedtls/include/mbedtls/pk.h @@ -416,6 +416,10 @@ int mbedtls_pk_verify_ext( mbedtls_pk_type_t type, const void *options, * * \note For RSA, md_alg may be MBEDTLS_MD_NONE if hash_len != 0. * For ECDSA, md_alg may never be MBEDTLS_MD_NONE. + * + * \note In order to ensure enough space for the signature, the + * \p sig buffer size must be of at least + * `max(MBEDTLS_ECDSA_MAX_LEN, MBEDTLS_MPI_MAX_SIZE)` bytes. */ int mbedtls_pk_sign( mbedtls_pk_context *ctx, mbedtls_md_type_t md_alg, const unsigned char *hash, size_t hash_len, @@ -430,6 +434,10 @@ int mbedtls_pk_sign( mbedtls_pk_context *ctx, mbedtls_md_type_t md_alg, * \c mbedtls_ecp_set_max_ops() to reduce blocking for ECC * operations. For RSA, same as \c mbedtls_pk_sign(). * + * \note In order to ensure enough space for the signature, the + * \p sig buffer size must be of at least + * `max(MBEDTLS_ECDSA_MAX_LEN, MBEDTLS_MPI_MAX_SIZE)` bytes. + * * \param ctx The PK context to use. It must have been set up * with a private key. * \param md_alg Hash algorithm used (see notes) diff --git a/thirdparty/mbedtls/include/mbedtls/platform_util.h b/thirdparty/mbedtls/include/mbedtls/platform_util.h index dba6d45982..09d0965182 100644 --- a/thirdparty/mbedtls/include/mbedtls/platform_util.h +++ b/thirdparty/mbedtls/include/mbedtls/platform_util.h @@ -43,6 +43,12 @@ extern "C" { #if defined(MBEDTLS_CHECK_PARAMS) +#if defined(MBEDTLS_CHECK_PARAMS_ASSERT) +/* Allow the user to define MBEDTLS_PARAM_FAILED to something like assert + * (which is what our config.h suggests). */ +#include <assert.h> +#endif /* MBEDTLS_CHECK_PARAMS_ASSERT */ + #if defined(MBEDTLS_PARAM_FAILED) /** An alternative definition of MBEDTLS_PARAM_FAILED has been set in config.h. * @@ -50,6 +56,11 @@ extern "C" { * MBEDTLS_PARAM_FAILED() will expand to a call to mbedtls_param_failed(). */ #define MBEDTLS_PARAM_FAILED_ALT + +#elif defined(MBEDTLS_CHECK_PARAMS_ASSERT) +#define MBEDTLS_PARAM_FAILED( cond ) assert( cond ) +#define MBEDTLS_PARAM_FAILED_ALT + #else /* MBEDTLS_PARAM_FAILED */ #define MBEDTLS_PARAM_FAILED( cond ) \ mbedtls_param_failed( #cond, __FILE__, __LINE__ ) diff --git a/thirdparty/mbedtls/include/mbedtls/rsa.h b/thirdparty/mbedtls/include/mbedtls/rsa.h index 906c427332..35bacd8763 100644 --- a/thirdparty/mbedtls/include/mbedtls/rsa.h +++ b/thirdparty/mbedtls/include/mbedtls/rsa.h @@ -150,13 +150,13 @@ mbedtls_rsa_context; * \note The choice of padding mode is strictly enforced for private key * operations, since there might be security concerns in * mixing padding modes. For public key operations it is - * a default value, which can be overriden by calling specific + * a default value, which can be overridden by calling specific * \c rsa_rsaes_xxx or \c rsa_rsassa_xxx functions. * * \note The hash selected in \p hash_id is always used for OEAP * encryption. For PSS signatures, it is always used for - * making signatures, but can be overriden for verifying them. - * If set to #MBEDTLS_MD_NONE, it is always overriden. + * making signatures, but can be overridden for verifying them. + * If set to #MBEDTLS_MD_NONE, it is always overridden. * * \param ctx The RSA context to initialize. This must not be \c NULL. * \param padding The padding mode to use. This must be either @@ -904,7 +904,8 @@ int mbedtls_rsa_rsaes_oaep_decrypt( mbedtls_rsa_context *ctx, * the size of the hash corresponding to \p md_alg. * \param sig The buffer to hold the signature. This must be a writable * buffer of length \c ctx->len Bytes. For example, \c 256 Bytes - * for an 2048-bit RSA modulus. + * for an 2048-bit RSA modulus. A buffer length of + * #MBEDTLS_MPI_MAX_SIZE is always safe. * * \return \c 0 if the signing operation was successful. * \return An \c MBEDTLS_ERR_RSA_XXX error code on failure. @@ -951,7 +952,8 @@ int mbedtls_rsa_pkcs1_sign( mbedtls_rsa_context *ctx, * the size of the hash corresponding to \p md_alg. * \param sig The buffer to hold the signature. This must be a writable * buffer of length \c ctx->len Bytes. For example, \c 256 Bytes - * for an 2048-bit RSA modulus. + * for an 2048-bit RSA modulus. A buffer length of + * #MBEDTLS_MPI_MAX_SIZE is always safe. * * \return \c 0 if the signing operation was successful. * \return An \c MBEDTLS_ERR_RSA_XXX error code on failure. @@ -1012,7 +1014,8 @@ int mbedtls_rsa_rsassa_pkcs1_v15_sign( mbedtls_rsa_context *ctx, * the size of the hash corresponding to \p md_alg. * \param sig The buffer to hold the signature. This must be a writable * buffer of length \c ctx->len Bytes. For example, \c 256 Bytes - * for an 2048-bit RSA modulus. + * for an 2048-bit RSA modulus. A buffer length of + * #MBEDTLS_MPI_MAX_SIZE is always safe. * * \return \c 0 if the signing operation was successful. * \return An \c MBEDTLS_ERR_RSA_XXX error code on failure. diff --git a/thirdparty/mbedtls/include/mbedtls/ssl.h b/thirdparty/mbedtls/include/mbedtls/ssl.h index d31f6cdd56..1adf9608cc 100644 --- a/thirdparty/mbedtls/include/mbedtls/ssl.h +++ b/thirdparty/mbedtls/include/mbedtls/ssl.h @@ -2033,7 +2033,7 @@ void mbedtls_ssl_conf_ca_chain( mbedtls_ssl_config *conf, * provision more than one cert/key pair (eg one ECDSA, one * RSA with SHA-256, one RSA with SHA-1). An adequate * certificate will be selected according to the client's - * advertised capabilities. In case mutliple certificates are + * advertised capabilities. In case multiple certificates are * adequate, preference is given to the one set by the first * call to this function, then second, etc. * @@ -3206,7 +3206,7 @@ void mbedtls_ssl_free( mbedtls_ssl_context *ssl ); * mbedtls_ssl_config_defaults() or mbedtls_ssl_config_free(). * * \note You need to call mbedtls_ssl_config_defaults() unless you - * manually set all of the relevent fields yourself. + * manually set all of the relevant fields yourself. * * \param conf SSL configuration context */ diff --git a/thirdparty/mbedtls/include/mbedtls/ssl_ticket.h b/thirdparty/mbedtls/include/mbedtls/ssl_ticket.h index a84e7816e4..774a007a9f 100644 --- a/thirdparty/mbedtls/include/mbedtls/ssl_ticket.h +++ b/thirdparty/mbedtls/include/mbedtls/ssl_ticket.h @@ -117,14 +117,14 @@ int mbedtls_ssl_ticket_setup( mbedtls_ssl_ticket_context *ctx, /** * \brief Implementation of the ticket write callback * - * \note See \c mbedlts_ssl_ticket_write_t for description + * \note See \c mbedtls_ssl_ticket_write_t for description */ mbedtls_ssl_ticket_write_t mbedtls_ssl_ticket_write; /** * \brief Implementation of the ticket parse callback * - * \note See \c mbedlts_ssl_ticket_parse_t for description + * \note See \c mbedtls_ssl_ticket_parse_t for description */ mbedtls_ssl_ticket_parse_t mbedtls_ssl_ticket_parse; diff --git a/thirdparty/mbedtls/include/mbedtls/version.h b/thirdparty/mbedtls/include/mbedtls/version.h index ef8e4c1f4f..b4eef71e50 100644 --- a/thirdparty/mbedtls/include/mbedtls/version.h +++ b/thirdparty/mbedtls/include/mbedtls/version.h @@ -40,16 +40,16 @@ */ #define MBEDTLS_VERSION_MAJOR 2 #define MBEDTLS_VERSION_MINOR 16 -#define MBEDTLS_VERSION_PATCH 2 +#define MBEDTLS_VERSION_PATCH 3 /** * The single version number has the following structure: * MMNNPP00 * Major version | Minor version | Patch version */ -#define MBEDTLS_VERSION_NUMBER 0x02100200 -#define MBEDTLS_VERSION_STRING "2.16.2" -#define MBEDTLS_VERSION_STRING_FULL "mbed TLS 2.16.2" +#define MBEDTLS_VERSION_NUMBER 0x02100300 +#define MBEDTLS_VERSION_STRING "2.16.3" +#define MBEDTLS_VERSION_STRING_FULL "mbed TLS 2.16.3" #if defined(MBEDTLS_VERSION_C) diff --git a/thirdparty/mbedtls/include/mbedtls/x509.h b/thirdparty/mbedtls/include/mbedtls/x509.h index 9ae825c183..63aae32d87 100644 --- a/thirdparty/mbedtls/include/mbedtls/x509.h +++ b/thirdparty/mbedtls/include/mbedtls/x509.h @@ -77,7 +77,7 @@ #define MBEDTLS_ERR_X509_ALLOC_FAILED -0x2880 /**< Allocation of memory failed. */ #define MBEDTLS_ERR_X509_FILE_IO_ERROR -0x2900 /**< Read/write of file failed. */ #define MBEDTLS_ERR_X509_BUFFER_TOO_SMALL -0x2980 /**< Destination buffer is too small. */ -#define MBEDTLS_ERR_X509_FATAL_ERROR -0x3000 /**< A fatal error occured, eg the chain is too long or the vrfy callback failed. */ +#define MBEDTLS_ERR_X509_FATAL_ERROR -0x3000 /**< A fatal error occurred, eg the chain is too long or the vrfy callback failed. */ /* \} name */ /** @@ -250,7 +250,7 @@ int mbedtls_x509_serial_gets( char *buf, size_t size, const mbedtls_x509_buf *se * * \param to mbedtls_x509_time to check * - * \return 1 if the given time is in the past or an error occured, + * \return 1 if the given time is in the past or an error occurred, * 0 otherwise. */ int mbedtls_x509_time_is_past( const mbedtls_x509_time *to ); @@ -264,7 +264,7 @@ int mbedtls_x509_time_is_past( const mbedtls_x509_time *to ); * * \param from mbedtls_x509_time to check * - * \return 1 if the given time is in the future or an error occured, + * \return 1 if the given time is in the future or an error occurred, * 0 otherwise. */ int mbedtls_x509_time_is_future( const mbedtls_x509_time *from ); diff --git a/thirdparty/mbedtls/include/mbedtls/x509_crl.h b/thirdparty/mbedtls/include/mbedtls/x509_crl.h index 08a4283a67..fa838d68cb 100644 --- a/thirdparty/mbedtls/include/mbedtls/x509_crl.h +++ b/thirdparty/mbedtls/include/mbedtls/x509_crl.h @@ -111,7 +111,7 @@ int mbedtls_x509_crl_parse_der( mbedtls_x509_crl *chain, /** * \brief Parse one or more CRLs and append them to the chained list * - * \note Mutliple CRLs are accepted only if using PEM format + * \note Multiple CRLs are accepted only if using PEM format * * \param chain points to the start of the chain * \param buf buffer holding the CRL data in PEM or DER format @@ -126,7 +126,7 @@ int mbedtls_x509_crl_parse( mbedtls_x509_crl *chain, const unsigned char *buf, s /** * \brief Load one or more CRLs and append them to the chained list * - * \note Mutliple CRLs are accepted only if using PEM format + * \note Multiple CRLs are accepted only if using PEM format * * \param chain points to the start of the chain * \param path filename to read the CRLs from (in PEM or DER encoding) diff --git a/thirdparty/mbedtls/library/bignum.c b/thirdparty/mbedtls/library/bignum.c index 41946183c5..d1717e9435 100644 --- a/thirdparty/mbedtls/library/bignum.c +++ b/thirdparty/mbedtls/library/bignum.c @@ -742,10 +742,15 @@ cleanup: static mbedtls_mpi_uint mpi_uint_bigendian_to_host_c( mbedtls_mpi_uint x ) { uint8_t i; + unsigned char *x_ptr; mbedtls_mpi_uint tmp = 0; - /* This works regardless of the endianness. */ - for( i = 0; i < ciL; i++, x >>= 8 ) - tmp |= ( x & 0xFF ) << ( ( ciL - 1 - i ) << 3 ); + + for( i = 0, x_ptr = (unsigned char*) &x; i < ciL; i++, x_ptr++ ) + { + tmp <<= CHAR_BIT; + tmp |= (mbedtls_mpi_uint) *x_ptr; + } + return( tmp ); } @@ -2351,7 +2356,8 @@ static int mpi_miller_rabin( const mbedtls_mpi *X, size_t rounds, } if (count++ > 30) { - return MBEDTLS_ERR_MPI_NOT_ACCEPTABLE; + ret = MBEDTLS_ERR_MPI_NOT_ACCEPTABLE; + goto cleanup; } } while ( mbedtls_mpi_cmp_mpi( &A, &W ) >= 0 || diff --git a/thirdparty/mbedtls/library/certs.c b/thirdparty/mbedtls/library/certs.c index b07fd8a3a1..80ab0b9d6c 100644 --- a/thirdparty/mbedtls/library/certs.c +++ b/thirdparty/mbedtls/library/certs.c @@ -46,75 +46,67 @@ /* BEGIN FILE string macro TEST_CA_CRT_EC_PEM tests/data_files/test-ca2.crt */ #define TEST_CA_CRT_EC_PEM \ "-----BEGIN CERTIFICATE-----\r\n" \ - "MIICUjCCAdegAwIBAgIJAMFD4n5iQ8zoMAoGCCqGSM49BAMCMD4xCzAJBgNVBAYT\r\n" \ - "Ak5MMREwDwYDVQQKEwhQb2xhclNTTDEcMBoGA1UEAxMTUG9sYXJzc2wgVGVzdCBF\r\n" \ - "QyBDQTAeFw0xMzA5MjQxNTQ5NDhaFw0yMzA5MjIxNTQ5NDhaMD4xCzAJBgNVBAYT\r\n" \ - "Ak5MMREwDwYDVQQKEwhQb2xhclNTTDEcMBoGA1UEAxMTUG9sYXJzc2wgVGVzdCBF\r\n" \ - "QyBDQTB2MBAGByqGSM49AgEGBSuBBAAiA2IABMPaKzRBN1gvh1b+/Im6KUNLTuBu\r\n" \ - "ww5XUzM5WNRStJGVOQsj318XJGJI/BqVKc4sLYfCiFKAr9ZqqyHduNMcbli4yuiy\r\n" \ - "aY7zQa0pw7RfdadHb9UZKVVpmlM7ILRmFmAzHqOBoDCBnTAdBgNVHQ4EFgQUnW0g\r\n" \ - "JEkBPyvLeLUZvH4kydv7NnwwbgYDVR0jBGcwZYAUnW0gJEkBPyvLeLUZvH4kydv7\r\n" \ - "NnyhQqRAMD4xCzAJBgNVBAYTAk5MMREwDwYDVQQKEwhQb2xhclNTTDEcMBoGA1UE\r\n" \ - "AxMTUG9sYXJzc2wgVGVzdCBFQyBDQYIJAMFD4n5iQ8zoMAwGA1UdEwQFMAMBAf8w\r\n" \ - "CgYIKoZIzj0EAwIDaQAwZgIxAMO0YnNWKJUAfXgSJtJxexn4ipg+kv4znuR50v56\r\n" \ - "t4d0PCu412mUC6Nnd7izvtE2MgIxAP1nnJQjZ8BWukszFQDG48wxCCyci9qpdSMv\r\n" \ - "uCjn8pwUOkABXK8Mss90fzCfCEOtIA==\r\n" \ + "MIICBDCCAYigAwIBAgIJAMFD4n5iQ8zoMAwGCCqGSM49BAMCBQAwPjELMAkGA1UE\r\n" \ + "BhMCTkwxETAPBgNVBAoMCFBvbGFyU1NMMRwwGgYDVQQDDBNQb2xhcnNzbCBUZXN0\r\n" \ + "IEVDIENBMB4XDTE5MDIxMDE0NDQwMFoXDTI5MDIxMDE0NDQwMFowPjELMAkGA1UE\r\n" \ + "BhMCTkwxETAPBgNVBAoMCFBvbGFyU1NMMRwwGgYDVQQDDBNQb2xhcnNzbCBUZXN0\r\n" \ + "IEVDIENBMHYwEAYHKoZIzj0CAQYFK4EEACIDYgAEw9orNEE3WC+HVv78ibopQ0tO\r\n" \ + "4G7DDldTMzlY1FK0kZU5CyPfXxckYkj8GpUpziwth8KIUoCv1mqrId240xxuWLjK\r\n" \ + "6LJpjvNBrSnDtF91p0dv1RkpVWmaUzsgtGYWYDMeo1AwTjAMBgNVHRMEBTADAQH/\r\n" \ + "MB0GA1UdDgQWBBSdbSAkSQE/K8t4tRm8fiTJ2/s2fDAfBgNVHSMEGDAWgBSdbSAk\r\n" \ + "SQE/K8t4tRm8fiTJ2/s2fDAMBggqhkjOPQQDAgUAA2gAMGUCMFHKrjAPpHB0BN1a\r\n" \ + "LH8TwcJ3vh0AxeKZj30mRdOKBmg/jLS3rU3g8VQBHpn8sOTTBwIxANxPO5AerimZ\r\n" \ + "hCjMe0d4CTHf1gFZMF70+IqEP+o5VHsIp2Cqvflb0VGWFC5l9a4cQg==\r\n" \ "-----END CERTIFICATE-----\r\n" /* END FILE */ /* This is generated from tests/data_files/test-ca2.crt.der using `xxd -i`. */ /* BEGIN FILE binary macro TEST_CA_CRT_EC_DER tests/data_files/test-ca2.crt.der */ #define TEST_CA_CRT_EC_DER { \ - 0x30, 0x82, 0x02, 0x52, 0x30, 0x82, 0x01, 0xd7, 0xa0, 0x03, 0x02, 0x01, \ - 0x02, 0x02, 0x09, 0x00, 0xc1, 0x43, 0xe2, 0x7e, 0x62, 0x43, 0xcc, 0xe8, \ - 0x30, 0x0a, 0x06, 0x08, 0x2a, 0x86, 0x48, 0xce, 0x3d, 0x04, 0x03, 0x02, \ - 0x30, 0x3e, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, \ - 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, \ - 0x13, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, 0x31, 0x1c, \ - 0x30, 0x1a, 0x06, 0x03, 0x55, 0x04, 0x03, 0x13, 0x13, 0x50, 0x6f, 0x6c, \ - 0x61, 0x72, 0x73, 0x73, 0x6c, 0x20, 0x54, 0x65, 0x73, 0x74, 0x20, 0x45, \ - 0x43, 0x20, 0x43, 0x41, 0x30, 0x1e, 0x17, 0x0d, 0x31, 0x33, 0x30, 0x39, \ - 0x32, 0x34, 0x31, 0x35, 0x34, 0x39, 0x34, 0x38, 0x5a, 0x17, 0x0d, 0x32, \ - 0x33, 0x30, 0x39, 0x32, 0x32, 0x31, 0x35, 0x34, 0x39, 0x34, 0x38, 0x5a, \ - 0x30, 0x3e, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, \ - 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, \ - 0x13, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, 0x31, 0x1c, \ - 0x30, 0x1a, 0x06, 0x03, 0x55, 0x04, 0x03, 0x13, 0x13, 0x50, 0x6f, 0x6c, \ - 0x61, 0x72, 0x73, 0x73, 0x6c, 0x20, 0x54, 0x65, 0x73, 0x74, 0x20, 0x45, \ - 0x43, 0x20, 0x43, 0x41, 0x30, 0x76, 0x30, 0x10, 0x06, 0x07, 0x2a, 0x86, \ - 0x48, 0xce, 0x3d, 0x02, 0x01, 0x06, 0x05, 0x2b, 0x81, 0x04, 0x00, 0x22, \ - 0x03, 0x62, 0x00, 0x04, 0xc3, 0xda, 0x2b, 0x34, 0x41, 0x37, 0x58, 0x2f, \ - 0x87, 0x56, 0xfe, 0xfc, 0x89, 0xba, 0x29, 0x43, 0x4b, 0x4e, 0xe0, 0x6e, \ - 0xc3, 0x0e, 0x57, 0x53, 0x33, 0x39, 0x58, 0xd4, 0x52, 0xb4, 0x91, 0x95, \ - 0x39, 0x0b, 0x23, 0xdf, 0x5f, 0x17, 0x24, 0x62, 0x48, 0xfc, 0x1a, 0x95, \ - 0x29, 0xce, 0x2c, 0x2d, 0x87, 0xc2, 0x88, 0x52, 0x80, 0xaf, 0xd6, 0x6a, \ - 0xab, 0x21, 0xdd, 0xb8, 0xd3, 0x1c, 0x6e, 0x58, 0xb8, 0xca, 0xe8, 0xb2, \ - 0x69, 0x8e, 0xf3, 0x41, 0xad, 0x29, 0xc3, 0xb4, 0x5f, 0x75, 0xa7, 0x47, \ - 0x6f, 0xd5, 0x19, 0x29, 0x55, 0x69, 0x9a, 0x53, 0x3b, 0x20, 0xb4, 0x66, \ - 0x16, 0x60, 0x33, 0x1e, 0xa3, 0x81, 0xa0, 0x30, 0x81, 0x9d, 0x30, 0x1d, \ - 0x06, 0x03, 0x55, 0x1d, 0x0e, 0x04, 0x16, 0x04, 0x14, 0x9d, 0x6d, 0x20, \ - 0x24, 0x49, 0x01, 0x3f, 0x2b, 0xcb, 0x78, 0xb5, 0x19, 0xbc, 0x7e, 0x24, \ - 0xc9, 0xdb, 0xfb, 0x36, 0x7c, 0x30, 0x6e, 0x06, 0x03, 0x55, 0x1d, 0x23, \ - 0x04, 0x67, 0x30, 0x65, 0x80, 0x14, 0x9d, 0x6d, 0x20, 0x24, 0x49, 0x01, \ - 0x3f, 0x2b, 0xcb, 0x78, 0xb5, 0x19, 0xbc, 0x7e, 0x24, 0xc9, 0xdb, 0xfb, \ - 0x36, 0x7c, 0xa1, 0x42, 0xa4, 0x40, 0x30, 0x3e, 0x31, 0x0b, 0x30, 0x09, \ - 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, \ - 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, 0x13, 0x08, 0x50, 0x6f, 0x6c, 0x61, \ - 0x72, 0x53, 0x53, 0x4c, 0x31, 0x1c, 0x30, 0x1a, 0x06, 0x03, 0x55, 0x04, \ - 0x03, 0x13, 0x13, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x73, 0x73, 0x6c, 0x20, \ - 0x54, 0x65, 0x73, 0x74, 0x20, 0x45, 0x43, 0x20, 0x43, 0x41, 0x82, 0x09, \ - 0x00, 0xc1, 0x43, 0xe2, 0x7e, 0x62, 0x43, 0xcc, 0xe8, 0x30, 0x0c, 0x06, \ - 0x03, 0x55, 0x1d, 0x13, 0x04, 0x05, 0x30, 0x03, 0x01, 0x01, 0xff, 0x30, \ - 0x0a, 0x06, 0x08, 0x2a, 0x86, 0x48, 0xce, 0x3d, 0x04, 0x03, 0x02, 0x03, \ - 0x69, 0x00, 0x30, 0x66, 0x02, 0x31, 0x00, 0xc3, 0xb4, 0x62, 0x73, 0x56, \ - 0x28, 0x95, 0x00, 0x7d, 0x78, 0x12, 0x26, 0xd2, 0x71, 0x7b, 0x19, 0xf8, \ - 0x8a, 0x98, 0x3e, 0x92, 0xfe, 0x33, 0x9e, 0xe4, 0x79, 0xd2, 0xfe, 0x7a, \ - 0xb7, 0x87, 0x74, 0x3c, 0x2b, 0xb8, 0xd7, 0x69, 0x94, 0x0b, 0xa3, 0x67, \ - 0x77, 0xb8, 0xb3, 0xbe, 0xd1, 0x36, 0x32, 0x02, 0x31, 0x00, 0xfd, 0x67, \ - 0x9c, 0x94, 0x23, 0x67, 0xc0, 0x56, 0xba, 0x4b, 0x33, 0x15, 0x00, 0xc6, \ - 0xe3, 0xcc, 0x31, 0x08, 0x2c, 0x9c, 0x8b, 0xda, 0xa9, 0x75, 0x23, 0x2f, \ - 0xb8, 0x28, 0xe7, 0xf2, 0x9c, 0x14, 0x3a, 0x40, 0x01, 0x5c, 0xaf, 0x0c, \ - 0xb2, 0xcf, 0x74, 0x7f, 0x30, 0x9f, 0x08, 0x43, 0xad, 0x20 \ + 0x30, 0x82, 0x02, 0x04, 0x30, 0x82, 0x01, 0x88, 0xa0, 0x03, 0x02, 0x01, \ + 0x02, 0x02, 0x09, 0x00, 0xc1, 0x43, 0xe2, 0x7e, 0x62, 0x43, 0xcc, 0xe8, \ + 0x30, 0x0c, 0x06, 0x08, 0x2a, 0x86, 0x48, 0xce, 0x3d, 0x04, 0x03, 0x02, \ + 0x05, 0x00, 0x30, 0x3e, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, 0x55, 0x04, \ + 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, 0x03, 0x55, \ + 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, \ + 0x31, 0x1c, 0x30, 0x1a, 0x06, 0x03, 0x55, 0x04, 0x03, 0x0c, 0x13, 0x50, \ + 0x6f, 0x6c, 0x61, 0x72, 0x73, 0x73, 0x6c, 0x20, 0x54, 0x65, 0x73, 0x74, \ + 0x20, 0x45, 0x43, 0x20, 0x43, 0x41, 0x30, 0x1e, 0x17, 0x0d, 0x31, 0x39, \ + 0x30, 0x32, 0x31, 0x30, 0x31, 0x34, 0x34, 0x34, 0x30, 0x30, 0x5a, 0x17, \ + 0x0d, 0x32, 0x39, 0x30, 0x32, 0x31, 0x30, 0x31, 0x34, 0x34, 0x34, 0x30, \ + 0x30, 0x5a, 0x30, 0x3e, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, 0x55, 0x04, \ + 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, 0x03, 0x55, \ + 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, \ + 0x31, 0x1c, 0x30, 0x1a, 0x06, 0x03, 0x55, 0x04, 0x03, 0x0c, 0x13, 0x50, \ + 0x6f, 0x6c, 0x61, 0x72, 0x73, 0x73, 0x6c, 0x20, 0x54, 0x65, 0x73, 0x74, \ + 0x20, 0x45, 0x43, 0x20, 0x43, 0x41, 0x30, 0x76, 0x30, 0x10, 0x06, 0x07, \ + 0x2a, 0x86, 0x48, 0xce, 0x3d, 0x02, 0x01, 0x06, 0x05, 0x2b, 0x81, 0x04, \ + 0x00, 0x22, 0x03, 0x62, 0x00, 0x04, 0xc3, 0xda, 0x2b, 0x34, 0x41, 0x37, \ + 0x58, 0x2f, 0x87, 0x56, 0xfe, 0xfc, 0x89, 0xba, 0x29, 0x43, 0x4b, 0x4e, \ + 0xe0, 0x6e, 0xc3, 0x0e, 0x57, 0x53, 0x33, 0x39, 0x58, 0xd4, 0x52, 0xb4, \ + 0x91, 0x95, 0x39, 0x0b, 0x23, 0xdf, 0x5f, 0x17, 0x24, 0x62, 0x48, 0xfc, \ + 0x1a, 0x95, 0x29, 0xce, 0x2c, 0x2d, 0x87, 0xc2, 0x88, 0x52, 0x80, 0xaf, \ + 0xd6, 0x6a, 0xab, 0x21, 0xdd, 0xb8, 0xd3, 0x1c, 0x6e, 0x58, 0xb8, 0xca, \ + 0xe8, 0xb2, 0x69, 0x8e, 0xf3, 0x41, 0xad, 0x29, 0xc3, 0xb4, 0x5f, 0x75, \ + 0xa7, 0x47, 0x6f, 0xd5, 0x19, 0x29, 0x55, 0x69, 0x9a, 0x53, 0x3b, 0x20, \ + 0xb4, 0x66, 0x16, 0x60, 0x33, 0x1e, 0xa3, 0x50, 0x30, 0x4e, 0x30, 0x0c, \ + 0x06, 0x03, 0x55, 0x1d, 0x13, 0x04, 0x05, 0x30, 0x03, 0x01, 0x01, 0xff, \ + 0x30, 0x1d, 0x06, 0x03, 0x55, 0x1d, 0x0e, 0x04, 0x16, 0x04, 0x14, 0x9d, \ + 0x6d, 0x20, 0x24, 0x49, 0x01, 0x3f, 0x2b, 0xcb, 0x78, 0xb5, 0x19, 0xbc, \ + 0x7e, 0x24, 0xc9, 0xdb, 0xfb, 0x36, 0x7c, 0x30, 0x1f, 0x06, 0x03, 0x55, \ + 0x1d, 0x23, 0x04, 0x18, 0x30, 0x16, 0x80, 0x14, 0x9d, 0x6d, 0x20, 0x24, \ + 0x49, 0x01, 0x3f, 0x2b, 0xcb, 0x78, 0xb5, 0x19, 0xbc, 0x7e, 0x24, 0xc9, \ + 0xdb, 0xfb, 0x36, 0x7c, 0x30, 0x0c, 0x06, 0x08, 0x2a, 0x86, 0x48, 0xce, \ + 0x3d, 0x04, 0x03, 0x02, 0x05, 0x00, 0x03, 0x68, 0x00, 0x30, 0x65, 0x02, \ + 0x30, 0x51, 0xca, 0xae, 0x30, 0x0f, 0xa4, 0x70, 0x74, 0x04, 0xdd, 0x5a, \ + 0x2c, 0x7f, 0x13, 0xc1, 0xc2, 0x77, 0xbe, 0x1d, 0x00, 0xc5, 0xe2, 0x99, \ + 0x8f, 0x7d, 0x26, 0x45, 0xd3, 0x8a, 0x06, 0x68, 0x3f, 0x8c, 0xb4, 0xb7, \ + 0xad, 0x4d, 0xe0, 0xf1, 0x54, 0x01, 0x1e, 0x99, 0xfc, 0xb0, 0xe4, 0xd3, \ + 0x07, 0x02, 0x31, 0x00, 0xdc, 0x4f, 0x3b, 0x90, 0x1e, 0xae, 0x29, 0x99, \ + 0x84, 0x28, 0xcc, 0x7b, 0x47, 0x78, 0x09, 0x31, 0xdf, 0xd6, 0x01, 0x59, \ + 0x30, 0x5e, 0xf4, 0xf8, 0x8a, 0x84, 0x3f, 0xea, 0x39, 0x54, 0x7b, 0x08, \ + 0xa7, 0x60, 0xaa, 0xbd, 0xf9, 0x5b, 0xd1, 0x51, 0x96, 0x14, 0x2e, 0x65, \ + 0xf5, 0xae, 0x1c, 0x42 \ } /* END FILE */ @@ -160,7 +152,7 @@ "-----BEGIN CERTIFICATE-----\r\n" \ "MIIDQTCCAimgAwIBAgIBAzANBgkqhkiG9w0BAQsFADA7MQswCQYDVQQGEwJOTDER\r\n" \ "MA8GA1UECgwIUG9sYXJTU0wxGTAXBgNVBAMMEFBvbGFyU1NMIFRlc3QgQ0EwHhcN\r\n" \ - "MTEwMjEyMTQ0NDAwWhcNMjEwMjEyMTQ0NDAwWjA7MQswCQYDVQQGEwJOTDERMA8G\r\n" \ + "MTkwMjEwMTQ0NDAwWhcNMjkwMjEwMTQ0NDAwWjA7MQswCQYDVQQGEwJOTDERMA8G\r\n" \ "A1UECgwIUG9sYXJTU0wxGTAXBgNVBAMMEFBvbGFyU1NMIFRlc3QgQ0EwggEiMA0G\r\n" \ "CSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDA3zf8F7vglp0/ht6WMn1EpRagzSHx\r\n" \ "mdTs6st8GFgIlKXsm8WL3xoemTiZhx57wI053zhdcHgH057Zk+i5clHFzqMwUqny\r\n" \ @@ -170,12 +162,12 @@ "KNF+AksjoBXyOGVkCeoMbo4bF6BxyLObyavpw/LPh5aPgAIynplYb6LVAgMBAAGj\r\n" \ "UDBOMAwGA1UdEwQFMAMBAf8wHQYDVR0OBBYEFLRa5KWz3tJS9rnVppUP6z68x/3/\r\n" \ "MB8GA1UdIwQYMBaAFLRa5KWz3tJS9rnVppUP6z68x/3/MA0GCSqGSIb3DQEBCwUA\r\n" \ - "A4IBAQB2W2dIy4q4KysbrTL4HIaOqu62RceGuQ/KhyiI6O0ndCtQ/PgCBqHHTP8u\r\n" \ - "8F1X2ivb60ynHV6baMLPI4Kf1k4MONtLSf/++1qh0Gdycd3A8IDAfy0YnC1F3OPK\r\n" \ - "vWO/cZGitKoTbEpP4y4Rng3sFCDndRCWIRIDOEEW/H3lCcfL7sOQojdLl85ajFkh\r\n" \ - "YvcDqjmnTcspUnuq9Y00C7porXJthZwz1S18qVjcFNk0zEhVMUbupSrdXVmKtOJW\r\n" \ - "MWZjgcA+OXzcnb2hSKWbhjykH/u6/PqkuHPkD723rwXbmHdxRVS9CW57kDkn5ezJ\r\n" \ - "5pE6Sam4qFsCNFJNBV9FRf3ZBMFi\r\n" \ + "A4IBAQA4qFSCth2q22uJIdE4KGHJsJjVEfw2/xn+MkTvCMfxVrvmRvqCtjE4tKDl\r\n" \ + "oK4MxFOek07oDZwvtAT9ijn1hHftTNS7RH9zd/fxNpfcHnMZXVC4w4DNA1fSANtW\r\n" \ + "5sY1JB5Je9jScrsLSS+mAjyv0Ow3Hb2Bix8wu7xNNrV5fIf7Ubm+wt6SqEBxu3Kb\r\n" \ + "+EfObAT4huf3czznhH3C17ed6NSbXwoXfby7stWUDeRJv08RaFOykf/Aae7bY5PL\r\n" \ + "yTVrkAnikMntJ9YI+hNNYt3inqq11A5cN0+rVTst8UKCxzQ4GpvroSwPKTFkbMw4\r\n" \ + "/anT1dVxr/BtwJfiESoK3/4CeXR1\r\n" \ "-----END CERTIFICATE-----\r\n" /* END FILE */ @@ -183,76 +175,76 @@ * using `xxd -i`. */ /* BEGIN FILE binary macro TEST_CA_CRT_RSA_SHA256_DER tests/data_files/test-ca-sha256.crt.der */ #define TEST_CA_CRT_RSA_SHA256_DER { \ - 0x30, 0x82, 0x03, 0x41, 0x30, 0x82, 0x02, 0x29, 0xa0, 0x03, 0x02, 0x01, \ - 0x02, 0x02, 0x01, 0x03, 0x30, 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, \ - 0xf7, 0x0d, 0x01, 0x01, 0x0b, 0x05, 0x00, 0x30, 0x3b, 0x31, 0x0b, 0x30, \ - 0x09, 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, \ - 0x30, 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, \ - 0x61, 0x72, 0x53, 0x53, 0x4c, 0x31, 0x19, 0x30, 0x17, 0x06, 0x03, 0x55, \ - 0x04, 0x03, 0x0c, 0x10, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, \ - 0x20, 0x54, 0x65, 0x73, 0x74, 0x20, 0x43, 0x41, 0x30, 0x1e, 0x17, 0x0d, \ - 0x31, 0x31, 0x30, 0x32, 0x31, 0x32, 0x31, 0x34, 0x34, 0x34, 0x30, 0x30, \ - 0x5a, 0x17, 0x0d, 0x32, 0x31, 0x30, 0x32, 0x31, 0x32, 0x31, 0x34, 0x34, \ - 0x34, 0x30, 0x30, 0x5a, 0x30, 0x3b, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, \ - 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, \ - 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, \ - 0x53, 0x4c, 0x31, 0x19, 0x30, 0x17, 0x06, 0x03, 0x55, 0x04, 0x03, 0x0c, \ - 0x10, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, 0x20, 0x54, 0x65, \ - 0x73, 0x74, 0x20, 0x43, 0x41, 0x30, 0x82, 0x01, 0x22, 0x30, 0x0d, 0x06, \ - 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x01, 0x05, 0x00, \ - 0x03, 0x82, 0x01, 0x0f, 0x00, 0x30, 0x82, 0x01, 0x0a, 0x02, 0x82, 0x01, \ - 0x01, 0x00, 0xc0, 0xdf, 0x37, 0xfc, 0x17, 0xbb, 0xe0, 0x96, 0x9d, 0x3f, \ - 0x86, 0xde, 0x96, 0x32, 0x7d, 0x44, 0xa5, 0x16, 0xa0, 0xcd, 0x21, 0xf1, \ - 0x99, 0xd4, 0xec, 0xea, 0xcb, 0x7c, 0x18, 0x58, 0x08, 0x94, 0xa5, 0xec, \ - 0x9b, 0xc5, 0x8b, 0xdf, 0x1a, 0x1e, 0x99, 0x38, 0x99, 0x87, 0x1e, 0x7b, \ - 0xc0, 0x8d, 0x39, 0xdf, 0x38, 0x5d, 0x70, 0x78, 0x07, 0xd3, 0x9e, 0xd9, \ - 0x93, 0xe8, 0xb9, 0x72, 0x51, 0xc5, 0xce, 0xa3, 0x30, 0x52, 0xa9, 0xf2, \ - 0xe7, 0x40, 0x70, 0x14, 0xcb, 0x44, 0xa2, 0x72, 0x0b, 0xc2, 0xe5, 0x40, \ - 0xf9, 0x3e, 0xe5, 0xa6, 0x0e, 0xb3, 0xf9, 0xec, 0x4a, 0x63, 0xc0, 0xb8, \ - 0x29, 0x00, 0x74, 0x9c, 0x57, 0x3b, 0xa8, 0xa5, 0x04, 0x90, 0x71, 0xf1, \ - 0xbd, 0x83, 0xd9, 0x3f, 0xd6, 0xa5, 0xe2, 0x3c, 0x2a, 0x8f, 0xef, 0x27, \ - 0x60, 0xc3, 0xc6, 0x9f, 0xcb, 0xba, 0xec, 0x60, 0x7d, 0xb7, 0xe6, 0x84, \ - 0x32, 0xbe, 0x4f, 0xfb, 0x58, 0x26, 0x22, 0x03, 0x5b, 0xd4, 0xb4, 0xd5, \ - 0xfb, 0xf5, 0xe3, 0x96, 0x2e, 0x70, 0xc0, 0xe4, 0x2e, 0xbd, 0xfc, 0x2e, \ - 0xee, 0xe2, 0x41, 0x55, 0xc0, 0x34, 0x2e, 0x7d, 0x24, 0x72, 0x69, 0xcb, \ - 0x47, 0xb1, 0x14, 0x40, 0x83, 0x7d, 0x67, 0xf4, 0x86, 0xf6, 0x31, 0xab, \ - 0xf1, 0x79, 0xa4, 0xb2, 0xb5, 0x2e, 0x12, 0xf9, 0x84, 0x17, 0xf0, 0x62, \ - 0x6f, 0x27, 0x3e, 0x13, 0x58, 0xb1, 0x54, 0x0d, 0x21, 0x9a, 0x73, 0x37, \ - 0xa1, 0x30, 0xcf, 0x6f, 0x92, 0xdc, 0xf6, 0xe9, 0xfc, 0xac, 0xdb, 0x2e, \ - 0x28, 0xd1, 0x7e, 0x02, 0x4b, 0x23, 0xa0, 0x15, 0xf2, 0x38, 0x65, 0x64, \ - 0x09, 0xea, 0x0c, 0x6e, 0x8e, 0x1b, 0x17, 0xa0, 0x71, 0xc8, 0xb3, 0x9b, \ - 0xc9, 0xab, 0xe9, 0xc3, 0xf2, 0xcf, 0x87, 0x96, 0x8f, 0x80, 0x02, 0x32, \ - 0x9e, 0x99, 0x58, 0x6f, 0xa2, 0xd5, 0x02, 0x03, 0x01, 0x00, 0x01, 0xa3, \ - 0x50, 0x30, 0x4e, 0x30, 0x0c, 0x06, 0x03, 0x55, 0x1d, 0x13, 0x04, 0x05, \ - 0x30, 0x03, 0x01, 0x01, 0xff, 0x30, 0x1d, 0x06, 0x03, 0x55, 0x1d, 0x0e, \ - 0x04, 0x16, 0x04, 0x14, 0xb4, 0x5a, 0xe4, 0xa5, 0xb3, 0xde, 0xd2, 0x52, \ - 0xf6, 0xb9, 0xd5, 0xa6, 0x95, 0x0f, 0xeb, 0x3e, 0xbc, 0xc7, 0xfd, 0xff, \ - 0x30, 0x1f, 0x06, 0x03, 0x55, 0x1d, 0x23, 0x04, 0x18, 0x30, 0x16, 0x80, \ - 0x14, 0xb4, 0x5a, 0xe4, 0xa5, 0xb3, 0xde, 0xd2, 0x52, 0xf6, 0xb9, 0xd5, \ - 0xa6, 0x95, 0x0f, 0xeb, 0x3e, 0xbc, 0xc7, 0xfd, 0xff, 0x30, 0x0d, 0x06, \ - 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x0b, 0x05, 0x00, \ - 0x03, 0x82, 0x01, 0x01, 0x00, 0x76, 0x5b, 0x67, 0x48, 0xcb, 0x8a, 0xb8, \ - 0x2b, 0x2b, 0x1b, 0xad, 0x32, 0xf8, 0x1c, 0x86, 0x8e, 0xaa, 0xee, 0xb6, \ - 0x45, 0xc7, 0x86, 0xb9, 0x0f, 0xca, 0x87, 0x28, 0x88, 0xe8, 0xed, 0x27, \ - 0x74, 0x2b, 0x50, 0xfc, 0xf8, 0x02, 0x06, 0xa1, 0xc7, 0x4c, 0xff, 0x2e, \ - 0xf0, 0x5d, 0x57, 0xda, 0x2b, 0xdb, 0xeb, 0x4c, 0xa7, 0x1d, 0x5e, 0x9b, \ - 0x68, 0xc2, 0xcf, 0x23, 0x82, 0x9f, 0xd6, 0x4e, 0x0c, 0x38, 0xdb, 0x4b, \ - 0x49, 0xff, 0xfe, 0xfb, 0x5a, 0xa1, 0xd0, 0x67, 0x72, 0x71, 0xdd, 0xc0, \ - 0xf0, 0x80, 0xc0, 0x7f, 0x2d, 0x18, 0x9c, 0x2d, 0x45, 0xdc, 0xe3, 0xca, \ - 0xbd, 0x63, 0xbf, 0x71, 0x91, 0xa2, 0xb4, 0xaa, 0x13, 0x6c, 0x4a, 0x4f, \ - 0xe3, 0x2e, 0x11, 0x9e, 0x0d, 0xec, 0x14, 0x20, 0xe7, 0x75, 0x10, 0x96, \ - 0x21, 0x12, 0x03, 0x38, 0x41, 0x16, 0xfc, 0x7d, 0xe5, 0x09, 0xc7, 0xcb, \ - 0xee, 0xc3, 0x90, 0xa2, 0x37, 0x4b, 0x97, 0xce, 0x5a, 0x8c, 0x59, 0x21, \ - 0x62, 0xf7, 0x03, 0xaa, 0x39, 0xa7, 0x4d, 0xcb, 0x29, 0x52, 0x7b, 0xaa, \ - 0xf5, 0x8d, 0x34, 0x0b, 0xba, 0x68, 0xad, 0x72, 0x6d, 0x85, 0x9c, 0x33, \ - 0xd5, 0x2d, 0x7c, 0xa9, 0x58, 0xdc, 0x14, 0xd9, 0x34, 0xcc, 0x48, 0x55, \ - 0x31, 0x46, 0xee, 0xa5, 0x2a, 0xdd, 0x5d, 0x59, 0x8a, 0xb4, 0xe2, 0x56, \ - 0x31, 0x66, 0x63, 0x81, 0xc0, 0x3e, 0x39, 0x7c, 0xdc, 0x9d, 0xbd, 0xa1, \ - 0x48, 0xa5, 0x9b, 0x86, 0x3c, 0xa4, 0x1f, 0xfb, 0xba, 0xfc, 0xfa, 0xa4, \ - 0xb8, 0x73, 0xe4, 0x0f, 0xbd, 0xb7, 0xaf, 0x05, 0xdb, 0x98, 0x77, 0x71, \ - 0x45, 0x54, 0xbd, 0x09, 0x6e, 0x7b, 0x90, 0x39, 0x27, 0xe5, 0xec, 0xc9, \ - 0xe6, 0x91, 0x3a, 0x49, 0xa9, 0xb8, 0xa8, 0x5b, 0x02, 0x34, 0x52, 0x4d, \ - 0x05, 0x5f, 0x45, 0x45, 0xfd, 0xd9, 0x04, 0xc1, 0x62 \ + 0x30, 0x82, 0x03, 0x41, 0x30, 0x82, 0x02, 0x29, 0xa0, 0x03, 0x02, 0x01, \ + 0x02, 0x02, 0x01, 0x03, 0x30, 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, \ + 0xf7, 0x0d, 0x01, 0x01, 0x0b, 0x05, 0x00, 0x30, 0x3b, 0x31, 0x0b, 0x30, \ + 0x09, 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, \ + 0x30, 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, \ + 0x61, 0x72, 0x53, 0x53, 0x4c, 0x31, 0x19, 0x30, 0x17, 0x06, 0x03, 0x55, \ + 0x04, 0x03, 0x0c, 0x10, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, \ + 0x20, 0x54, 0x65, 0x73, 0x74, 0x20, 0x43, 0x41, 0x30, 0x1e, 0x17, 0x0d, \ + 0x31, 0x39, 0x30, 0x32, 0x31, 0x30, 0x31, 0x34, 0x34, 0x34, 0x30, 0x30, \ + 0x5a, 0x17, 0x0d, 0x32, 0x39, 0x30, 0x32, 0x31, 0x30, 0x31, 0x34, 0x34, \ + 0x34, 0x30, 0x30, 0x5a, 0x30, 0x3b, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, \ + 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, \ + 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, \ + 0x53, 0x4c, 0x31, 0x19, 0x30, 0x17, 0x06, 0x03, 0x55, 0x04, 0x03, 0x0c, \ + 0x10, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, 0x20, 0x54, 0x65, \ + 0x73, 0x74, 0x20, 0x43, 0x41, 0x30, 0x82, 0x01, 0x22, 0x30, 0x0d, 0x06, \ + 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x01, 0x05, 0x00, \ + 0x03, 0x82, 0x01, 0x0f, 0x00, 0x30, 0x82, 0x01, 0x0a, 0x02, 0x82, 0x01, \ + 0x01, 0x00, 0xc0, 0xdf, 0x37, 0xfc, 0x17, 0xbb, 0xe0, 0x96, 0x9d, 0x3f, \ + 0x86, 0xde, 0x96, 0x32, 0x7d, 0x44, 0xa5, 0x16, 0xa0, 0xcd, 0x21, 0xf1, \ + 0x99, 0xd4, 0xec, 0xea, 0xcb, 0x7c, 0x18, 0x58, 0x08, 0x94, 0xa5, 0xec, \ + 0x9b, 0xc5, 0x8b, 0xdf, 0x1a, 0x1e, 0x99, 0x38, 0x99, 0x87, 0x1e, 0x7b, \ + 0xc0, 0x8d, 0x39, 0xdf, 0x38, 0x5d, 0x70, 0x78, 0x07, 0xd3, 0x9e, 0xd9, \ + 0x93, 0xe8, 0xb9, 0x72, 0x51, 0xc5, 0xce, 0xa3, 0x30, 0x52, 0xa9, 0xf2, \ + 0xe7, 0x40, 0x70, 0x14, 0xcb, 0x44, 0xa2, 0x72, 0x0b, 0xc2, 0xe5, 0x40, \ + 0xf9, 0x3e, 0xe5, 0xa6, 0x0e, 0xb3, 0xf9, 0xec, 0x4a, 0x63, 0xc0, 0xb8, \ + 0x29, 0x00, 0x74, 0x9c, 0x57, 0x3b, 0xa8, 0xa5, 0x04, 0x90, 0x71, 0xf1, \ + 0xbd, 0x83, 0xd9, 0x3f, 0xd6, 0xa5, 0xe2, 0x3c, 0x2a, 0x8f, 0xef, 0x27, \ + 0x60, 0xc3, 0xc6, 0x9f, 0xcb, 0xba, 0xec, 0x60, 0x7d, 0xb7, 0xe6, 0x84, \ + 0x32, 0xbe, 0x4f, 0xfb, 0x58, 0x26, 0x22, 0x03, 0x5b, 0xd4, 0xb4, 0xd5, \ + 0xfb, 0xf5, 0xe3, 0x96, 0x2e, 0x70, 0xc0, 0xe4, 0x2e, 0xbd, 0xfc, 0x2e, \ + 0xee, 0xe2, 0x41, 0x55, 0xc0, 0x34, 0x2e, 0x7d, 0x24, 0x72, 0x69, 0xcb, \ + 0x47, 0xb1, 0x14, 0x40, 0x83, 0x7d, 0x67, 0xf4, 0x86, 0xf6, 0x31, 0xab, \ + 0xf1, 0x79, 0xa4, 0xb2, 0xb5, 0x2e, 0x12, 0xf9, 0x84, 0x17, 0xf0, 0x62, \ + 0x6f, 0x27, 0x3e, 0x13, 0x58, 0xb1, 0x54, 0x0d, 0x21, 0x9a, 0x73, 0x37, \ + 0xa1, 0x30, 0xcf, 0x6f, 0x92, 0xdc, 0xf6, 0xe9, 0xfc, 0xac, 0xdb, 0x2e, \ + 0x28, 0xd1, 0x7e, 0x02, 0x4b, 0x23, 0xa0, 0x15, 0xf2, 0x38, 0x65, 0x64, \ + 0x09, 0xea, 0x0c, 0x6e, 0x8e, 0x1b, 0x17, 0xa0, 0x71, 0xc8, 0xb3, 0x9b, \ + 0xc9, 0xab, 0xe9, 0xc3, 0xf2, 0xcf, 0x87, 0x96, 0x8f, 0x80, 0x02, 0x32, \ + 0x9e, 0x99, 0x58, 0x6f, 0xa2, 0xd5, 0x02, 0x03, 0x01, 0x00, 0x01, 0xa3, \ + 0x50, 0x30, 0x4e, 0x30, 0x0c, 0x06, 0x03, 0x55, 0x1d, 0x13, 0x04, 0x05, \ + 0x30, 0x03, 0x01, 0x01, 0xff, 0x30, 0x1d, 0x06, 0x03, 0x55, 0x1d, 0x0e, \ + 0x04, 0x16, 0x04, 0x14, 0xb4, 0x5a, 0xe4, 0xa5, 0xb3, 0xde, 0xd2, 0x52, \ + 0xf6, 0xb9, 0xd5, 0xa6, 0x95, 0x0f, 0xeb, 0x3e, 0xbc, 0xc7, 0xfd, 0xff, \ + 0x30, 0x1f, 0x06, 0x03, 0x55, 0x1d, 0x23, 0x04, 0x18, 0x30, 0x16, 0x80, \ + 0x14, 0xb4, 0x5a, 0xe4, 0xa5, 0xb3, 0xde, 0xd2, 0x52, 0xf6, 0xb9, 0xd5, \ + 0xa6, 0x95, 0x0f, 0xeb, 0x3e, 0xbc, 0xc7, 0xfd, 0xff, 0x30, 0x0d, 0x06, \ + 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x0b, 0x05, 0x00, \ + 0x03, 0x82, 0x01, 0x01, 0x00, 0x38, 0xa8, 0x54, 0x82, 0xb6, 0x1d, 0xaa, \ + 0xdb, 0x6b, 0x89, 0x21, 0xd1, 0x38, 0x28, 0x61, 0xc9, 0xb0, 0x98, 0xd5, \ + 0x11, 0xfc, 0x36, 0xff, 0x19, 0xfe, 0x32, 0x44, 0xef, 0x08, 0xc7, 0xf1, \ + 0x56, 0xbb, 0xe6, 0x46, 0xfa, 0x82, 0xb6, 0x31, 0x38, 0xb4, 0xa0, 0xe5, \ + 0xa0, 0xae, 0x0c, 0xc4, 0x53, 0x9e, 0x93, 0x4e, 0xe8, 0x0d, 0x9c, 0x2f, \ + 0xb4, 0x04, 0xfd, 0x8a, 0x39, 0xf5, 0x84, 0x77, 0xed, 0x4c, 0xd4, 0xbb, \ + 0x44, 0x7f, 0x73, 0x77, 0xf7, 0xf1, 0x36, 0x97, 0xdc, 0x1e, 0x73, 0x19, \ + 0x5d, 0x50, 0xb8, 0xc3, 0x80, 0xcd, 0x03, 0x57, 0xd2, 0x00, 0xdb, 0x56, \ + 0xe6, 0xc6, 0x35, 0x24, 0x1e, 0x49, 0x7b, 0xd8, 0xd2, 0x72, 0xbb, 0x0b, \ + 0x49, 0x2f, 0xa6, 0x02, 0x3c, 0xaf, 0xd0, 0xec, 0x37, 0x1d, 0xbd, 0x81, \ + 0x8b, 0x1f, 0x30, 0xbb, 0xbc, 0x4d, 0x36, 0xb5, 0x79, 0x7c, 0x87, 0xfb, \ + 0x51, 0xb9, 0xbe, 0xc2, 0xde, 0x92, 0xa8, 0x40, 0x71, 0xbb, 0x72, 0x9b, \ + 0xf8, 0x47, 0xce, 0x6c, 0x04, 0xf8, 0x86, 0xe7, 0xf7, 0x73, 0x3c, 0xe7, \ + 0x84, 0x7d, 0xc2, 0xd7, 0xb7, 0x9d, 0xe8, 0xd4, 0x9b, 0x5f, 0x0a, 0x17, \ + 0x7d, 0xbc, 0xbb, 0xb2, 0xd5, 0x94, 0x0d, 0xe4, 0x49, 0xbf, 0x4f, 0x11, \ + 0x68, 0x53, 0xb2, 0x91, 0xff, 0xc0, 0x69, 0xee, 0xdb, 0x63, 0x93, 0xcb, \ + 0xc9, 0x35, 0x6b, 0x90, 0x09, 0xe2, 0x90, 0xc9, 0xed, 0x27, 0xd6, 0x08, \ + 0xfa, 0x13, 0x4d, 0x62, 0xdd, 0xe2, 0x9e, 0xaa, 0xb5, 0xd4, 0x0e, 0x5c, \ + 0x37, 0x4f, 0xab, 0x55, 0x3b, 0x2d, 0xf1, 0x42, 0x82, 0xc7, 0x34, 0x38, \ + 0x1a, 0x9b, 0xeb, 0xa1, 0x2c, 0x0f, 0x29, 0x31, 0x64, 0x6c, 0xcc, 0x38, \ + 0xfd, 0xa9, 0xd3, 0xd5, 0xd5, 0x71, 0xaf, 0xf0, 0x6d, 0xc0, 0x97, 0xe2, \ + 0x11, 0x2a, 0x0a, 0xdf, 0xfe, 0x02, 0x79, 0x74, 0x75 \ } /* END FILE */ @@ -262,7 +254,7 @@ "-----BEGIN CERTIFICATE-----\r\n" \ "MIIDQTCCAimgAwIBAgIBAzANBgkqhkiG9w0BAQUFADA7MQswCQYDVQQGEwJOTDER\r\n" \ "MA8GA1UECgwIUG9sYXJTU0wxGTAXBgNVBAMMEFBvbGFyU1NMIFRlc3QgQ0EwHhcN\r\n" \ - "MTEwMjEyMTQ0NDAwWhcNMjEwMjEyMTQ0NDAwWjA7MQswCQYDVQQGEwJOTDERMA8G\r\n" \ + "MTkwMjEwMTQ0NDAwWhcNMjkwMjEwMTQ0NDAwWjA7MQswCQYDVQQGEwJOTDERMA8G\r\n" \ "A1UECgwIUG9sYXJTU0wxGTAXBgNVBAMMEFBvbGFyU1NMIFRlc3QgQ0EwggEiMA0G\r\n" \ "CSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDA3zf8F7vglp0/ht6WMn1EpRagzSHx\r\n" \ "mdTs6st8GFgIlKXsm8WL3xoemTiZhx57wI053zhdcHgH057Zk+i5clHFzqMwUqny\r\n" \ @@ -272,88 +264,88 @@ "KNF+AksjoBXyOGVkCeoMbo4bF6BxyLObyavpw/LPh5aPgAIynplYb6LVAgMBAAGj\r\n" \ "UDBOMAwGA1UdEwQFMAMBAf8wHQYDVR0OBBYEFLRa5KWz3tJS9rnVppUP6z68x/3/\r\n" \ "MB8GA1UdIwQYMBaAFLRa5KWz3tJS9rnVppUP6z68x/3/MA0GCSqGSIb3DQEBBQUA\r\n" \ - "A4IBAQABE3OEPfEd/bcJW5ZdU3/VgPNS4tMzh8gnJP/V2FcvFtGylMpQq6YnEBYI\r\n" \ - "yBHAL4DRvlMY5rnXGBp3ODR8MpqHC6AquRTCLzjS57iYff//4QFQqW9n92zctspv\r\n" \ - "czkaPKgjqo1No3Uq0Xaz10rcxyTUPrf5wNVRZ2V0KvllvAAVSzbI4mpdUXztjhST\r\n" \ - "S5A2BeWQAAOr0zq1F7TSRVJpJs7jmB2ai/igkh1IAjcuwV6VwlP+sbw0gjQ0NpGM\r\n" \ - "iHpnlzRAi/tIbtOvMIGOBU2TIfax/5jq1agUx5aPmT5TWAiJPOOP6l5xXnDwxeYS\r\n" \ - "NWqiX9GyusBZjezaCaHabjDLU0qQ\r\n" \ + "A4IBAQB0ZiNRFdia6kskaPnhrqejIRq8YMEGAf2oIPnyZ78xoyERgc35lHGyMtsL\r\n" \ + "hWicNjP4d/hS9As4j5KA2gdNGi5ETA1X7SowWOGsryivSpMSHVy1+HdfWlsYQOzm\r\n" \ + "8o+faQNUm8XzPVmttfAVspxeHSxJZ36Oo+QWZ5wZlCIEyjEdLUId+Tm4Bz3B5jRD\r\n" \ + "zZa/SaqDokq66N2zpbgKKAl3GU2O++fBqP2dSkdQykmTxhLLWRN8FJqhYATyQntZ\r\n" \ + "0QSi3W9HfSZPnFTcPIXeoiPd2pLlxt1hZu8dws2LTXE63uP6MM4LHvWxiuJaWkP/\r\n" \ + "mtxyUALj2pQxRitopORFQdn7AOY5\r\n" \ "-----END CERTIFICATE-----\r\n" /* END FILE */ /* This is taken from tests/data_files/test-ca-sha1.crt.der. */ /* BEGIN FILE binary macro TEST_CA_CRT_RSA_SHA1_DER tests/data_files/test-ca-sha1.crt.der */ #define TEST_CA_CRT_RSA_SHA1_DER { \ - 0x30, 0x82, 0x03, 0x41, 0x30, 0x82, 0x02, 0x29, 0xa0, 0x03, 0x02, 0x01, \ - 0x02, 0x02, 0x01, 0x03, 0x30, 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, \ - 0xf7, 0x0d, 0x01, 0x01, 0x05, 0x05, 0x00, 0x30, 0x3b, 0x31, 0x0b, 0x30, \ - 0x09, 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, \ - 0x30, 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, \ - 0x61, 0x72, 0x53, 0x53, 0x4c, 0x31, 0x19, 0x30, 0x17, 0x06, 0x03, 0x55, \ - 0x04, 0x03, 0x0c, 0x10, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, \ - 0x20, 0x54, 0x65, 0x73, 0x74, 0x20, 0x43, 0x41, 0x30, 0x1e, 0x17, 0x0d, \ - 0x31, 0x31, 0x30, 0x32, 0x31, 0x32, 0x31, 0x34, 0x34, 0x34, 0x30, 0x30, \ - 0x5a, 0x17, 0x0d, 0x32, 0x31, 0x30, 0x32, 0x31, 0x32, 0x31, 0x34, 0x34, \ - 0x34, 0x30, 0x30, 0x5a, 0x30, 0x3b, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, \ - 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, \ - 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, \ - 0x53, 0x4c, 0x31, 0x19, 0x30, 0x17, 0x06, 0x03, 0x55, 0x04, 0x03, 0x0c, \ - 0x10, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, 0x20, 0x54, 0x65, \ - 0x73, 0x74, 0x20, 0x43, 0x41, 0x30, 0x82, 0x01, 0x22, 0x30, 0x0d, 0x06, \ - 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x01, 0x05, 0x00, \ - 0x03, 0x82, 0x01, 0x0f, 0x00, 0x30, 0x82, 0x01, 0x0a, 0x02, 0x82, 0x01, \ - 0x01, 0x00, 0xc0, 0xdf, 0x37, 0xfc, 0x17, 0xbb, 0xe0, 0x96, 0x9d, 0x3f, \ - 0x86, 0xde, 0x96, 0x32, 0x7d, 0x44, 0xa5, 0x16, 0xa0, 0xcd, 0x21, 0xf1, \ - 0x99, 0xd4, 0xec, 0xea, 0xcb, 0x7c, 0x18, 0x58, 0x08, 0x94, 0xa5, 0xec, \ - 0x9b, 0xc5, 0x8b, 0xdf, 0x1a, 0x1e, 0x99, 0x38, 0x99, 0x87, 0x1e, 0x7b, \ - 0xc0, 0x8d, 0x39, 0xdf, 0x38, 0x5d, 0x70, 0x78, 0x07, 0xd3, 0x9e, 0xd9, \ - 0x93, 0xe8, 0xb9, 0x72, 0x51, 0xc5, 0xce, 0xa3, 0x30, 0x52, 0xa9, 0xf2, \ - 0xe7, 0x40, 0x70, 0x14, 0xcb, 0x44, 0xa2, 0x72, 0x0b, 0xc2, 0xe5, 0x40, \ - 0xf9, 0x3e, 0xe5, 0xa6, 0x0e, 0xb3, 0xf9, 0xec, 0x4a, 0x63, 0xc0, 0xb8, \ - 0x29, 0x00, 0x74, 0x9c, 0x57, 0x3b, 0xa8, 0xa5, 0x04, 0x90, 0x71, 0xf1, \ - 0xbd, 0x83, 0xd9, 0x3f, 0xd6, 0xa5, 0xe2, 0x3c, 0x2a, 0x8f, 0xef, 0x27, \ - 0x60, 0xc3, 0xc6, 0x9f, 0xcb, 0xba, 0xec, 0x60, 0x7d, 0xb7, 0xe6, 0x84, \ - 0x32, 0xbe, 0x4f, 0xfb, 0x58, 0x26, 0x22, 0x03, 0x5b, 0xd4, 0xb4, 0xd5, \ - 0xfb, 0xf5, 0xe3, 0x96, 0x2e, 0x70, 0xc0, 0xe4, 0x2e, 0xbd, 0xfc, 0x2e, \ - 0xee, 0xe2, 0x41, 0x55, 0xc0, 0x34, 0x2e, 0x7d, 0x24, 0x72, 0x69, 0xcb, \ - 0x47, 0xb1, 0x14, 0x40, 0x83, 0x7d, 0x67, 0xf4, 0x86, 0xf6, 0x31, 0xab, \ - 0xf1, 0x79, 0xa4, 0xb2, 0xb5, 0x2e, 0x12, 0xf9, 0x84, 0x17, 0xf0, 0x62, \ - 0x6f, 0x27, 0x3e, 0x13, 0x58, 0xb1, 0x54, 0x0d, 0x21, 0x9a, 0x73, 0x37, \ - 0xa1, 0x30, 0xcf, 0x6f, 0x92, 0xdc, 0xf6, 0xe9, 0xfc, 0xac, 0xdb, 0x2e, \ - 0x28, 0xd1, 0x7e, 0x02, 0x4b, 0x23, 0xa0, 0x15, 0xf2, 0x38, 0x65, 0x64, \ - 0x09, 0xea, 0x0c, 0x6e, 0x8e, 0x1b, 0x17, 0xa0, 0x71, 0xc8, 0xb3, 0x9b, \ - 0xc9, 0xab, 0xe9, 0xc3, 0xf2, 0xcf, 0x87, 0x96, 0x8f, 0x80, 0x02, 0x32, \ - 0x9e, 0x99, 0x58, 0x6f, 0xa2, 0xd5, 0x02, 0x03, 0x01, 0x00, 0x01, 0xa3, \ - 0x50, 0x30, 0x4e, 0x30, 0x0c, 0x06, 0x03, 0x55, 0x1d, 0x13, 0x04, 0x05, \ - 0x30, 0x03, 0x01, 0x01, 0xff, 0x30, 0x1d, 0x06, 0x03, 0x55, 0x1d, 0x0e, \ - 0x04, 0x16, 0x04, 0x14, 0xb4, 0x5a, 0xe4, 0xa5, 0xb3, 0xde, 0xd2, 0x52, \ - 0xf6, 0xb9, 0xd5, 0xa6, 0x95, 0x0f, 0xeb, 0x3e, 0xbc, 0xc7, 0xfd, 0xff, \ - 0x30, 0x1f, 0x06, 0x03, 0x55, 0x1d, 0x23, 0x04, 0x18, 0x30, 0x16, 0x80, \ - 0x14, 0xb4, 0x5a, 0xe4, 0xa5, 0xb3, 0xde, 0xd2, 0x52, 0xf6, 0xb9, 0xd5, \ - 0xa6, 0x95, 0x0f, 0xeb, 0x3e, 0xbc, 0xc7, 0xfd, 0xff, 0x30, 0x0d, 0x06, \ - 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x05, 0x05, 0x00, \ - 0x03, 0x82, 0x01, 0x01, 0x00, 0x01, 0x13, 0x73, 0x84, 0x3d, 0xf1, 0x1d, \ - 0xfd, 0xb7, 0x09, 0x5b, 0x96, 0x5d, 0x53, 0x7f, 0xd5, 0x80, 0xf3, 0x52, \ - 0xe2, 0xd3, 0x33, 0x87, 0xc8, 0x27, 0x24, 0xff, 0xd5, 0xd8, 0x57, 0x2f, \ - 0x16, 0xd1, 0xb2, 0x94, 0xca, 0x50, 0xab, 0xa6, 0x27, 0x10, 0x16, 0x08, \ - 0xc8, 0x11, 0xc0, 0x2f, 0x80, 0xd1, 0xbe, 0x53, 0x18, 0xe6, 0xb9, 0xd7, \ - 0x18, 0x1a, 0x77, 0x38, 0x34, 0x7c, 0x32, 0x9a, 0x87, 0x0b, 0xa0, 0x2a, \ - 0xb9, 0x14, 0xc2, 0x2f, 0x38, 0xd2, 0xe7, 0xb8, 0x98, 0x7d, 0xff, 0xff, \ - 0xe1, 0x01, 0x50, 0xa9, 0x6f, 0x67, 0xf7, 0x6c, 0xdc, 0xb6, 0xca, 0x6f, \ - 0x73, 0x39, 0x1a, 0x3c, 0xa8, 0x23, 0xaa, 0x8d, 0x4d, 0xa3, 0x75, 0x2a, \ - 0xd1, 0x76, 0xb3, 0xd7, 0x4a, 0xdc, 0xc7, 0x24, 0xd4, 0x3e, 0xb7, 0xf9, \ - 0xc0, 0xd5, 0x51, 0x67, 0x65, 0x74, 0x2a, 0xf9, 0x65, 0xbc, 0x00, 0x15, \ - 0x4b, 0x36, 0xc8, 0xe2, 0x6a, 0x5d, 0x51, 0x7c, 0xed, 0x8e, 0x14, 0x93, \ - 0x4b, 0x90, 0x36, 0x05, 0xe5, 0x90, 0x00, 0x03, 0xab, 0xd3, 0x3a, 0xb5, \ - 0x17, 0xb4, 0xd2, 0x45, 0x52, 0x69, 0x26, 0xce, 0xe3, 0x98, 0x1d, 0x9a, \ - 0x8b, 0xf8, 0xa0, 0x92, 0x1d, 0x48, 0x02, 0x37, 0x2e, 0xc1, 0x5e, 0x95, \ - 0xc2, 0x53, 0xfe, 0xb1, 0xbc, 0x34, 0x82, 0x34, 0x34, 0x36, 0x91, 0x8c, \ - 0x88, 0x7a, 0x67, 0x97, 0x34, 0x40, 0x8b, 0xfb, 0x48, 0x6e, 0xd3, 0xaf, \ - 0x30, 0x81, 0x8e, 0x05, 0x4d, 0x93, 0x21, 0xf6, 0xb1, 0xff, 0x98, 0xea, \ - 0xd5, 0xa8, 0x14, 0xc7, 0x96, 0x8f, 0x99, 0x3e, 0x53, 0x58, 0x08, 0x89, \ - 0x3c, 0xe3, 0x8f, 0xea, 0x5e, 0x71, 0x5e, 0x70, 0xf0, 0xc5, 0xe6, 0x12, \ - 0x35, 0x6a, 0xa2, 0x5f, 0xd1, 0xb2, 0xba, 0xc0, 0x59, 0x8d, 0xec, 0xda, \ - 0x09, 0xa1, 0xda, 0x6e, 0x30, 0xcb, 0x53, 0x4a, 0x90 \ + 0x30, 0x82, 0x03, 0x41, 0x30, 0x82, 0x02, 0x29, 0xa0, 0x03, 0x02, 0x01, \ + 0x02, 0x02, 0x01, 0x03, 0x30, 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, \ + 0xf7, 0x0d, 0x01, 0x01, 0x05, 0x05, 0x00, 0x30, 0x3b, 0x31, 0x0b, 0x30, \ + 0x09, 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, \ + 0x30, 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, \ + 0x61, 0x72, 0x53, 0x53, 0x4c, 0x31, 0x19, 0x30, 0x17, 0x06, 0x03, 0x55, \ + 0x04, 0x03, 0x0c, 0x10, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, \ + 0x20, 0x54, 0x65, 0x73, 0x74, 0x20, 0x43, 0x41, 0x30, 0x1e, 0x17, 0x0d, \ + 0x31, 0x39, 0x30, 0x32, 0x31, 0x30, 0x31, 0x34, 0x34, 0x34, 0x30, 0x30, \ + 0x5a, 0x17, 0x0d, 0x32, 0x39, 0x30, 0x32, 0x31, 0x30, 0x31, 0x34, 0x34, \ + 0x34, 0x30, 0x30, 0x5a, 0x30, 0x3b, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, \ + 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, \ + 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, \ + 0x53, 0x4c, 0x31, 0x19, 0x30, 0x17, 0x06, 0x03, 0x55, 0x04, 0x03, 0x0c, \ + 0x10, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, 0x20, 0x54, 0x65, \ + 0x73, 0x74, 0x20, 0x43, 0x41, 0x30, 0x82, 0x01, 0x22, 0x30, 0x0d, 0x06, \ + 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x01, 0x05, 0x00, \ + 0x03, 0x82, 0x01, 0x0f, 0x00, 0x30, 0x82, 0x01, 0x0a, 0x02, 0x82, 0x01, \ + 0x01, 0x00, 0xc0, 0xdf, 0x37, 0xfc, 0x17, 0xbb, 0xe0, 0x96, 0x9d, 0x3f, \ + 0x86, 0xde, 0x96, 0x32, 0x7d, 0x44, 0xa5, 0x16, 0xa0, 0xcd, 0x21, 0xf1, \ + 0x99, 0xd4, 0xec, 0xea, 0xcb, 0x7c, 0x18, 0x58, 0x08, 0x94, 0xa5, 0xec, \ + 0x9b, 0xc5, 0x8b, 0xdf, 0x1a, 0x1e, 0x99, 0x38, 0x99, 0x87, 0x1e, 0x7b, \ + 0xc0, 0x8d, 0x39, 0xdf, 0x38, 0x5d, 0x70, 0x78, 0x07, 0xd3, 0x9e, 0xd9, \ + 0x93, 0xe8, 0xb9, 0x72, 0x51, 0xc5, 0xce, 0xa3, 0x30, 0x52, 0xa9, 0xf2, \ + 0xe7, 0x40, 0x70, 0x14, 0xcb, 0x44, 0xa2, 0x72, 0x0b, 0xc2, 0xe5, 0x40, \ + 0xf9, 0x3e, 0xe5, 0xa6, 0x0e, 0xb3, 0xf9, 0xec, 0x4a, 0x63, 0xc0, 0xb8, \ + 0x29, 0x00, 0x74, 0x9c, 0x57, 0x3b, 0xa8, 0xa5, 0x04, 0x90, 0x71, 0xf1, \ + 0xbd, 0x83, 0xd9, 0x3f, 0xd6, 0xa5, 0xe2, 0x3c, 0x2a, 0x8f, 0xef, 0x27, \ + 0x60, 0xc3, 0xc6, 0x9f, 0xcb, 0xba, 0xec, 0x60, 0x7d, 0xb7, 0xe6, 0x84, \ + 0x32, 0xbe, 0x4f, 0xfb, 0x58, 0x26, 0x22, 0x03, 0x5b, 0xd4, 0xb4, 0xd5, \ + 0xfb, 0xf5, 0xe3, 0x96, 0x2e, 0x70, 0xc0, 0xe4, 0x2e, 0xbd, 0xfc, 0x2e, \ + 0xee, 0xe2, 0x41, 0x55, 0xc0, 0x34, 0x2e, 0x7d, 0x24, 0x72, 0x69, 0xcb, \ + 0x47, 0xb1, 0x14, 0x40, 0x83, 0x7d, 0x67, 0xf4, 0x86, 0xf6, 0x31, 0xab, \ + 0xf1, 0x79, 0xa4, 0xb2, 0xb5, 0x2e, 0x12, 0xf9, 0x84, 0x17, 0xf0, 0x62, \ + 0x6f, 0x27, 0x3e, 0x13, 0x58, 0xb1, 0x54, 0x0d, 0x21, 0x9a, 0x73, 0x37, \ + 0xa1, 0x30, 0xcf, 0x6f, 0x92, 0xdc, 0xf6, 0xe9, 0xfc, 0xac, 0xdb, 0x2e, \ + 0x28, 0xd1, 0x7e, 0x02, 0x4b, 0x23, 0xa0, 0x15, 0xf2, 0x38, 0x65, 0x64, \ + 0x09, 0xea, 0x0c, 0x6e, 0x8e, 0x1b, 0x17, 0xa0, 0x71, 0xc8, 0xb3, 0x9b, \ + 0xc9, 0xab, 0xe9, 0xc3, 0xf2, 0xcf, 0x87, 0x96, 0x8f, 0x80, 0x02, 0x32, \ + 0x9e, 0x99, 0x58, 0x6f, 0xa2, 0xd5, 0x02, 0x03, 0x01, 0x00, 0x01, 0xa3, \ + 0x50, 0x30, 0x4e, 0x30, 0x0c, 0x06, 0x03, 0x55, 0x1d, 0x13, 0x04, 0x05, \ + 0x30, 0x03, 0x01, 0x01, 0xff, 0x30, 0x1d, 0x06, 0x03, 0x55, 0x1d, 0x0e, \ + 0x04, 0x16, 0x04, 0x14, 0xb4, 0x5a, 0xe4, 0xa5, 0xb3, 0xde, 0xd2, 0x52, \ + 0xf6, 0xb9, 0xd5, 0xa6, 0x95, 0x0f, 0xeb, 0x3e, 0xbc, 0xc7, 0xfd, 0xff, \ + 0x30, 0x1f, 0x06, 0x03, 0x55, 0x1d, 0x23, 0x04, 0x18, 0x30, 0x16, 0x80, \ + 0x14, 0xb4, 0x5a, 0xe4, 0xa5, 0xb3, 0xde, 0xd2, 0x52, 0xf6, 0xb9, 0xd5, \ + 0xa6, 0x95, 0x0f, 0xeb, 0x3e, 0xbc, 0xc7, 0xfd, 0xff, 0x30, 0x0d, 0x06, \ + 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x05, 0x05, 0x00, \ + 0x03, 0x82, 0x01, 0x01, 0x00, 0x74, 0x66, 0x23, 0x51, 0x15, 0xd8, 0x9a, \ + 0xea, 0x4b, 0x24, 0x68, 0xf9, 0xe1, 0xae, 0xa7, 0xa3, 0x21, 0x1a, 0xbc, \ + 0x60, 0xc1, 0x06, 0x01, 0xfd, 0xa8, 0x20, 0xf9, 0xf2, 0x67, 0xbf, 0x31, \ + 0xa3, 0x21, 0x11, 0x81, 0xcd, 0xf9, 0x94, 0x71, 0xb2, 0x32, 0xdb, 0x0b, \ + 0x85, 0x68, 0x9c, 0x36, 0x33, 0xf8, 0x77, 0xf8, 0x52, 0xf4, 0x0b, 0x38, \ + 0x8f, 0x92, 0x80, 0xda, 0x07, 0x4d, 0x1a, 0x2e, 0x44, 0x4c, 0x0d, 0x57, \ + 0xed, 0x2a, 0x30, 0x58, 0xe1, 0xac, 0xaf, 0x28, 0xaf, 0x4a, 0x93, 0x12, \ + 0x1d, 0x5c, 0xb5, 0xf8, 0x77, 0x5f, 0x5a, 0x5b, 0x18, 0x40, 0xec, 0xe6, \ + 0xf2, 0x8f, 0x9f, 0x69, 0x03, 0x54, 0x9b, 0xc5, 0xf3, 0x3d, 0x59, 0xad, \ + 0xb5, 0xf0, 0x15, 0xb2, 0x9c, 0x5e, 0x1d, 0x2c, 0x49, 0x67, 0x7e, 0x8e, \ + 0xa3, 0xe4, 0x16, 0x67, 0x9c, 0x19, 0x94, 0x22, 0x04, 0xca, 0x31, 0x1d, \ + 0x2d, 0x42, 0x1d, 0xf9, 0x39, 0xb8, 0x07, 0x3d, 0xc1, 0xe6, 0x34, 0x43, \ + 0xcd, 0x96, 0xbf, 0x49, 0xaa, 0x83, 0xa2, 0x4a, 0xba, 0xe8, 0xdd, 0xb3, \ + 0xa5, 0xb8, 0x0a, 0x28, 0x09, 0x77, 0x19, 0x4d, 0x8e, 0xfb, 0xe7, 0xc1, \ + 0xa8, 0xfd, 0x9d, 0x4a, 0x47, 0x50, 0xca, 0x49, 0x93, 0xc6, 0x12, 0xcb, \ + 0x59, 0x13, 0x7c, 0x14, 0x9a, 0xa1, 0x60, 0x04, 0xf2, 0x42, 0x7b, 0x59, \ + 0xd1, 0x04, 0xa2, 0xdd, 0x6f, 0x47, 0x7d, 0x26, 0x4f, 0x9c, 0x54, 0xdc, \ + 0x3c, 0x85, 0xde, 0xa2, 0x23, 0xdd, 0xda, 0x92, 0xe5, 0xc6, 0xdd, 0x61, \ + 0x66, 0xef, 0x1d, 0xc2, 0xcd, 0x8b, 0x4d, 0x71, 0x3a, 0xde, 0xe3, 0xfa, \ + 0x30, 0xce, 0x0b, 0x1e, 0xf5, 0xb1, 0x8a, 0xe2, 0x5a, 0x5a, 0x43, 0xff, \ + 0x9a, 0xdc, 0x72, 0x50, 0x02, 0xe3, 0xda, 0x94, 0x31, 0x46, 0x2b, 0x68, \ + 0xa4, 0xe4, 0x45, 0x41, 0xd9, 0xfb, 0x00, 0xe6, 0x39 \ } /* END FILE */ @@ -617,7 +609,7 @@ "-----BEGIN CERTIFICATE-----\r\n" \ "MIIDNzCCAh+gAwIBAgIBAjANBgkqhkiG9w0BAQsFADA7MQswCQYDVQQGEwJOTDER\r\n" \ "MA8GA1UECgwIUG9sYXJTU0wxGTAXBgNVBAMMEFBvbGFyU1NMIFRlc3QgQ0EwHhcN\r\n" \ - "MTEwMjEyMTQ0NDA2WhcNMjEwMjEyMTQ0NDA2WjA0MQswCQYDVQQGEwJOTDERMA8G\r\n" \ + "MTkwMjEwMTQ0NDA2WhcNMjkwMjEwMTQ0NDA2WjA0MQswCQYDVQQGEwJOTDERMA8G\r\n" \ "A1UECgwIUG9sYXJTU0wxEjAQBgNVBAMMCWxvY2FsaG9zdDCCASIwDQYJKoZIhvcN\r\n" \ "AQEBBQADggEPADCCAQoCggEBAMFNo93nzR3RBNdJcriZrA545Do8Ss86ExbQWuTN\r\n" \ "owCIp+4ea5anUrSQ7y1yej4kmvy2NKwk9XfgJmSMnLAofaHa6ozmyRyWvP7BBFKz\r\n" \ @@ -626,88 +618,88 @@ "hYvai0Re4hjGYi/HZo36Xdh98yeJKQHFkA4/J/EwyEoO79bex8cna8cFPXrEAjya\r\n" \ "HT4P6DSYW8tzS1KW2BGiLICIaTla0w+w3lkvEcf36hIBMJcCAwEAAaNNMEswCQYD\r\n" \ "VR0TBAIwADAdBgNVHQ4EFgQUpQXoZLjc32APUBJNYKhkr02LQ5MwHwYDVR0jBBgw\r\n" \ - "FoAUtFrkpbPe0lL2udWmlQ/rPrzH/f8wDQYJKoZIhvcNAQELBQADggEBAGGEshT5\r\n" \ - "kvnRmLVScVeUEdwIrvW7ezbGbUvJ8VxeJ79/HSjlLiGbMc4uUathwtzEdi9R/4C5\r\n" \ - "DXBNeEPTkbB+fhG1W06iHYj/Dp8+aaG7fuDxKVKHVZSqBnmQLn73ymyclZNHii5A\r\n" \ - "3nTS8WUaHAzxN/rajOtoM7aH1P9tULpHrl+7HOeLMpxUnwI12ZqZaLIzxbcdJVcr\r\n" \ - "ra2F00aXCGkYVLvyvbZIq7LC+yVysej5gCeQYD7VFOEks0jhFjrS06gP0/XnWv6v\r\n" \ - "eBoPez9d+CCjkrhseiWzXOiriIMICX48EloO/DrsMRAtvlwq7EDz4QhILz6ffndm\r\n" \ - "e4K1cVANRPN2o9Y=\r\n" \ + "FoAUtFrkpbPe0lL2udWmlQ/rPrzH/f8wDQYJKoZIhvcNAQELBQADggEBAC465FJh\r\n" \ + "Pqel7zJngHIHJrqj/wVAxGAFOTF396XKATGAp+HRCqJ81Ry60CNK1jDzk8dv6M6U\r\n" \ + "HoS7RIFiM/9rXQCbJfiPD5xMTejZp5n5UYHAmxsxDaazfA5FuBhkfokKK6jD4Eq9\r\n" \ + "1C94xGKb6X4/VkaPF7cqoBBw/bHxawXc0UEPjqayiBpCYU/rJoVZgLqFVP7Px3sv\r\n" \ + "a1nOrNx8rPPI1hJ+ZOg8maiPTxHZnBVLakSSLQy/sWeWyazO1RnrbxjrbgQtYKz0\r\n" \ + "e3nwGpu1w13vfckFmUSBhHXH7AAS/HpKC4IH7G2GAk3+n8iSSN71sZzpxonQwVbo\r\n" \ + "pMZqLmbBm/7WPLc=\r\n" \ "-----END CERTIFICATE-----\r\n" /* END FILE */ /* This is taken from tests/data_files/server2-sha256.crt.der. */ /* BEGIN FILE binary macro TEST_SRV_CRT_RSA_SHA256_DER tests/data_files/server2-sha256.crt.der */ #define TEST_SRV_CRT_RSA_SHA256_DER { \ - 0x30, 0x82, 0x03, 0x37, 0x30, 0x82, 0x02, 0x1f, 0xa0, 0x03, 0x02, 0x01, \ - 0x02, 0x02, 0x01, 0x02, 0x30, 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, \ - 0xf7, 0x0d, 0x01, 0x01, 0x0b, 0x05, 0x00, 0x30, 0x3b, 0x31, 0x0b, 0x30, \ - 0x09, 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, \ - 0x30, 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, \ - 0x61, 0x72, 0x53, 0x53, 0x4c, 0x31, 0x19, 0x30, 0x17, 0x06, 0x03, 0x55, \ - 0x04, 0x03, 0x0c, 0x10, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, \ - 0x20, 0x54, 0x65, 0x73, 0x74, 0x20, 0x43, 0x41, 0x30, 0x1e, 0x17, 0x0d, \ - 0x31, 0x31, 0x30, 0x32, 0x31, 0x32, 0x31, 0x34, 0x34, 0x34, 0x30, 0x36, \ - 0x5a, 0x17, 0x0d, 0x32, 0x31, 0x30, 0x32, 0x31, 0x32, 0x31, 0x34, 0x34, \ - 0x34, 0x30, 0x36, 0x5a, 0x30, 0x34, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, \ - 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, \ - 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, \ - 0x53, 0x4c, 0x31, 0x12, 0x30, 0x10, 0x06, 0x03, 0x55, 0x04, 0x03, 0x0c, \ - 0x09, 0x6c, 0x6f, 0x63, 0x61, 0x6c, 0x68, 0x6f, 0x73, 0x74, 0x30, 0x82, \ - 0x01, 0x22, 0x30, 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, \ - 0x01, 0x01, 0x01, 0x05, 0x00, 0x03, 0x82, 0x01, 0x0f, 0x00, 0x30, 0x82, \ - 0x01, 0x0a, 0x02, 0x82, 0x01, 0x01, 0x00, 0xc1, 0x4d, 0xa3, 0xdd, 0xe7, \ - 0xcd, 0x1d, 0xd1, 0x04, 0xd7, 0x49, 0x72, 0xb8, 0x99, 0xac, 0x0e, 0x78, \ - 0xe4, 0x3a, 0x3c, 0x4a, 0xcf, 0x3a, 0x13, 0x16, 0xd0, 0x5a, 0xe4, 0xcd, \ - 0xa3, 0x00, 0x88, 0xa7, 0xee, 0x1e, 0x6b, 0x96, 0xa7, 0x52, 0xb4, 0x90, \ - 0xef, 0x2d, 0x72, 0x7a, 0x3e, 0x24, 0x9a, 0xfc, 0xb6, 0x34, 0xac, 0x24, \ - 0xf5, 0x77, 0xe0, 0x26, 0x64, 0x8c, 0x9c, 0xb0, 0x28, 0x7d, 0xa1, 0xda, \ - 0xea, 0x8c, 0xe6, 0xc9, 0x1c, 0x96, 0xbc, 0xfe, 0xc1, 0x04, 0x52, 0xb3, \ - 0x36, 0xd4, 0xa3, 0xfa, 0xe1, 0xb1, 0x76, 0xd8, 0x90, 0xc1, 0x61, 0xb4, \ - 0x66, 0x52, 0x36, 0xa2, 0x26, 0x53, 0xaa, 0xab, 0x74, 0x5e, 0x07, 0x7d, \ - 0x19, 0x82, 0xdb, 0x2a, 0xd8, 0x1f, 0xa0, 0xd9, 0x0d, 0x1c, 0x2d, 0x49, \ - 0x66, 0xf7, 0x5b, 0x25, 0x73, 0x46, 0xe8, 0x0b, 0x8a, 0x4f, 0x69, 0x0c, \ - 0xb5, 0x00, 0x90, 0xe1, 0xda, 0x82, 0x10, 0x66, 0x7d, 0xae, 0x54, 0x2b, \ - 0x8b, 0x65, 0x79, 0x91, 0xa1, 0xe2, 0x61, 0xc3, 0xcd, 0x40, 0x49, 0x08, \ - 0xee, 0x68, 0x0c, 0xf1, 0x8b, 0x86, 0xd2, 0x46, 0xbf, 0xd0, 0xb8, 0xaa, \ - 0x11, 0x03, 0x1e, 0x7f, 0x56, 0xa8, 0x1a, 0x1e, 0x44, 0x18, 0x0f, 0x0f, \ - 0x85, 0x8b, 0xda, 0x8b, 0x44, 0x5e, 0xe2, 0x18, 0xc6, 0x62, 0x2f, 0xc7, \ - 0x66, 0x8d, 0xfa, 0x5d, 0xd8, 0x7d, 0xf3, 0x27, 0x89, 0x29, 0x01, 0xc5, \ - 0x90, 0x0e, 0x3f, 0x27, 0xf1, 0x30, 0xc8, 0x4a, 0x0e, 0xef, 0xd6, 0xde, \ - 0xc7, 0xc7, 0x27, 0x6b, 0xc7, 0x05, 0x3d, 0x7a, 0xc4, 0x02, 0x3c, 0x9a, \ - 0x1d, 0x3e, 0x0f, 0xe8, 0x34, 0x98, 0x5b, 0xcb, 0x73, 0x4b, 0x52, 0x96, \ - 0xd8, 0x11, 0xa2, 0x2c, 0x80, 0x88, 0x69, 0x39, 0x5a, 0xd3, 0x0f, 0xb0, \ - 0xde, 0x59, 0x2f, 0x11, 0xc7, 0xf7, 0xea, 0x12, 0x01, 0x30, 0x97, 0x02, \ - 0x03, 0x01, 0x00, 0x01, 0xa3, 0x4d, 0x30, 0x4b, 0x30, 0x09, 0x06, 0x03, \ - 0x55, 0x1d, 0x13, 0x04, 0x02, 0x30, 0x00, 0x30, 0x1d, 0x06, 0x03, 0x55, \ - 0x1d, 0x0e, 0x04, 0x16, 0x04, 0x14, 0xa5, 0x05, 0xe8, 0x64, 0xb8, 0xdc, \ - 0xdf, 0x60, 0x0f, 0x50, 0x12, 0x4d, 0x60, 0xa8, 0x64, 0xaf, 0x4d, 0x8b, \ - 0x43, 0x93, 0x30, 0x1f, 0x06, 0x03, 0x55, 0x1d, 0x23, 0x04, 0x18, 0x30, \ - 0x16, 0x80, 0x14, 0xb4, 0x5a, 0xe4, 0xa5, 0xb3, 0xde, 0xd2, 0x52, 0xf6, \ - 0xb9, 0xd5, 0xa6, 0x95, 0x0f, 0xeb, 0x3e, 0xbc, 0xc7, 0xfd, 0xff, 0x30, \ - 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x0b, \ - 0x05, 0x00, 0x03, 0x82, 0x01, 0x01, 0x00, 0x61, 0x84, 0xb2, 0x14, 0xf9, \ - 0x92, 0xf9, 0xd1, 0x98, 0xb5, 0x52, 0x71, 0x57, 0x94, 0x11, 0xdc, 0x08, \ - 0xae, 0xf5, 0xbb, 0x7b, 0x36, 0xc6, 0x6d, 0x4b, 0xc9, 0xf1, 0x5c, 0x5e, \ - 0x27, 0xbf, 0x7f, 0x1d, 0x28, 0xe5, 0x2e, 0x21, 0x9b, 0x31, 0xce, 0x2e, \ - 0x51, 0xab, 0x61, 0xc2, 0xdc, 0xc4, 0x76, 0x2f, 0x51, 0xff, 0x80, 0xb9, \ - 0x0d, 0x70, 0x4d, 0x78, 0x43, 0xd3, 0x91, 0xb0, 0x7e, 0x7e, 0x11, 0xb5, \ - 0x5b, 0x4e, 0xa2, 0x1d, 0x88, 0xff, 0x0e, 0x9f, 0x3e, 0x69, 0xa1, 0xbb, \ - 0x7e, 0xe0, 0xf1, 0x29, 0x52, 0x87, 0x55, 0x94, 0xaa, 0x06, 0x79, 0x90, \ - 0x2e, 0x7e, 0xf7, 0xca, 0x6c, 0x9c, 0x95, 0x93, 0x47, 0x8a, 0x2e, 0x40, \ - 0xde, 0x74, 0xd2, 0xf1, 0x65, 0x1a, 0x1c, 0x0c, 0xf1, 0x37, 0xfa, 0xda, \ - 0x8c, 0xeb, 0x68, 0x33, 0xb6, 0x87, 0xd4, 0xff, 0x6d, 0x50, 0xba, 0x47, \ - 0xae, 0x5f, 0xbb, 0x1c, 0xe7, 0x8b, 0x32, 0x9c, 0x54, 0x9f, 0x02, 0x35, \ - 0xd9, 0x9a, 0x99, 0x68, 0xb2, 0x33, 0xc5, 0xb7, 0x1d, 0x25, 0x57, 0x2b, \ - 0xad, 0xad, 0x85, 0xd3, 0x46, 0x97, 0x08, 0x69, 0x18, 0x54, 0xbb, 0xf2, \ - 0xbd, 0xb6, 0x48, 0xab, 0xb2, 0xc2, 0xfb, 0x25, 0x72, 0xb1, 0xe8, 0xf9, \ - 0x80, 0x27, 0x90, 0x60, 0x3e, 0xd5, 0x14, 0xe1, 0x24, 0xb3, 0x48, 0xe1, \ - 0x16, 0x3a, 0xd2, 0xd3, 0xa8, 0x0f, 0xd3, 0xf5, 0xe7, 0x5a, 0xfe, 0xaf, \ - 0x78, 0x1a, 0x0f, 0x7b, 0x3f, 0x5d, 0xf8, 0x20, 0xa3, 0x92, 0xb8, 0x6c, \ - 0x7a, 0x25, 0xb3, 0x5c, 0xe8, 0xab, 0x88, 0x83, 0x08, 0x09, 0x7e, 0x3c, \ - 0x12, 0x5a, 0x0e, 0xfc, 0x3a, 0xec, 0x31, 0x10, 0x2d, 0xbe, 0x5c, 0x2a, \ - 0xec, 0x40, 0xf3, 0xe1, 0x08, 0x48, 0x2f, 0x3e, 0x9f, 0x7e, 0x77, 0x66, \ - 0x7b, 0x82, 0xb5, 0x71, 0x50, 0x0d, 0x44, 0xf3, 0x76, 0xa3, 0xd6 \ + 0x30, 0x82, 0x03, 0x37, 0x30, 0x82, 0x02, 0x1f, 0xa0, 0x03, 0x02, 0x01, \ + 0x02, 0x02, 0x01, 0x02, 0x30, 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, \ + 0xf7, 0x0d, 0x01, 0x01, 0x0b, 0x05, 0x00, 0x30, 0x3b, 0x31, 0x0b, 0x30, \ + 0x09, 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, \ + 0x30, 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, \ + 0x61, 0x72, 0x53, 0x53, 0x4c, 0x31, 0x19, 0x30, 0x17, 0x06, 0x03, 0x55, \ + 0x04, 0x03, 0x0c, 0x10, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, \ + 0x20, 0x54, 0x65, 0x73, 0x74, 0x20, 0x43, 0x41, 0x30, 0x1e, 0x17, 0x0d, \ + 0x31, 0x39, 0x30, 0x32, 0x31, 0x30, 0x31, 0x34, 0x34, 0x34, 0x30, 0x36, \ + 0x5a, 0x17, 0x0d, 0x32, 0x39, 0x30, 0x32, 0x31, 0x30, 0x31, 0x34, 0x34, \ + 0x34, 0x30, 0x36, 0x5a, 0x30, 0x34, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, \ + 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, \ + 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, \ + 0x53, 0x4c, 0x31, 0x12, 0x30, 0x10, 0x06, 0x03, 0x55, 0x04, 0x03, 0x0c, \ + 0x09, 0x6c, 0x6f, 0x63, 0x61, 0x6c, 0x68, 0x6f, 0x73, 0x74, 0x30, 0x82, \ + 0x01, 0x22, 0x30, 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, \ + 0x01, 0x01, 0x01, 0x05, 0x00, 0x03, 0x82, 0x01, 0x0f, 0x00, 0x30, 0x82, \ + 0x01, 0x0a, 0x02, 0x82, 0x01, 0x01, 0x00, 0xc1, 0x4d, 0xa3, 0xdd, 0xe7, \ + 0xcd, 0x1d, 0xd1, 0x04, 0xd7, 0x49, 0x72, 0xb8, 0x99, 0xac, 0x0e, 0x78, \ + 0xe4, 0x3a, 0x3c, 0x4a, 0xcf, 0x3a, 0x13, 0x16, 0xd0, 0x5a, 0xe4, 0xcd, \ + 0xa3, 0x00, 0x88, 0xa7, 0xee, 0x1e, 0x6b, 0x96, 0xa7, 0x52, 0xb4, 0x90, \ + 0xef, 0x2d, 0x72, 0x7a, 0x3e, 0x24, 0x9a, 0xfc, 0xb6, 0x34, 0xac, 0x24, \ + 0xf5, 0x77, 0xe0, 0x26, 0x64, 0x8c, 0x9c, 0xb0, 0x28, 0x7d, 0xa1, 0xda, \ + 0xea, 0x8c, 0xe6, 0xc9, 0x1c, 0x96, 0xbc, 0xfe, 0xc1, 0x04, 0x52, 0xb3, \ + 0x36, 0xd4, 0xa3, 0xfa, 0xe1, 0xb1, 0x76, 0xd8, 0x90, 0xc1, 0x61, 0xb4, \ + 0x66, 0x52, 0x36, 0xa2, 0x26, 0x53, 0xaa, 0xab, 0x74, 0x5e, 0x07, 0x7d, \ + 0x19, 0x82, 0xdb, 0x2a, 0xd8, 0x1f, 0xa0, 0xd9, 0x0d, 0x1c, 0x2d, 0x49, \ + 0x66, 0xf7, 0x5b, 0x25, 0x73, 0x46, 0xe8, 0x0b, 0x8a, 0x4f, 0x69, 0x0c, \ + 0xb5, 0x00, 0x90, 0xe1, 0xda, 0x82, 0x10, 0x66, 0x7d, 0xae, 0x54, 0x2b, \ + 0x8b, 0x65, 0x79, 0x91, 0xa1, 0xe2, 0x61, 0xc3, 0xcd, 0x40, 0x49, 0x08, \ + 0xee, 0x68, 0x0c, 0xf1, 0x8b, 0x86, 0xd2, 0x46, 0xbf, 0xd0, 0xb8, 0xaa, \ + 0x11, 0x03, 0x1e, 0x7f, 0x56, 0xa8, 0x1a, 0x1e, 0x44, 0x18, 0x0f, 0x0f, \ + 0x85, 0x8b, 0xda, 0x8b, 0x44, 0x5e, 0xe2, 0x18, 0xc6, 0x62, 0x2f, 0xc7, \ + 0x66, 0x8d, 0xfa, 0x5d, 0xd8, 0x7d, 0xf3, 0x27, 0x89, 0x29, 0x01, 0xc5, \ + 0x90, 0x0e, 0x3f, 0x27, 0xf1, 0x30, 0xc8, 0x4a, 0x0e, 0xef, 0xd6, 0xde, \ + 0xc7, 0xc7, 0x27, 0x6b, 0xc7, 0x05, 0x3d, 0x7a, 0xc4, 0x02, 0x3c, 0x9a, \ + 0x1d, 0x3e, 0x0f, 0xe8, 0x34, 0x98, 0x5b, 0xcb, 0x73, 0x4b, 0x52, 0x96, \ + 0xd8, 0x11, 0xa2, 0x2c, 0x80, 0x88, 0x69, 0x39, 0x5a, 0xd3, 0x0f, 0xb0, \ + 0xde, 0x59, 0x2f, 0x11, 0xc7, 0xf7, 0xea, 0x12, 0x01, 0x30, 0x97, 0x02, \ + 0x03, 0x01, 0x00, 0x01, 0xa3, 0x4d, 0x30, 0x4b, 0x30, 0x09, 0x06, 0x03, \ + 0x55, 0x1d, 0x13, 0x04, 0x02, 0x30, 0x00, 0x30, 0x1d, 0x06, 0x03, 0x55, \ + 0x1d, 0x0e, 0x04, 0x16, 0x04, 0x14, 0xa5, 0x05, 0xe8, 0x64, 0xb8, 0xdc, \ + 0xdf, 0x60, 0x0f, 0x50, 0x12, 0x4d, 0x60, 0xa8, 0x64, 0xaf, 0x4d, 0x8b, \ + 0x43, 0x93, 0x30, 0x1f, 0x06, 0x03, 0x55, 0x1d, 0x23, 0x04, 0x18, 0x30, \ + 0x16, 0x80, 0x14, 0xb4, 0x5a, 0xe4, 0xa5, 0xb3, 0xde, 0xd2, 0x52, 0xf6, \ + 0xb9, 0xd5, 0xa6, 0x95, 0x0f, 0xeb, 0x3e, 0xbc, 0xc7, 0xfd, 0xff, 0x30, \ + 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x0b, \ + 0x05, 0x00, 0x03, 0x82, 0x01, 0x01, 0x00, 0x2e, 0x3a, 0xe4, 0x52, 0x61, \ + 0x3e, 0xa7, 0xa5, 0xef, 0x32, 0x67, 0x80, 0x72, 0x07, 0x26, 0xba, 0xa3, \ + 0xff, 0x05, 0x40, 0xc4, 0x60, 0x05, 0x39, 0x31, 0x77, 0xf7, 0xa5, 0xca, \ + 0x01, 0x31, 0x80, 0xa7, 0xe1, 0xd1, 0x0a, 0xa2, 0x7c, 0xd5, 0x1c, 0xba, \ + 0xd0, 0x23, 0x4a, 0xd6, 0x30, 0xf3, 0x93, 0xc7, 0x6f, 0xe8, 0xce, 0x94, \ + 0x1e, 0x84, 0xbb, 0x44, 0x81, 0x62, 0x33, 0xff, 0x6b, 0x5d, 0x00, 0x9b, \ + 0x25, 0xf8, 0x8f, 0x0f, 0x9c, 0x4c, 0x4d, 0xe8, 0xd9, 0xa7, 0x99, 0xf9, \ + 0x51, 0x81, 0xc0, 0x9b, 0x1b, 0x31, 0x0d, 0xa6, 0xb3, 0x7c, 0x0e, 0x45, \ + 0xb8, 0x18, 0x64, 0x7e, 0x89, 0x0a, 0x2b, 0xa8, 0xc3, 0xe0, 0x4a, 0xbd, \ + 0xd4, 0x2f, 0x78, 0xc4, 0x62, 0x9b, 0xe9, 0x7e, 0x3f, 0x56, 0x46, 0x8f, \ + 0x17, 0xb7, 0x2a, 0xa0, 0x10, 0x70, 0xfd, 0xb1, 0xf1, 0x6b, 0x05, 0xdc, \ + 0xd1, 0x41, 0x0f, 0x8e, 0xa6, 0xb2, 0x88, 0x1a, 0x42, 0x61, 0x4f, 0xeb, \ + 0x26, 0x85, 0x59, 0x80, 0xba, 0x85, 0x54, 0xfe, 0xcf, 0xc7, 0x7b, 0x2f, \ + 0x6b, 0x59, 0xce, 0xac, 0xdc, 0x7c, 0xac, 0xf3, 0xc8, 0xd6, 0x12, 0x7e, \ + 0x64, 0xe8, 0x3c, 0x99, 0xa8, 0x8f, 0x4f, 0x11, 0xd9, 0x9c, 0x15, 0x4b, \ + 0x6a, 0x44, 0x92, 0x2d, 0x0c, 0xbf, 0xb1, 0x67, 0x96, 0xc9, 0xac, 0xce, \ + 0xd5, 0x19, 0xeb, 0x6f, 0x18, 0xeb, 0x6e, 0x04, 0x2d, 0x60, 0xac, 0xf4, \ + 0x7b, 0x79, 0xf0, 0x1a, 0x9b, 0xb5, 0xc3, 0x5d, 0xef, 0x7d, 0xc9, 0x05, \ + 0x99, 0x44, 0x81, 0x84, 0x75, 0xc7, 0xec, 0x00, 0x12, 0xfc, 0x7a, 0x4a, \ + 0x0b, 0x82, 0x07, 0xec, 0x6d, 0x86, 0x02, 0x4d, 0xfe, 0x9f, 0xc8, 0x92, \ + 0x48, 0xde, 0xf5, 0xb1, 0x9c, 0xe9, 0xc6, 0x89, 0xd0, 0xc1, 0x56, 0xe8, \ + 0xa4, 0xc6, 0x6a, 0x2e, 0x66, 0xc1, 0x9b, 0xfe, 0xd6, 0x3c, 0xb7 \ } /* END FILE */ @@ -717,7 +709,7 @@ "-----BEGIN CERTIFICATE-----\r\n" \ "MIIDNzCCAh+gAwIBAgIBAjANBgkqhkiG9w0BAQUFADA7MQswCQYDVQQGEwJOTDER\r\n" \ "MA8GA1UECgwIUG9sYXJTU0wxGTAXBgNVBAMMEFBvbGFyU1NMIFRlc3QgQ0EwHhcN\r\n" \ - "MTEwMjEyMTQ0NDA2WhcNMjEwMjEyMTQ0NDA2WjA0MQswCQYDVQQGEwJOTDERMA8G\r\n" \ + "MTkwMjEwMTQ0NDA2WhcNMjkwMjEwMTQ0NDA2WjA0MQswCQYDVQQGEwJOTDERMA8G\r\n" \ "A1UECgwIUG9sYXJTU0wxEjAQBgNVBAMMCWxvY2FsaG9zdDCCASIwDQYJKoZIhvcN\r\n" \ "AQEBBQADggEPADCCAQoCggEBAMFNo93nzR3RBNdJcriZrA545Do8Ss86ExbQWuTN\r\n" \ "owCIp+4ea5anUrSQ7y1yej4kmvy2NKwk9XfgJmSMnLAofaHa6ozmyRyWvP7BBFKz\r\n" \ @@ -726,88 +718,88 @@ "hYvai0Re4hjGYi/HZo36Xdh98yeJKQHFkA4/J/EwyEoO79bex8cna8cFPXrEAjya\r\n" \ "HT4P6DSYW8tzS1KW2BGiLICIaTla0w+w3lkvEcf36hIBMJcCAwEAAaNNMEswCQYD\r\n" \ "VR0TBAIwADAdBgNVHQ4EFgQUpQXoZLjc32APUBJNYKhkr02LQ5MwHwYDVR0jBBgw\r\n" \ - "FoAUtFrkpbPe0lL2udWmlQ/rPrzH/f8wDQYJKoZIhvcNAQEFBQADggEBAAFzC0rF\r\n" \ - "y6De8WMcdgQrEw3AhBHFjzqnxZw1ene4IBSC7lTw8rBSy3jOWQdPUWn+0y/pCeeF\r\n" \ - "kti6sevFdl1hLemGtd4q+T9TKEKGg3ND4ARfB5AUZZ9uEHq8WBkiwus5clGS17Qd\r\n" \ - "dS/TOisB59tQruLx1E1bPLtBKyqk4koC5WAULJwfpswGSyWJTpYwIpxcWE3D2tBu\r\n" \ - "UB6MZfXZFzWmWEOyKbeoXjXe8GBCGgHLywvYDsGQ36HSGtEsAvR2QaTLSxWYcfk1\r\n" \ - "fbDn4jSWkb4yZy1r01UEigFQtONieGwRFaUqEcFJHJvEEGVgh9keaVlOj2vrwf5r\r\n" \ - "4mN4lW7gLdenN6g=\r\n" \ + "FoAUtFrkpbPe0lL2udWmlQ/rPrzH/f8wDQYJKoZIhvcNAQEFBQADggEBAJklg3Q4\r\n" \ + "cB7v7BzsxM/vLyKccO6op0/gZzM4ghuLq2Y32kl0sM6kSNUUmduuq3u/+GmUZN2A\r\n" \ + "O/7c+Hw7hDFEIvZk98aBGjCLqn3DmgHIv8ToQ67nellQxx2Uj309PdgjNi/r9HOc\r\n" \ + "KNAYPbBcg6MJGWWj2TI6vNaceios/DhOYx5V0j5nfqSJ/pnU0g9Ign2LAhgYpGJE\r\n" \ + "iEM9wW7hEMkwmk0h/sqZsrJsGH5YsF/VThSq/JVO1e2mZH2vruyZKJVBq+8tDNYp\r\n" \ + "HkK6tSyVYQhzIt3StMJWKMl/o5k2AYz6tSC164+1oG+ML3LWg8XrGKa91H4UOKap\r\n" \ + "Awgk0+4m0T25cNs=\r\n" \ "-----END CERTIFICATE-----\r\n" /* END FILE */ /* This is taken from tests/data_files/server2.crt.der. */ /* BEGIN FILE binary macro TEST_SRV_CRT_RSA_SHA1_DER tests/data_files/server2.crt.der */ #define TEST_SRV_CRT_RSA_SHA1_DER { \ - 0x30, 0x82, 0x03, 0x37, 0x30, 0x82, 0x02, 0x1f, 0xa0, 0x03, 0x02, 0x01, \ - 0x02, 0x02, 0x01, 0x02, 0x30, 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, \ - 0xf7, 0x0d, 0x01, 0x01, 0x05, 0x05, 0x00, 0x30, 0x3b, 0x31, 0x0b, 0x30, \ - 0x09, 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, \ - 0x30, 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, \ - 0x61, 0x72, 0x53, 0x53, 0x4c, 0x31, 0x19, 0x30, 0x17, 0x06, 0x03, 0x55, \ - 0x04, 0x03, 0x0c, 0x10, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, \ - 0x20, 0x54, 0x65, 0x73, 0x74, 0x20, 0x43, 0x41, 0x30, 0x1e, 0x17, 0x0d, \ - 0x31, 0x31, 0x30, 0x32, 0x31, 0x32, 0x31, 0x34, 0x34, 0x34, 0x30, 0x36, \ - 0x5a, 0x17, 0x0d, 0x32, 0x31, 0x30, 0x32, 0x31, 0x32, 0x31, 0x34, 0x34, \ - 0x34, 0x30, 0x36, 0x5a, 0x30, 0x34, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, \ - 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, \ - 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, \ - 0x53, 0x4c, 0x31, 0x12, 0x30, 0x10, 0x06, 0x03, 0x55, 0x04, 0x03, 0x0c, \ - 0x09, 0x6c, 0x6f, 0x63, 0x61, 0x6c, 0x68, 0x6f, 0x73, 0x74, 0x30, 0x82, \ - 0x01, 0x22, 0x30, 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, \ - 0x01, 0x01, 0x01, 0x05, 0x00, 0x03, 0x82, 0x01, 0x0f, 0x00, 0x30, 0x82, \ - 0x01, 0x0a, 0x02, 0x82, 0x01, 0x01, 0x00, 0xc1, 0x4d, 0xa3, 0xdd, 0xe7, \ - 0xcd, 0x1d, 0xd1, 0x04, 0xd7, 0x49, 0x72, 0xb8, 0x99, 0xac, 0x0e, 0x78, \ - 0xe4, 0x3a, 0x3c, 0x4a, 0xcf, 0x3a, 0x13, 0x16, 0xd0, 0x5a, 0xe4, 0xcd, \ - 0xa3, 0x00, 0x88, 0xa7, 0xee, 0x1e, 0x6b, 0x96, 0xa7, 0x52, 0xb4, 0x90, \ - 0xef, 0x2d, 0x72, 0x7a, 0x3e, 0x24, 0x9a, 0xfc, 0xb6, 0x34, 0xac, 0x24, \ - 0xf5, 0x77, 0xe0, 0x26, 0x64, 0x8c, 0x9c, 0xb0, 0x28, 0x7d, 0xa1, 0xda, \ - 0xea, 0x8c, 0xe6, 0xc9, 0x1c, 0x96, 0xbc, 0xfe, 0xc1, 0x04, 0x52, 0xb3, \ - 0x36, 0xd4, 0xa3, 0xfa, 0xe1, 0xb1, 0x76, 0xd8, 0x90, 0xc1, 0x61, 0xb4, \ - 0x66, 0x52, 0x36, 0xa2, 0x26, 0x53, 0xaa, 0xab, 0x74, 0x5e, 0x07, 0x7d, \ - 0x19, 0x82, 0xdb, 0x2a, 0xd8, 0x1f, 0xa0, 0xd9, 0x0d, 0x1c, 0x2d, 0x49, \ - 0x66, 0xf7, 0x5b, 0x25, 0x73, 0x46, 0xe8, 0x0b, 0x8a, 0x4f, 0x69, 0x0c, \ - 0xb5, 0x00, 0x90, 0xe1, 0xda, 0x82, 0x10, 0x66, 0x7d, 0xae, 0x54, 0x2b, \ - 0x8b, 0x65, 0x79, 0x91, 0xa1, 0xe2, 0x61, 0xc3, 0xcd, 0x40, 0x49, 0x08, \ - 0xee, 0x68, 0x0c, 0xf1, 0x8b, 0x86, 0xd2, 0x46, 0xbf, 0xd0, 0xb8, 0xaa, \ - 0x11, 0x03, 0x1e, 0x7f, 0x56, 0xa8, 0x1a, 0x1e, 0x44, 0x18, 0x0f, 0x0f, \ - 0x85, 0x8b, 0xda, 0x8b, 0x44, 0x5e, 0xe2, 0x18, 0xc6, 0x62, 0x2f, 0xc7, \ - 0x66, 0x8d, 0xfa, 0x5d, 0xd8, 0x7d, 0xf3, 0x27, 0x89, 0x29, 0x01, 0xc5, \ - 0x90, 0x0e, 0x3f, 0x27, 0xf1, 0x30, 0xc8, 0x4a, 0x0e, 0xef, 0xd6, 0xde, \ - 0xc7, 0xc7, 0x27, 0x6b, 0xc7, 0x05, 0x3d, 0x7a, 0xc4, 0x02, 0x3c, 0x9a, \ - 0x1d, 0x3e, 0x0f, 0xe8, 0x34, 0x98, 0x5b, 0xcb, 0x73, 0x4b, 0x52, 0x96, \ - 0xd8, 0x11, 0xa2, 0x2c, 0x80, 0x88, 0x69, 0x39, 0x5a, 0xd3, 0x0f, 0xb0, \ - 0xde, 0x59, 0x2f, 0x11, 0xc7, 0xf7, 0xea, 0x12, 0x01, 0x30, 0x97, 0x02, \ - 0x03, 0x01, 0x00, 0x01, 0xa3, 0x4d, 0x30, 0x4b, 0x30, 0x09, 0x06, 0x03, \ - 0x55, 0x1d, 0x13, 0x04, 0x02, 0x30, 0x00, 0x30, 0x1d, 0x06, 0x03, 0x55, \ - 0x1d, 0x0e, 0x04, 0x16, 0x04, 0x14, 0xa5, 0x05, 0xe8, 0x64, 0xb8, 0xdc, \ - 0xdf, 0x60, 0x0f, 0x50, 0x12, 0x4d, 0x60, 0xa8, 0x64, 0xaf, 0x4d, 0x8b, \ - 0x43, 0x93, 0x30, 0x1f, 0x06, 0x03, 0x55, 0x1d, 0x23, 0x04, 0x18, 0x30, \ - 0x16, 0x80, 0x14, 0xb4, 0x5a, 0xe4, 0xa5, 0xb3, 0xde, 0xd2, 0x52, 0xf6, \ - 0xb9, 0xd5, 0xa6, 0x95, 0x0f, 0xeb, 0x3e, 0xbc, 0xc7, 0xfd, 0xff, 0x30, \ - 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x05, \ - 0x05, 0x00, 0x03, 0x82, 0x01, 0x01, 0x00, 0x01, 0x73, 0x0b, 0x4a, 0xc5, \ - 0xcb, 0xa0, 0xde, 0xf1, 0x63, 0x1c, 0x76, 0x04, 0x2b, 0x13, 0x0d, 0xc0, \ - 0x84, 0x11, 0xc5, 0x8f, 0x3a, 0xa7, 0xc5, 0x9c, 0x35, 0x7a, 0x77, 0xb8, \ - 0x20, 0x14, 0x82, 0xee, 0x54, 0xf0, 0xf2, 0xb0, 0x52, 0xcb, 0x78, 0xce, \ - 0x59, 0x07, 0x4f, 0x51, 0x69, 0xfe, 0xd3, 0x2f, 0xe9, 0x09, 0xe7, 0x85, \ - 0x92, 0xd8, 0xba, 0xb1, 0xeb, 0xc5, 0x76, 0x5d, 0x61, 0x2d, 0xe9, 0x86, \ - 0xb5, 0xde, 0x2a, 0xf9, 0x3f, 0x53, 0x28, 0x42, 0x86, 0x83, 0x73, 0x43, \ - 0xe0, 0x04, 0x5f, 0x07, 0x90, 0x14, 0x65, 0x9f, 0x6e, 0x10, 0x7a, 0xbc, \ - 0x58, 0x19, 0x22, 0xc2, 0xeb, 0x39, 0x72, 0x51, 0x92, 0xd7, 0xb4, 0x1d, \ - 0x75, 0x2f, 0xd3, 0x3a, 0x2b, 0x01, 0xe7, 0xdb, 0x50, 0xae, 0xe2, 0xf1, \ - 0xd4, 0x4d, 0x5b, 0x3c, 0xbb, 0x41, 0x2b, 0x2a, 0xa4, 0xe2, 0x4a, 0x02, \ - 0xe5, 0x60, 0x14, 0x2c, 0x9c, 0x1f, 0xa6, 0xcc, 0x06, 0x4b, 0x25, 0x89, \ - 0x4e, 0x96, 0x30, 0x22, 0x9c, 0x5c, 0x58, 0x4d, 0xc3, 0xda, 0xd0, 0x6e, \ - 0x50, 0x1e, 0x8c, 0x65, 0xf5, 0xd9, 0x17, 0x35, 0xa6, 0x58, 0x43, 0xb2, \ - 0x29, 0xb7, 0xa8, 0x5e, 0x35, 0xde, 0xf0, 0x60, 0x42, 0x1a, 0x01, 0xcb, \ - 0xcb, 0x0b, 0xd8, 0x0e, 0xc1, 0x90, 0xdf, 0xa1, 0xd2, 0x1a, 0xd1, 0x2c, \ - 0x02, 0xf4, 0x76, 0x41, 0xa4, 0xcb, 0x4b, 0x15, 0x98, 0x71, 0xf9, 0x35, \ - 0x7d, 0xb0, 0xe7, 0xe2, 0x34, 0x96, 0x91, 0xbe, 0x32, 0x67, 0x2d, 0x6b, \ - 0xd3, 0x55, 0x04, 0x8a, 0x01, 0x50, 0xb4, 0xe3, 0x62, 0x78, 0x6c, 0x11, \ - 0x15, 0xa5, 0x2a, 0x11, 0xc1, 0x49, 0x1c, 0x9b, 0xc4, 0x10, 0x65, 0x60, \ - 0x87, 0xd9, 0x1e, 0x69, 0x59, 0x4e, 0x8f, 0x6b, 0xeb, 0xc1, 0xfe, 0x6b, \ - 0xe2, 0x63, 0x78, 0x95, 0x6e, 0xe0, 0x2d, 0xd7, 0xa7, 0x37, 0xa8 \ + 0x30, 0x82, 0x03, 0x37, 0x30, 0x82, 0x02, 0x1f, 0xa0, 0x03, 0x02, 0x01, \ + 0x02, 0x02, 0x01, 0x02, 0x30, 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, \ + 0xf7, 0x0d, 0x01, 0x01, 0x05, 0x05, 0x00, 0x30, 0x3b, 0x31, 0x0b, 0x30, \ + 0x09, 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, \ + 0x30, 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, \ + 0x61, 0x72, 0x53, 0x53, 0x4c, 0x31, 0x19, 0x30, 0x17, 0x06, 0x03, 0x55, \ + 0x04, 0x03, 0x0c, 0x10, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, \ + 0x20, 0x54, 0x65, 0x73, 0x74, 0x20, 0x43, 0x41, 0x30, 0x1e, 0x17, 0x0d, \ + 0x31, 0x39, 0x30, 0x32, 0x31, 0x30, 0x31, 0x34, 0x34, 0x34, 0x30, 0x36, \ + 0x5a, 0x17, 0x0d, 0x32, 0x39, 0x30, 0x32, 0x31, 0x30, 0x31, 0x34, 0x34, \ + 0x34, 0x30, 0x36, 0x5a, 0x30, 0x34, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, \ + 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, \ + 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, \ + 0x53, 0x4c, 0x31, 0x12, 0x30, 0x10, 0x06, 0x03, 0x55, 0x04, 0x03, 0x0c, \ + 0x09, 0x6c, 0x6f, 0x63, 0x61, 0x6c, 0x68, 0x6f, 0x73, 0x74, 0x30, 0x82, \ + 0x01, 0x22, 0x30, 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, \ + 0x01, 0x01, 0x01, 0x05, 0x00, 0x03, 0x82, 0x01, 0x0f, 0x00, 0x30, 0x82, \ + 0x01, 0x0a, 0x02, 0x82, 0x01, 0x01, 0x00, 0xc1, 0x4d, 0xa3, 0xdd, 0xe7, \ + 0xcd, 0x1d, 0xd1, 0x04, 0xd7, 0x49, 0x72, 0xb8, 0x99, 0xac, 0x0e, 0x78, \ + 0xe4, 0x3a, 0x3c, 0x4a, 0xcf, 0x3a, 0x13, 0x16, 0xd0, 0x5a, 0xe4, 0xcd, \ + 0xa3, 0x00, 0x88, 0xa7, 0xee, 0x1e, 0x6b, 0x96, 0xa7, 0x52, 0xb4, 0x90, \ + 0xef, 0x2d, 0x72, 0x7a, 0x3e, 0x24, 0x9a, 0xfc, 0xb6, 0x34, 0xac, 0x24, \ + 0xf5, 0x77, 0xe0, 0x26, 0x64, 0x8c, 0x9c, 0xb0, 0x28, 0x7d, 0xa1, 0xda, \ + 0xea, 0x8c, 0xe6, 0xc9, 0x1c, 0x96, 0xbc, 0xfe, 0xc1, 0x04, 0x52, 0xb3, \ + 0x36, 0xd4, 0xa3, 0xfa, 0xe1, 0xb1, 0x76, 0xd8, 0x90, 0xc1, 0x61, 0xb4, \ + 0x66, 0x52, 0x36, 0xa2, 0x26, 0x53, 0xaa, 0xab, 0x74, 0x5e, 0x07, 0x7d, \ + 0x19, 0x82, 0xdb, 0x2a, 0xd8, 0x1f, 0xa0, 0xd9, 0x0d, 0x1c, 0x2d, 0x49, \ + 0x66, 0xf7, 0x5b, 0x25, 0x73, 0x46, 0xe8, 0x0b, 0x8a, 0x4f, 0x69, 0x0c, \ + 0xb5, 0x00, 0x90, 0xe1, 0xda, 0x82, 0x10, 0x66, 0x7d, 0xae, 0x54, 0x2b, \ + 0x8b, 0x65, 0x79, 0x91, 0xa1, 0xe2, 0x61, 0xc3, 0xcd, 0x40, 0x49, 0x08, \ + 0xee, 0x68, 0x0c, 0xf1, 0x8b, 0x86, 0xd2, 0x46, 0xbf, 0xd0, 0xb8, 0xaa, \ + 0x11, 0x03, 0x1e, 0x7f, 0x56, 0xa8, 0x1a, 0x1e, 0x44, 0x18, 0x0f, 0x0f, \ + 0x85, 0x8b, 0xda, 0x8b, 0x44, 0x5e, 0xe2, 0x18, 0xc6, 0x62, 0x2f, 0xc7, \ + 0x66, 0x8d, 0xfa, 0x5d, 0xd8, 0x7d, 0xf3, 0x27, 0x89, 0x29, 0x01, 0xc5, \ + 0x90, 0x0e, 0x3f, 0x27, 0xf1, 0x30, 0xc8, 0x4a, 0x0e, 0xef, 0xd6, 0xde, \ + 0xc7, 0xc7, 0x27, 0x6b, 0xc7, 0x05, 0x3d, 0x7a, 0xc4, 0x02, 0x3c, 0x9a, \ + 0x1d, 0x3e, 0x0f, 0xe8, 0x34, 0x98, 0x5b, 0xcb, 0x73, 0x4b, 0x52, 0x96, \ + 0xd8, 0x11, 0xa2, 0x2c, 0x80, 0x88, 0x69, 0x39, 0x5a, 0xd3, 0x0f, 0xb0, \ + 0xde, 0x59, 0x2f, 0x11, 0xc7, 0xf7, 0xea, 0x12, 0x01, 0x30, 0x97, 0x02, \ + 0x03, 0x01, 0x00, 0x01, 0xa3, 0x4d, 0x30, 0x4b, 0x30, 0x09, 0x06, 0x03, \ + 0x55, 0x1d, 0x13, 0x04, 0x02, 0x30, 0x00, 0x30, 0x1d, 0x06, 0x03, 0x55, \ + 0x1d, 0x0e, 0x04, 0x16, 0x04, 0x14, 0xa5, 0x05, 0xe8, 0x64, 0xb8, 0xdc, \ + 0xdf, 0x60, 0x0f, 0x50, 0x12, 0x4d, 0x60, 0xa8, 0x64, 0xaf, 0x4d, 0x8b, \ + 0x43, 0x93, 0x30, 0x1f, 0x06, 0x03, 0x55, 0x1d, 0x23, 0x04, 0x18, 0x30, \ + 0x16, 0x80, 0x14, 0xb4, 0x5a, 0xe4, 0xa5, 0xb3, 0xde, 0xd2, 0x52, 0xf6, \ + 0xb9, 0xd5, 0xa6, 0x95, 0x0f, 0xeb, 0x3e, 0xbc, 0xc7, 0xfd, 0xff, 0x30, \ + 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x05, \ + 0x05, 0x00, 0x03, 0x82, 0x01, 0x01, 0x00, 0x99, 0x25, 0x83, 0x74, 0x38, \ + 0x70, 0x1e, 0xef, 0xec, 0x1c, 0xec, 0xc4, 0xcf, 0xef, 0x2f, 0x22, 0x9c, \ + 0x70, 0xee, 0xa8, 0xa7, 0x4f, 0xe0, 0x67, 0x33, 0x38, 0x82, 0x1b, 0x8b, \ + 0xab, 0x66, 0x37, 0xda, 0x49, 0x74, 0xb0, 0xce, 0xa4, 0x48, 0xd5, 0x14, \ + 0x99, 0xdb, 0xae, 0xab, 0x7b, 0xbf, 0xf8, 0x69, 0x94, 0x64, 0xdd, 0x80, \ + 0x3b, 0xfe, 0xdc, 0xf8, 0x7c, 0x3b, 0x84, 0x31, 0x44, 0x22, 0xf6, 0x64, \ + 0xf7, 0xc6, 0x81, 0x1a, 0x30, 0x8b, 0xaa, 0x7d, 0xc3, 0x9a, 0x01, 0xc8, \ + 0xbf, 0xc4, 0xe8, 0x43, 0xae, 0xe7, 0x7a, 0x59, 0x50, 0xc7, 0x1d, 0x94, \ + 0x8f, 0x7d, 0x3d, 0x3d, 0xd8, 0x23, 0x36, 0x2f, 0xeb, 0xf4, 0x73, 0x9c, \ + 0x28, 0xd0, 0x18, 0x3d, 0xb0, 0x5c, 0x83, 0xa3, 0x09, 0x19, 0x65, 0xa3, \ + 0xd9, 0x32, 0x3a, 0xbc, 0xd6, 0x9c, 0x7a, 0x2a, 0x2c, 0xfc, 0x38, 0x4e, \ + 0x63, 0x1e, 0x55, 0xd2, 0x3e, 0x67, 0x7e, 0xa4, 0x89, 0xfe, 0x99, 0xd4, \ + 0xd2, 0x0f, 0x48, 0x82, 0x7d, 0x8b, 0x02, 0x18, 0x18, 0xa4, 0x62, 0x44, \ + 0x88, 0x43, 0x3d, 0xc1, 0x6e, 0xe1, 0x10, 0xc9, 0x30, 0x9a, 0x4d, 0x21, \ + 0xfe, 0xca, 0x99, 0xb2, 0xb2, 0x6c, 0x18, 0x7e, 0x58, 0xb0, 0x5f, 0xd5, \ + 0x4e, 0x14, 0xaa, 0xfc, 0x95, 0x4e, 0xd5, 0xed, 0xa6, 0x64, 0x7d, 0xaf, \ + 0xae, 0xec, 0x99, 0x28, 0x95, 0x41, 0xab, 0xef, 0x2d, 0x0c, 0xd6, 0x29, \ + 0x1e, 0x42, 0xba, 0xb5, 0x2c, 0x95, 0x61, 0x08, 0x73, 0x22, 0xdd, 0xd2, \ + 0xb4, 0xc2, 0x56, 0x28, 0xc9, 0x7f, 0xa3, 0x99, 0x36, 0x01, 0x8c, 0xfa, \ + 0xb5, 0x20, 0xb5, 0xeb, 0x8f, 0xb5, 0xa0, 0x6f, 0x8c, 0x2f, 0x72, 0xd6, \ + 0x83, 0xc5, 0xeb, 0x18, 0xa6, 0xbd, 0xd4, 0x7e, 0x14, 0x38, 0xa6, 0xa9, \ + 0x03, 0x08, 0x24, 0xd3, 0xee, 0x26, 0xd1, 0x3d, 0xb9, 0x70, 0xdb \ } /* END FILE */ @@ -966,71 +958,64 @@ /* BEGIN FILE string macro TEST_CLI_CRT_EC_PEM tests/data_files/cli2.crt */ #define TEST_CLI_CRT_EC_PEM \ "-----BEGIN CERTIFICATE-----\r\n" \ - "MIICLDCCAbKgAwIBAgIBDTAKBggqhkjOPQQDAjA+MQswCQYDVQQGEwJOTDERMA8G\r\n" \ - "A1UEChMIUG9sYXJTU0wxHDAaBgNVBAMTE1BvbGFyc3NsIFRlc3QgRUMgQ0EwHhcN\r\n" \ - "MTMwOTI0MTU1MjA0WhcNMjMwOTIyMTU1MjA0WjBBMQswCQYDVQQGEwJOTDERMA8G\r\n" \ - "A1UEChMIUG9sYXJTU0wxHzAdBgNVBAMTFlBvbGFyU1NMIFRlc3QgQ2xpZW50IDIw\r\n" \ - "WTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAARX5a6xc9/TrLuTuIH/Eq7u5lOszlVT\r\n" \ - "9jQOzC7jYyUL35ji81xgNpbA1RgUcOV/n9VLRRjlsGzVXPiWj4dwo+THo4GdMIGa\r\n" \ - "MAkGA1UdEwQCMAAwHQYDVR0OBBYEFHoAX4Zk/OBd5REQO7LmO8QmP8/iMG4GA1Ud\r\n" \ - "IwRnMGWAFJ1tICRJAT8ry3i1Gbx+JMnb+zZ8oUKkQDA+MQswCQYDVQQGEwJOTDER\r\n" \ - "MA8GA1UEChMIUG9sYXJTU0wxHDAaBgNVBAMTE1BvbGFyc3NsIFRlc3QgRUMgQ0GC\r\n" \ - "CQDBQ+J+YkPM6DAKBggqhkjOPQQDAgNoADBlAjBKZQ17IIOimbmoD/yN7o89u3BM\r\n" \ - "lgOsjnhw3fIOoLIWy2WOGsk/LGF++DzvrRzuNiACMQCd8iem1XS4JK7haj8xocpU\r\n" \ - "LwjQje5PDGHfd3h9tP38Qknu5bJqws0md2KOKHyeV0U=\r\n" \ + "MIIB3zCCAWOgAwIBAgIBDTAMBggqhkjOPQQDAgUAMD4xCzAJBgNVBAYTAk5MMREw\r\n" \ + "DwYDVQQKDAhQb2xhclNTTDEcMBoGA1UEAwwTUG9sYXJTU0wgVGVzdCBFQyBDQTAe\r\n" \ + "Fw0xOTAyMTAxNDQ0MDBaFw0yOTAyMTAxNDQ0MDBaMEExCzAJBgNVBAYTAk5MMREw\r\n" \ + "DwYDVQQKDAhQb2xhclNTTDEfMB0GA1UEAwwWUG9sYXJTU0wgVGVzdCBDbGllbnQg\r\n" \ + "MjBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABFflrrFz39Osu5O4gf8Sru7mU6zO\r\n" \ + "VVP2NA7MLuNjJQvfmOLzXGA2lsDVGBRw5X+f1UtFGOWwbNVc+JaPh3Cj5MejTTBL\r\n" \ + "MAkGA1UdEwQCMAAwHQYDVR0OBBYEFHoAX4Zk/OBd5REQO7LmO8QmP8/iMB8GA1Ud\r\n" \ + "IwQYMBaAFJ1tICRJAT8ry3i1Gbx+JMnb+zZ8MAwGCCqGSM49BAMCBQADaAAwZQIx\r\n" \ + "AMqme4DKMldUlplDET9Q6Eptre7uUWKhsLOF+zPkKDlfzpIkJYEFgcloDHGYw80u\r\n" \ + "IgIwNftyPXsabTqMM7iEHgVpX/GRozKklY9yQI/5eoA6gGW7Y+imuGR/oao5ySOb\r\n" \ + "a9Vk\r\n" \ "-----END CERTIFICATE-----\r\n" /* END FILE */ /* This is generated from tests/data_files/cli2.crt.der using `xxd -i`. */ /* BEGIN FILE binary macro TEST_CLI_CRT_EC_DER tests/data_files/cli2.crt.der */ #define TEST_CLI_CRT_EC_DER { \ - 0x30, 0x82, 0x02, 0x2c, 0x30, 0x82, 0x01, 0xb2, 0xa0, 0x03, 0x02, 0x01, \ - 0x02, 0x02, 0x01, 0x0d, 0x30, 0x0a, 0x06, 0x08, 0x2a, 0x86, 0x48, 0xce, \ - 0x3d, 0x04, 0x03, 0x02, 0x30, 0x3e, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, \ - 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, \ - 0x03, 0x55, 0x04, 0x0a, 0x13, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, \ - 0x53, 0x4c, 0x31, 0x1c, 0x30, 0x1a, 0x06, 0x03, 0x55, 0x04, 0x03, 0x13, \ - 0x13, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x73, 0x73, 0x6c, 0x20, 0x54, 0x65, \ - 0x73, 0x74, 0x20, 0x45, 0x43, 0x20, 0x43, 0x41, 0x30, 0x1e, 0x17, 0x0d, \ - 0x31, 0x33, 0x30, 0x39, 0x32, 0x34, 0x31, 0x35, 0x35, 0x32, 0x30, 0x34, \ - 0x5a, 0x17, 0x0d, 0x32, 0x33, 0x30, 0x39, 0x32, 0x32, 0x31, 0x35, 0x35, \ - 0x32, 0x30, 0x34, 0x5a, 0x30, 0x41, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, \ - 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, \ - 0x03, 0x55, 0x04, 0x0a, 0x13, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, \ - 0x53, 0x4c, 0x31, 0x1f, 0x30, 0x1d, 0x06, 0x03, 0x55, 0x04, 0x03, 0x13, \ - 0x16, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, 0x20, 0x54, 0x65, \ - 0x73, 0x74, 0x20, 0x43, 0x6c, 0x69, 0x65, 0x6e, 0x74, 0x20, 0x32, 0x30, \ - 0x59, 0x30, 0x13, 0x06, 0x07, 0x2a, 0x86, 0x48, 0xce, 0x3d, 0x02, 0x01, \ - 0x06, 0x08, 0x2a, 0x86, 0x48, 0xce, 0x3d, 0x03, 0x01, 0x07, 0x03, 0x42, \ - 0x00, 0x04, 0x57, 0xe5, 0xae, 0xb1, 0x73, 0xdf, 0xd3, 0xac, 0xbb, 0x93, \ - 0xb8, 0x81, 0xff, 0x12, 0xae, 0xee, 0xe6, 0x53, 0xac, 0xce, 0x55, 0x53, \ - 0xf6, 0x34, 0x0e, 0xcc, 0x2e, 0xe3, 0x63, 0x25, 0x0b, 0xdf, 0x98, 0xe2, \ - 0xf3, 0x5c, 0x60, 0x36, 0x96, 0xc0, 0xd5, 0x18, 0x14, 0x70, 0xe5, 0x7f, \ - 0x9f, 0xd5, 0x4b, 0x45, 0x18, 0xe5, 0xb0, 0x6c, 0xd5, 0x5c, 0xf8, 0x96, \ - 0x8f, 0x87, 0x70, 0xa3, 0xe4, 0xc7, 0xa3, 0x81, 0x9d, 0x30, 0x81, 0x9a, \ - 0x30, 0x09, 0x06, 0x03, 0x55, 0x1d, 0x13, 0x04, 0x02, 0x30, 0x00, 0x30, \ - 0x1d, 0x06, 0x03, 0x55, 0x1d, 0x0e, 0x04, 0x16, 0x04, 0x14, 0x7a, 0x00, \ - 0x5f, 0x86, 0x64, 0xfc, 0xe0, 0x5d, 0xe5, 0x11, 0x10, 0x3b, 0xb2, 0xe6, \ - 0x3b, 0xc4, 0x26, 0x3f, 0xcf, 0xe2, 0x30, 0x6e, 0x06, 0x03, 0x55, 0x1d, \ - 0x23, 0x04, 0x67, 0x30, 0x65, 0x80, 0x14, 0x9d, 0x6d, 0x20, 0x24, 0x49, \ - 0x01, 0x3f, 0x2b, 0xcb, 0x78, 0xb5, 0x19, 0xbc, 0x7e, 0x24, 0xc9, 0xdb, \ - 0xfb, 0x36, 0x7c, 0xa1, 0x42, 0xa4, 0x40, 0x30, 0x3e, 0x31, 0x0b, 0x30, \ - 0x09, 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, \ - 0x30, 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, 0x13, 0x08, 0x50, 0x6f, 0x6c, \ - 0x61, 0x72, 0x53, 0x53, 0x4c, 0x31, 0x1c, 0x30, 0x1a, 0x06, 0x03, 0x55, \ - 0x04, 0x03, 0x13, 0x13, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x73, 0x73, 0x6c, \ - 0x20, 0x54, 0x65, 0x73, 0x74, 0x20, 0x45, 0x43, 0x20, 0x43, 0x41, 0x82, \ - 0x09, 0x00, 0xc1, 0x43, 0xe2, 0x7e, 0x62, 0x43, 0xcc, 0xe8, 0x30, 0x0a, \ - 0x06, 0x08, 0x2a, 0x86, 0x48, 0xce, 0x3d, 0x04, 0x03, 0x02, 0x03, 0x68, \ - 0x00, 0x30, 0x65, 0x02, 0x30, 0x4a, 0x65, 0x0d, 0x7b, 0x20, 0x83, 0xa2, \ - 0x99, 0xb9, 0xa8, 0x0f, 0xfc, 0x8d, 0xee, 0x8f, 0x3d, 0xbb, 0x70, 0x4c, \ - 0x96, 0x03, 0xac, 0x8e, 0x78, 0x70, 0xdd, 0xf2, 0x0e, 0xa0, 0xb2, 0x16, \ - 0xcb, 0x65, 0x8e, 0x1a, 0xc9, 0x3f, 0x2c, 0x61, 0x7e, 0xf8, 0x3c, 0xef, \ - 0xad, 0x1c, 0xee, 0x36, 0x20, 0x02, 0x31, 0x00, 0x9d, 0xf2, 0x27, 0xa6, \ - 0xd5, 0x74, 0xb8, 0x24, 0xae, 0xe1, 0x6a, 0x3f, 0x31, 0xa1, 0xca, 0x54, \ - 0x2f, 0x08, 0xd0, 0x8d, 0xee, 0x4f, 0x0c, 0x61, 0xdf, 0x77, 0x78, 0x7d, \ - 0xb4, 0xfd, 0xfc, 0x42, 0x49, 0xee, 0xe5, 0xb2, 0x6a, 0xc2, 0xcd, 0x26, \ - 0x77, 0x62, 0x8e, 0x28, 0x7c, 0x9e, 0x57, 0x45 \ + 0x30, 0x82, 0x01, 0xdf, 0x30, 0x82, 0x01, 0x63, 0xa0, 0x03, 0x02, 0x01, \ + 0x02, 0x02, 0x01, 0x0d, 0x30, 0x0c, 0x06, 0x08, 0x2a, 0x86, 0x48, 0xce, \ + 0x3d, 0x04, 0x03, 0x02, 0x05, 0x00, 0x30, 0x3e, 0x31, 0x0b, 0x30, 0x09, \ + 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, \ + 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, 0x61, \ + 0x72, 0x53, 0x53, 0x4c, 0x31, 0x1c, 0x30, 0x1a, 0x06, 0x03, 0x55, 0x04, \ + 0x03, 0x0c, 0x13, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, 0x20, \ + 0x54, 0x65, 0x73, 0x74, 0x20, 0x45, 0x43, 0x20, 0x43, 0x41, 0x30, 0x1e, \ + 0x17, 0x0d, 0x31, 0x39, 0x30, 0x32, 0x31, 0x30, 0x31, 0x34, 0x34, 0x34, \ + 0x30, 0x30, 0x5a, 0x17, 0x0d, 0x32, 0x39, 0x30, 0x32, 0x31, 0x30, 0x31, \ + 0x34, 0x34, 0x34, 0x30, 0x30, 0x5a, 0x30, 0x41, 0x31, 0x0b, 0x30, 0x09, \ + 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, \ + 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, 0x61, \ + 0x72, 0x53, 0x53, 0x4c, 0x31, 0x1f, 0x30, 0x1d, 0x06, 0x03, 0x55, 0x04, \ + 0x03, 0x0c, 0x16, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, 0x20, \ + 0x54, 0x65, 0x73, 0x74, 0x20, 0x43, 0x6c, 0x69, 0x65, 0x6e, 0x74, 0x20, \ + 0x32, 0x30, 0x59, 0x30, 0x13, 0x06, 0x07, 0x2a, 0x86, 0x48, 0xce, 0x3d, \ + 0x02, 0x01, 0x06, 0x08, 0x2a, 0x86, 0x48, 0xce, 0x3d, 0x03, 0x01, 0x07, \ + 0x03, 0x42, 0x00, 0x04, 0x57, 0xe5, 0xae, 0xb1, 0x73, 0xdf, 0xd3, 0xac, \ + 0xbb, 0x93, 0xb8, 0x81, 0xff, 0x12, 0xae, 0xee, 0xe6, 0x53, 0xac, 0xce, \ + 0x55, 0x53, 0xf6, 0x34, 0x0e, 0xcc, 0x2e, 0xe3, 0x63, 0x25, 0x0b, 0xdf, \ + 0x98, 0xe2, 0xf3, 0x5c, 0x60, 0x36, 0x96, 0xc0, 0xd5, 0x18, 0x14, 0x70, \ + 0xe5, 0x7f, 0x9f, 0xd5, 0x4b, 0x45, 0x18, 0xe5, 0xb0, 0x6c, 0xd5, 0x5c, \ + 0xf8, 0x96, 0x8f, 0x87, 0x70, 0xa3, 0xe4, 0xc7, 0xa3, 0x4d, 0x30, 0x4b, \ + 0x30, 0x09, 0x06, 0x03, 0x55, 0x1d, 0x13, 0x04, 0x02, 0x30, 0x00, 0x30, \ + 0x1d, 0x06, 0x03, 0x55, 0x1d, 0x0e, 0x04, 0x16, 0x04, 0x14, 0x7a, 0x00, \ + 0x5f, 0x86, 0x64, 0xfc, 0xe0, 0x5d, 0xe5, 0x11, 0x10, 0x3b, 0xb2, 0xe6, \ + 0x3b, 0xc4, 0x26, 0x3f, 0xcf, 0xe2, 0x30, 0x1f, 0x06, 0x03, 0x55, 0x1d, \ + 0x23, 0x04, 0x18, 0x30, 0x16, 0x80, 0x14, 0x9d, 0x6d, 0x20, 0x24, 0x49, \ + 0x01, 0x3f, 0x2b, 0xcb, 0x78, 0xb5, 0x19, 0xbc, 0x7e, 0x24, 0xc9, 0xdb, \ + 0xfb, 0x36, 0x7c, 0x30, 0x0c, 0x06, 0x08, 0x2a, 0x86, 0x48, 0xce, 0x3d, \ + 0x04, 0x03, 0x02, 0x05, 0x00, 0x03, 0x68, 0x00, 0x30, 0x65, 0x02, 0x31, \ + 0x00, 0xca, 0xa6, 0x7b, 0x80, 0xca, 0x32, 0x57, 0x54, 0x96, 0x99, 0x43, \ + 0x11, 0x3f, 0x50, 0xe8, 0x4a, 0x6d, 0xad, 0xee, 0xee, 0x51, 0x62, 0xa1, \ + 0xb0, 0xb3, 0x85, 0xfb, 0x33, 0xe4, 0x28, 0x39, 0x5f, 0xce, 0x92, 0x24, \ + 0x25, 0x81, 0x05, 0x81, 0xc9, 0x68, 0x0c, 0x71, 0x98, 0xc3, 0xcd, 0x2e, \ + 0x22, 0x02, 0x30, 0x35, 0xfb, 0x72, 0x3d, 0x7b, 0x1a, 0x6d, 0x3a, 0x8c, \ + 0x33, 0xb8, 0x84, 0x1e, 0x05, 0x69, 0x5f, 0xf1, 0x91, 0xa3, 0x32, 0xa4, \ + 0x95, 0x8f, 0x72, 0x40, 0x8f, 0xf9, 0x7a, 0x80, 0x3a, 0x80, 0x65, 0xbb, \ + 0x63, 0xe8, 0xa6, 0xb8, 0x64, 0x7f, 0xa1, 0xaa, 0x39, 0xc9, 0x23, 0x9b, \ + 0x6b, 0xd5, 0x64 \ } /* END FILE */ @@ -1067,7 +1052,7 @@ "-----BEGIN CERTIFICATE-----\r\n" \ "MIIDPzCCAiegAwIBAgIBBDANBgkqhkiG9w0BAQsFADA7MQswCQYDVQQGEwJOTDER\r\n" \ "MA8GA1UECgwIUG9sYXJTU0wxGTAXBgNVBAMMEFBvbGFyU1NMIFRlc3QgQ0EwHhcN\r\n" \ - "MTEwMjEyMTQ0NDA2WhcNMjEwMjEyMTQ0NDA2WjA8MQswCQYDVQQGEwJOTDERMA8G\r\n" \ + "MTkwMjEwMTQ0NDA2WhcNMjkwMjEwMTQ0NDA2WjA8MQswCQYDVQQGEwJOTDERMA8G\r\n" \ "A1UECgwIUG9sYXJTU0wxGjAYBgNVBAMMEVBvbGFyU1NMIENsaWVudCAyMIIBIjAN\r\n" \ "BgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAyHTEzLn5tXnpRdkUYLB9u5Pyax6f\r\n" \ "M60Nj4o8VmXl3ETZzGaFB9X4J7BKNdBjngpuG7fa8H6r7gwQk4ZJGDTzqCrSV/Uu\r\n" \ @@ -1077,12 +1062,12 @@ "/DZrtenNLQNiTrM9AM+vdqBpVoNq0qjU51Bx5rU2BXcFbXvI5MT9TNUhXwIDAQAB\r\n" \ "o00wSzAJBgNVHRMEAjAAMB0GA1UdDgQWBBRxoQBzckAvVHZeM/xSj7zx3WtGITAf\r\n" \ "BgNVHSMEGDAWgBS0WuSls97SUva51aaVD+s+vMf9/zANBgkqhkiG9w0BAQsFAAOC\r\n" \ - "AQEAlHabem2Tu69VUN7EipwnQn1dIHdgvT5i+iQHpSxY1crPnBbAeSdAXwsVEqLQ\r\n" \ - "gOOIAQD5VIITNuoGgo4i+4OpNh9u7ZkpRHla+/swsfrFWRRbBNP5Bcu74AGLstwU\r\n" \ - "zM8gIkBiyfM1Q1qDQISV9trlCG6O8vh8dp/rbI3rfzo99BOHXgFCrzXjCuW4vDsF\r\n" \ - "r+Dao26bX3sJ6UnEWg1H3o2x6PpUcvQ36h71/bz4TEbbUUEpe02V4QWuL+wrhHJL\r\n" \ - "U7o3SVE3Og7jPF8sat0a50YUWhwEFI256m02KAXLg89ueUyYKEr6rNwhcvXJpvU9\r\n" \ - "giIVvd0Sbjjnn7NC4VDbcXV8vw==\r\n" \ + "AQEAXidv1d4pLlBiKWED95rMycBdgDcgyNqJxakFkRfRyA2y1mlyTn7uBXRkNLY5\r\n" \ + "ZFzK82GCjk2Q2OD4RZSCPAJJqLpHHU34t71ciffvy2KK81YvrxczRhMAE64i+qna\r\n" \ + "yP3Td2XuWJR05PVPoSemsNELs9gWttdnYy3ce+EY2Y0n7Rsi7982EeLIAA7H6ca4\r\n" \ + "2Es/NUH//JZJT32OP0doMxeDRA+vplkKqTLLWf7dX26LIriBkBaRCgR5Yv9LBPFc\r\n" \ + "NOtpzu/LbrY7QFXKJMI+JXDudCsOn8KCmiA4d6Emisqfh3V3485l7HEQNcvLTxlD\r\n" \ + "6zDQyi0/ykYUYZkwQTK1N2Nvlw==\r\n" \ "-----END CERTIFICATE-----\r\n" /* END FILE */ @@ -1090,76 +1075,76 @@ using `xxd -i.` */ /* BEGIN FILE binary macro TEST_CLI_CRT_RSA_DER tests/data_files/cli-rsa-sha256.crt.der */ #define TEST_CLI_CRT_RSA_DER { \ - 0x30, 0x82, 0x03, 0x3f, 0x30, 0x82, 0x02, 0x27, 0xa0, 0x03, 0x02, 0x01, \ - 0x02, 0x02, 0x01, 0x04, 0x30, 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, \ - 0xf7, 0x0d, 0x01, 0x01, 0x0b, 0x05, 0x00, 0x30, 0x3b, 0x31, 0x0b, 0x30, \ - 0x09, 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, \ - 0x30, 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, \ - 0x61, 0x72, 0x53, 0x53, 0x4c, 0x31, 0x19, 0x30, 0x17, 0x06, 0x03, 0x55, \ - 0x04, 0x03, 0x0c, 0x10, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, \ - 0x20, 0x54, 0x65, 0x73, 0x74, 0x20, 0x43, 0x41, 0x30, 0x1e, 0x17, 0x0d, \ - 0x31, 0x31, 0x30, 0x32, 0x31, 0x32, 0x31, 0x34, 0x34, 0x34, 0x30, 0x36, \ - 0x5a, 0x17, 0x0d, 0x32, 0x31, 0x30, 0x32, 0x31, 0x32, 0x31, 0x34, 0x34, \ - 0x34, 0x30, 0x36, 0x5a, 0x30, 0x3c, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, \ - 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, \ - 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, \ - 0x53, 0x4c, 0x31, 0x1a, 0x30, 0x18, 0x06, 0x03, 0x55, 0x04, 0x03, 0x0c, \ - 0x11, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, 0x20, 0x43, 0x6c, \ - 0x69, 0x65, 0x6e, 0x74, 0x20, 0x32, 0x30, 0x82, 0x01, 0x22, 0x30, 0x0d, \ - 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x01, 0x05, \ - 0x00, 0x03, 0x82, 0x01, 0x0f, 0x00, 0x30, 0x82, 0x01, 0x0a, 0x02, 0x82, \ - 0x01, 0x01, 0x00, 0xc8, 0x74, 0xc4, 0xcc, 0xb9, 0xf9, 0xb5, 0x79, 0xe9, \ - 0x45, 0xd9, 0x14, 0x60, 0xb0, 0x7d, 0xbb, 0x93, 0xf2, 0x6b, 0x1e, 0x9f, \ - 0x33, 0xad, 0x0d, 0x8f, 0x8a, 0x3c, 0x56, 0x65, 0xe5, 0xdc, 0x44, 0xd9, \ - 0xcc, 0x66, 0x85, 0x07, 0xd5, 0xf8, 0x27, 0xb0, 0x4a, 0x35, 0xd0, 0x63, \ - 0x9e, 0x0a, 0x6e, 0x1b, 0xb7, 0xda, 0xf0, 0x7e, 0xab, 0xee, 0x0c, 0x10, \ - 0x93, 0x86, 0x49, 0x18, 0x34, 0xf3, 0xa8, 0x2a, 0xd2, 0x57, 0xf5, 0x2e, \ - 0xd4, 0x2f, 0x77, 0x29, 0x84, 0x61, 0x4d, 0x82, 0x50, 0x8f, 0xa7, 0x95, \ - 0x48, 0x70, 0xf5, 0x6e, 0x4d, 0xb2, 0xd5, 0x13, 0xc3, 0xd2, 0x1a, 0xed, \ - 0xe6, 0x43, 0xea, 0x42, 0x14, 0xeb, 0x74, 0xea, 0xc0, 0xed, 0x1f, 0xd4, \ - 0x57, 0x4e, 0xa9, 0xf3, 0xa8, 0xed, 0xd2, 0xe0, 0xc1, 0x30, 0x71, 0x30, \ - 0x32, 0x30, 0xd5, 0xd3, 0xf6, 0x08, 0xd0, 0x56, 0x4f, 0x46, 0x8e, 0xf2, \ - 0x5f, 0xf9, 0x3d, 0x67, 0x91, 0x88, 0x30, 0x2e, 0x42, 0xb2, 0xdf, 0x7d, \ - 0xfb, 0xe5, 0x0c, 0x77, 0xff, 0xec, 0x31, 0xc0, 0x78, 0x8f, 0xbf, 0xc2, \ - 0x7f, 0xca, 0xad, 0x6c, 0x21, 0xd6, 0x8d, 0xd9, 0x8b, 0x6a, 0x8e, 0x6f, \ - 0xe0, 0x9b, 0xf8, 0x10, 0x56, 0xcc, 0xb3, 0x8e, 0x13, 0x15, 0xe6, 0x34, \ - 0x04, 0x66, 0xc7, 0xee, 0xf9, 0x36, 0x0e, 0x6a, 0x95, 0xf6, 0x09, 0x9a, \ - 0x06, 0x67, 0xf4, 0x65, 0x71, 0xf8, 0xca, 0xa4, 0xb1, 0x25, 0xe0, 0xfe, \ - 0x3c, 0x8b, 0x35, 0x04, 0x67, 0xba, 0xe0, 0x4f, 0x76, 0x85, 0xfc, 0x7f, \ - 0xfc, 0x36, 0x6b, 0xb5, 0xe9, 0xcd, 0x2d, 0x03, 0x62, 0x4e, 0xb3, 0x3d, \ - 0x00, 0xcf, 0xaf, 0x76, 0xa0, 0x69, 0x56, 0x83, 0x6a, 0xd2, 0xa8, 0xd4, \ - 0xe7, 0x50, 0x71, 0xe6, 0xb5, 0x36, 0x05, 0x77, 0x05, 0x6d, 0x7b, 0xc8, \ - 0xe4, 0xc4, 0xfd, 0x4c, 0xd5, 0x21, 0x5f, 0x02, 0x03, 0x01, 0x00, 0x01, \ - 0xa3, 0x4d, 0x30, 0x4b, 0x30, 0x09, 0x06, 0x03, 0x55, 0x1d, 0x13, 0x04, \ - 0x02, 0x30, 0x00, 0x30, 0x1d, 0x06, 0x03, 0x55, 0x1d, 0x0e, 0x04, 0x16, \ - 0x04, 0x14, 0x71, 0xa1, 0x00, 0x73, 0x72, 0x40, 0x2f, 0x54, 0x76, 0x5e, \ - 0x33, 0xfc, 0x52, 0x8f, 0xbc, 0xf1, 0xdd, 0x6b, 0x46, 0x21, 0x30, 0x1f, \ - 0x06, 0x03, 0x55, 0x1d, 0x23, 0x04, 0x18, 0x30, 0x16, 0x80, 0x14, 0xb4, \ - 0x5a, 0xe4, 0xa5, 0xb3, 0xde, 0xd2, 0x52, 0xf6, 0xb9, 0xd5, 0xa6, 0x95, \ - 0x0f, 0xeb, 0x3e, 0xbc, 0xc7, 0xfd, 0xff, 0x30, 0x0d, 0x06, 0x09, 0x2a, \ - 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x0b, 0x05, 0x00, 0x03, 0x82, \ - 0x01, 0x01, 0x00, 0x94, 0x76, 0x9b, 0x7a, 0x6d, 0x93, 0xbb, 0xaf, 0x55, \ - 0x50, 0xde, 0xc4, 0x8a, 0x9c, 0x27, 0x42, 0x7d, 0x5d, 0x20, 0x77, 0x60, \ - 0xbd, 0x3e, 0x62, 0xfa, 0x24, 0x07, 0xa5, 0x2c, 0x58, 0xd5, 0xca, 0xcf, \ - 0x9c, 0x16, 0xc0, 0x79, 0x27, 0x40, 0x5f, 0x0b, 0x15, 0x12, 0xa2, 0xd0, \ - 0x80, 0xe3, 0x88, 0x01, 0x00, 0xf9, 0x54, 0x82, 0x13, 0x36, 0xea, 0x06, \ - 0x82, 0x8e, 0x22, 0xfb, 0x83, 0xa9, 0x36, 0x1f, 0x6e, 0xed, 0x99, 0x29, \ - 0x44, 0x79, 0x5a, 0xfb, 0xfb, 0x30, 0xb1, 0xfa, 0xc5, 0x59, 0x14, 0x5b, \ - 0x04, 0xd3, 0xf9, 0x05, 0xcb, 0xbb, 0xe0, 0x01, 0x8b, 0xb2, 0xdc, 0x14, \ - 0xcc, 0xcf, 0x20, 0x22, 0x40, 0x62, 0xc9, 0xf3, 0x35, 0x43, 0x5a, 0x83, \ - 0x40, 0x84, 0x95, 0xf6, 0xda, 0xe5, 0x08, 0x6e, 0x8e, 0xf2, 0xf8, 0x7c, \ - 0x76, 0x9f, 0xeb, 0x6c, 0x8d, 0xeb, 0x7f, 0x3a, 0x3d, 0xf4, 0x13, 0x87, \ - 0x5e, 0x01, 0x42, 0xaf, 0x35, 0xe3, 0x0a, 0xe5, 0xb8, 0xbc, 0x3b, 0x05, \ - 0xaf, 0xe0, 0xda, 0xa3, 0x6e, 0x9b, 0x5f, 0x7b, 0x09, 0xe9, 0x49, 0xc4, \ - 0x5a, 0x0d, 0x47, 0xde, 0x8d, 0xb1, 0xe8, 0xfa, 0x54, 0x72, 0xf4, 0x37, \ - 0xea, 0x1e, 0xf5, 0xfd, 0xbc, 0xf8, 0x4c, 0x46, 0xdb, 0x51, 0x41, 0x29, \ - 0x7b, 0x4d, 0x95, 0xe1, 0x05, 0xae, 0x2f, 0xec, 0x2b, 0x84, 0x72, 0x4b, \ - 0x53, 0xba, 0x37, 0x49, 0x51, 0x37, 0x3a, 0x0e, 0xe3, 0x3c, 0x5f, 0x2c, \ - 0x6a, 0xdd, 0x1a, 0xe7, 0x46, 0x14, 0x5a, 0x1c, 0x04, 0x14, 0x8d, 0xb9, \ - 0xea, 0x6d, 0x36, 0x28, 0x05, 0xcb, 0x83, 0xcf, 0x6e, 0x79, 0x4c, 0x98, \ - 0x28, 0x4a, 0xfa, 0xac, 0xdc, 0x21, 0x72, 0xf5, 0xc9, 0xa6, 0xf5, 0x3d, \ - 0x82, 0x22, 0x15, 0xbd, 0xdd, 0x12, 0x6e, 0x38, 0xe7, 0x9f, 0xb3, 0x42, \ - 0xe1, 0x50, 0xdb, 0x71, 0x75, 0x7c, 0xbf \ + 0x30, 0x82, 0x03, 0x3f, 0x30, 0x82, 0x02, 0x27, 0xa0, 0x03, 0x02, 0x01, \ + 0x02, 0x02, 0x01, 0x04, 0x30, 0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, \ + 0xf7, 0x0d, 0x01, 0x01, 0x0b, 0x05, 0x00, 0x30, 0x3b, 0x31, 0x0b, 0x30, \ + 0x09, 0x06, 0x03, 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, \ + 0x30, 0x0f, 0x06, 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, \ + 0x61, 0x72, 0x53, 0x53, 0x4c, 0x31, 0x19, 0x30, 0x17, 0x06, 0x03, 0x55, \ + 0x04, 0x03, 0x0c, 0x10, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, \ + 0x20, 0x54, 0x65, 0x73, 0x74, 0x20, 0x43, 0x41, 0x30, 0x1e, 0x17, 0x0d, \ + 0x31, 0x39, 0x30, 0x32, 0x31, 0x30, 0x31, 0x34, 0x34, 0x34, 0x30, 0x36, \ + 0x5a, 0x17, 0x0d, 0x32, 0x39, 0x30, 0x32, 0x31, 0x30, 0x31, 0x34, 0x34, \ + 0x34, 0x30, 0x36, 0x5a, 0x30, 0x3c, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, \ + 0x55, 0x04, 0x06, 0x13, 0x02, 0x4e, 0x4c, 0x31, 0x11, 0x30, 0x0f, 0x06, \ + 0x03, 0x55, 0x04, 0x0a, 0x0c, 0x08, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, \ + 0x53, 0x4c, 0x31, 0x1a, 0x30, 0x18, 0x06, 0x03, 0x55, 0x04, 0x03, 0x0c, \ + 0x11, 0x50, 0x6f, 0x6c, 0x61, 0x72, 0x53, 0x53, 0x4c, 0x20, 0x43, 0x6c, \ + 0x69, 0x65, 0x6e, 0x74, 0x20, 0x32, 0x30, 0x82, 0x01, 0x22, 0x30, 0x0d, \ + 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x01, 0x05, \ + 0x00, 0x03, 0x82, 0x01, 0x0f, 0x00, 0x30, 0x82, 0x01, 0x0a, 0x02, 0x82, \ + 0x01, 0x01, 0x00, 0xc8, 0x74, 0xc4, 0xcc, 0xb9, 0xf9, 0xb5, 0x79, 0xe9, \ + 0x45, 0xd9, 0x14, 0x60, 0xb0, 0x7d, 0xbb, 0x93, 0xf2, 0x6b, 0x1e, 0x9f, \ + 0x33, 0xad, 0x0d, 0x8f, 0x8a, 0x3c, 0x56, 0x65, 0xe5, 0xdc, 0x44, 0xd9, \ + 0xcc, 0x66, 0x85, 0x07, 0xd5, 0xf8, 0x27, 0xb0, 0x4a, 0x35, 0xd0, 0x63, \ + 0x9e, 0x0a, 0x6e, 0x1b, 0xb7, 0xda, 0xf0, 0x7e, 0xab, 0xee, 0x0c, 0x10, \ + 0x93, 0x86, 0x49, 0x18, 0x34, 0xf3, 0xa8, 0x2a, 0xd2, 0x57, 0xf5, 0x2e, \ + 0xd4, 0x2f, 0x77, 0x29, 0x84, 0x61, 0x4d, 0x82, 0x50, 0x8f, 0xa7, 0x95, \ + 0x48, 0x70, 0xf5, 0x6e, 0x4d, 0xb2, 0xd5, 0x13, 0xc3, 0xd2, 0x1a, 0xed, \ + 0xe6, 0x43, 0xea, 0x42, 0x14, 0xeb, 0x74, 0xea, 0xc0, 0xed, 0x1f, 0xd4, \ + 0x57, 0x4e, 0xa9, 0xf3, 0xa8, 0xed, 0xd2, 0xe0, 0xc1, 0x30, 0x71, 0x30, \ + 0x32, 0x30, 0xd5, 0xd3, 0xf6, 0x08, 0xd0, 0x56, 0x4f, 0x46, 0x8e, 0xf2, \ + 0x5f, 0xf9, 0x3d, 0x67, 0x91, 0x88, 0x30, 0x2e, 0x42, 0xb2, 0xdf, 0x7d, \ + 0xfb, 0xe5, 0x0c, 0x77, 0xff, 0xec, 0x31, 0xc0, 0x78, 0x8f, 0xbf, 0xc2, \ + 0x7f, 0xca, 0xad, 0x6c, 0x21, 0xd6, 0x8d, 0xd9, 0x8b, 0x6a, 0x8e, 0x6f, \ + 0xe0, 0x9b, 0xf8, 0x10, 0x56, 0xcc, 0xb3, 0x8e, 0x13, 0x15, 0xe6, 0x34, \ + 0x04, 0x66, 0xc7, 0xee, 0xf9, 0x36, 0x0e, 0x6a, 0x95, 0xf6, 0x09, 0x9a, \ + 0x06, 0x67, 0xf4, 0x65, 0x71, 0xf8, 0xca, 0xa4, 0xb1, 0x25, 0xe0, 0xfe, \ + 0x3c, 0x8b, 0x35, 0x04, 0x67, 0xba, 0xe0, 0x4f, 0x76, 0x85, 0xfc, 0x7f, \ + 0xfc, 0x36, 0x6b, 0xb5, 0xe9, 0xcd, 0x2d, 0x03, 0x62, 0x4e, 0xb3, 0x3d, \ + 0x00, 0xcf, 0xaf, 0x76, 0xa0, 0x69, 0x56, 0x83, 0x6a, 0xd2, 0xa8, 0xd4, \ + 0xe7, 0x50, 0x71, 0xe6, 0xb5, 0x36, 0x05, 0x77, 0x05, 0x6d, 0x7b, 0xc8, \ + 0xe4, 0xc4, 0xfd, 0x4c, 0xd5, 0x21, 0x5f, 0x02, 0x03, 0x01, 0x00, 0x01, \ + 0xa3, 0x4d, 0x30, 0x4b, 0x30, 0x09, 0x06, 0x03, 0x55, 0x1d, 0x13, 0x04, \ + 0x02, 0x30, 0x00, 0x30, 0x1d, 0x06, 0x03, 0x55, 0x1d, 0x0e, 0x04, 0x16, \ + 0x04, 0x14, 0x71, 0xa1, 0x00, 0x73, 0x72, 0x40, 0x2f, 0x54, 0x76, 0x5e, \ + 0x33, 0xfc, 0x52, 0x8f, 0xbc, 0xf1, 0xdd, 0x6b, 0x46, 0x21, 0x30, 0x1f, \ + 0x06, 0x03, 0x55, 0x1d, 0x23, 0x04, 0x18, 0x30, 0x16, 0x80, 0x14, 0xb4, \ + 0x5a, 0xe4, 0xa5, 0xb3, 0xde, 0xd2, 0x52, 0xf6, 0xb9, 0xd5, 0xa6, 0x95, \ + 0x0f, 0xeb, 0x3e, 0xbc, 0xc7, 0xfd, 0xff, 0x30, 0x0d, 0x06, 0x09, 0x2a, \ + 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x01, 0x0b, 0x05, 0x00, 0x03, 0x82, \ + 0x01, 0x01, 0x00, 0x5e, 0x27, 0x6f, 0xd5, 0xde, 0x29, 0x2e, 0x50, 0x62, \ + 0x29, 0x61, 0x03, 0xf7, 0x9a, 0xcc, 0xc9, 0xc0, 0x5d, 0x80, 0x37, 0x20, \ + 0xc8, 0xda, 0x89, 0xc5, 0xa9, 0x05, 0x91, 0x17, 0xd1, 0xc8, 0x0d, 0xb2, \ + 0xd6, 0x69, 0x72, 0x4e, 0x7e, 0xee, 0x05, 0x74, 0x64, 0x34, 0xb6, 0x39, \ + 0x64, 0x5c, 0xca, 0xf3, 0x61, 0x82, 0x8e, 0x4d, 0x90, 0xd8, 0xe0, 0xf8, \ + 0x45, 0x94, 0x82, 0x3c, 0x02, 0x49, 0xa8, 0xba, 0x47, 0x1d, 0x4d, 0xf8, \ + 0xb7, 0xbd, 0x5c, 0x89, 0xf7, 0xef, 0xcb, 0x62, 0x8a, 0xf3, 0x56, 0x2f, \ + 0xaf, 0x17, 0x33, 0x46, 0x13, 0x00, 0x13, 0xae, 0x22, 0xfa, 0xa9, 0xda, \ + 0xc8, 0xfd, 0xd3, 0x77, 0x65, 0xee, 0x58, 0x94, 0x74, 0xe4, 0xf5, 0x4f, \ + 0xa1, 0x27, 0xa6, 0xb0, 0xd1, 0x0b, 0xb3, 0xd8, 0x16, 0xb6, 0xd7, 0x67, \ + 0x63, 0x2d, 0xdc, 0x7b, 0xe1, 0x18, 0xd9, 0x8d, 0x27, 0xed, 0x1b, 0x22, \ + 0xef, 0xdf, 0x36, 0x11, 0xe2, 0xc8, 0x00, 0x0e, 0xc7, 0xe9, 0xc6, 0xb8, \ + 0xd8, 0x4b, 0x3f, 0x35, 0x41, 0xff, 0xfc, 0x96, 0x49, 0x4f, 0x7d, 0x8e, \ + 0x3f, 0x47, 0x68, 0x33, 0x17, 0x83, 0x44, 0x0f, 0xaf, 0xa6, 0x59, 0x0a, \ + 0xa9, 0x32, 0xcb, 0x59, 0xfe, 0xdd, 0x5f, 0x6e, 0x8b, 0x22, 0xb8, 0x81, \ + 0x90, 0x16, 0x91, 0x0a, 0x04, 0x79, 0x62, 0xff, 0x4b, 0x04, 0xf1, 0x5c, \ + 0x34, 0xeb, 0x69, 0xce, 0xef, 0xcb, 0x6e, 0xb6, 0x3b, 0x40, 0x55, 0xca, \ + 0x24, 0xc2, 0x3e, 0x25, 0x70, 0xee, 0x74, 0x2b, 0x0e, 0x9f, 0xc2, 0x82, \ + 0x9a, 0x20, 0x38, 0x77, 0xa1, 0x26, 0x8a, 0xca, 0x9f, 0x87, 0x75, 0x77, \ + 0xe3, 0xce, 0x65, 0xec, 0x71, 0x10, 0x35, 0xcb, 0xcb, 0x4f, 0x19, 0x43, \ + 0xeb, 0x30, 0xd0, 0xca, 0x2d, 0x3f, 0xca, 0x46, 0x14, 0x61, 0x99, 0x30, \ + 0x41, 0x32, 0xb5, 0x37, 0x63, 0x6f, 0x97 \ } /* END FILE */ diff --git a/thirdparty/mbedtls/library/ecdsa.c b/thirdparty/mbedtls/library/ecdsa.c index dc19384d61..2b4800642d 100644 --- a/thirdparty/mbedtls/library/ecdsa.c +++ b/thirdparty/mbedtls/library/ecdsa.c @@ -172,11 +172,11 @@ static void ecdsa_restart_det_free( mbedtls_ecdsa_restart_det_ctx *ctx ) } #endif /* MBEDTLS_ECDSA_DETERMINISTIC */ -#define ECDSA_RS_ECP &rs_ctx->ecp +#define ECDSA_RS_ECP ( rs_ctx == NULL ? NULL : &rs_ctx->ecp ) /* Utility macro for checking and updating ops budget */ #define ECDSA_BUDGET( ops ) \ - MBEDTLS_MPI_CHK( mbedtls_ecp_check_budget( grp, &rs_ctx->ecp, ops ) ); + MBEDTLS_MPI_CHK( mbedtls_ecp_check_budget( grp, ECDSA_RS_ECP, ops ) ); /* Call this when entering a function that needs its own sub-context */ #define ECDSA_RS_ENTER( SUB ) do { \ @@ -254,6 +254,8 @@ static int ecdsa_sign_restartable( mbedtls_ecp_group *grp, mbedtls_mpi *r, mbedtls_mpi *s, const mbedtls_mpi *d, const unsigned char *buf, size_t blen, int (*f_rng)(void *, unsigned char *, size_t), void *p_rng, + int (*f_rng_blind)(void *, unsigned char *, size_t), + void *p_rng_blind, mbedtls_ecdsa_restart_ctx *rs_ctx ) { int ret, key_tries, sign_tries; @@ -323,7 +325,9 @@ static int ecdsa_sign_restartable( mbedtls_ecp_group *grp, mul: #endif MBEDTLS_MPI_CHK( mbedtls_ecp_mul_restartable( grp, &R, pk, &grp->G, - f_rng, p_rng, ECDSA_RS_ECP ) ); + f_rng_blind, + p_rng_blind, + ECDSA_RS_ECP ) ); MBEDTLS_MPI_CHK( mbedtls_mpi_mod_mpi( pr, &R.X, &grp->N ) ); } while( mbedtls_mpi_cmp_int( pr, 0 ) == 0 ); @@ -349,7 +353,8 @@ modn: * Generate a random value to blind inv_mod in next step, * avoiding a potential timing leak. */ - MBEDTLS_MPI_CHK( mbedtls_ecp_gen_privkey( grp, &t, f_rng, p_rng ) ); + MBEDTLS_MPI_CHK( mbedtls_ecp_gen_privkey( grp, &t, f_rng_blind, + p_rng_blind ) ); /* * Step 6: compute s = (e + r * d) / k = t (e + rd) / (kt) mod n @@ -392,8 +397,9 @@ int mbedtls_ecdsa_sign( mbedtls_ecp_group *grp, mbedtls_mpi *r, mbedtls_mpi *s, ECDSA_VALIDATE_RET( f_rng != NULL ); ECDSA_VALIDATE_RET( buf != NULL || blen == 0 ); + /* Use the same RNG for both blinding and ephemeral key generation */ return( ecdsa_sign_restartable( grp, r, s, d, buf, blen, - f_rng, p_rng, NULL ) ); + f_rng, p_rng, f_rng, p_rng, NULL ) ); } #endif /* !MBEDTLS_ECDSA_SIGN_ALT */ @@ -405,6 +411,8 @@ static int ecdsa_sign_det_restartable( mbedtls_ecp_group *grp, mbedtls_mpi *r, mbedtls_mpi *s, const mbedtls_mpi *d, const unsigned char *buf, size_t blen, mbedtls_md_type_t md_alg, + int (*f_rng_blind)(void *, unsigned char *, size_t), + void *p_rng_blind, mbedtls_ecdsa_restart_ctx *rs_ctx ) { int ret; @@ -451,8 +459,70 @@ sign: ret = mbedtls_ecdsa_sign( grp, r, s, d, buf, blen, mbedtls_hmac_drbg_random, p_rng ); #else - ret = ecdsa_sign_restartable( grp, r, s, d, buf, blen, - mbedtls_hmac_drbg_random, p_rng, rs_ctx ); + if( f_rng_blind != NULL ) + ret = ecdsa_sign_restartable( grp, r, s, d, buf, blen, + mbedtls_hmac_drbg_random, p_rng, + f_rng_blind, p_rng_blind, rs_ctx ); + else + { + mbedtls_hmac_drbg_context *p_rng_blind_det; + +#if !defined(MBEDTLS_ECP_RESTARTABLE) + /* + * To avoid reusing rng_ctx and risking incorrect behavior we seed a + * second HMAC-DRBG with the same seed. We also apply a label to avoid + * reusing the bits of the ephemeral key for blinding and eliminate the + * risk that they leak this way. + */ + const char* blind_label = "BLINDING CONTEXT"; + mbedtls_hmac_drbg_context rng_ctx_blind; + + mbedtls_hmac_drbg_init( &rng_ctx_blind ); + p_rng_blind_det = &rng_ctx_blind; + + mbedtls_hmac_drbg_seed_buf( p_rng_blind_det, md_info, + data, 2 * grp_len ); + ret = mbedtls_hmac_drbg_update_ret( p_rng_blind_det, + (const unsigned char*) blind_label, + strlen( blind_label ) ); + if( ret != 0 ) + { + mbedtls_hmac_drbg_free( &rng_ctx_blind ); + goto cleanup; + } +#else + /* + * In the case of restartable computations we would either need to store + * the second RNG in the restart context too or set it up at every + * restart. The first option would penalize the correct application of + * the function and the second would defeat the purpose of the + * restartable feature. + * + * Therefore in this case we reuse the original RNG. This comes with the + * price that the resulting signature might not be a valid deterministic + * ECDSA signature with a very low probability (same magnitude as + * successfully guessing the private key). However even then it is still + * a valid ECDSA signature. + */ + p_rng_blind_det = p_rng; +#endif /* MBEDTLS_ECP_RESTARTABLE */ + + /* + * Since the output of the RNGs is always the same for the same key and + * message, this limits the efficiency of blinding and leaks information + * through side channels. After mbedtls_ecdsa_sign_det() is removed NULL + * won't be a valid value for f_rng_blind anymore. Therefore it should + * be checked by the caller and this branch and check can be removed. + */ + ret = ecdsa_sign_restartable( grp, r, s, d, buf, blen, + mbedtls_hmac_drbg_random, p_rng, + mbedtls_hmac_drbg_random, p_rng_blind_det, + rs_ctx ); + +#if !defined(MBEDTLS_ECP_RESTARTABLE) + mbedtls_hmac_drbg_free( &rng_ctx_blind ); +#endif + } #endif /* MBEDTLS_ECDSA_SIGN_ALT */ cleanup: @@ -465,19 +535,40 @@ cleanup: } /* - * Deterministic signature wrapper + * Deterministic signature wrappers */ -int mbedtls_ecdsa_sign_det( mbedtls_ecp_group *grp, mbedtls_mpi *r, mbedtls_mpi *s, - const mbedtls_mpi *d, const unsigned char *buf, size_t blen, - mbedtls_md_type_t md_alg ) +int mbedtls_ecdsa_sign_det( mbedtls_ecp_group *grp, mbedtls_mpi *r, + mbedtls_mpi *s, const mbedtls_mpi *d, + const unsigned char *buf, size_t blen, + mbedtls_md_type_t md_alg ) +{ + ECDSA_VALIDATE_RET( grp != NULL ); + ECDSA_VALIDATE_RET( r != NULL ); + ECDSA_VALIDATE_RET( s != NULL ); + ECDSA_VALIDATE_RET( d != NULL ); + ECDSA_VALIDATE_RET( buf != NULL || blen == 0 ); + + return( ecdsa_sign_det_restartable( grp, r, s, d, buf, blen, md_alg, + NULL, NULL, NULL ) ); +} + +int mbedtls_ecdsa_sign_det_ext( mbedtls_ecp_group *grp, mbedtls_mpi *r, + mbedtls_mpi *s, const mbedtls_mpi *d, + const unsigned char *buf, size_t blen, + mbedtls_md_type_t md_alg, + int (*f_rng_blind)(void *, unsigned char *, + size_t), + void *p_rng_blind ) { ECDSA_VALIDATE_RET( grp != NULL ); ECDSA_VALIDATE_RET( r != NULL ); ECDSA_VALIDATE_RET( s != NULL ); ECDSA_VALIDATE_RET( d != NULL ); ECDSA_VALIDATE_RET( buf != NULL || blen == 0 ); + ECDSA_VALIDATE_RET( f_rng_blind != NULL ); - return( ecdsa_sign_det_restartable( grp, r, s, d, buf, blen, md_alg, NULL ) ); + return( ecdsa_sign_det_restartable( grp, r, s, d, buf, blen, md_alg, + f_rng_blind, p_rng_blind, NULL ) ); } #endif /* MBEDTLS_ECDSA_DETERMINISTIC */ @@ -656,11 +747,9 @@ int mbedtls_ecdsa_write_signature_restartable( mbedtls_ecdsa_context *ctx, mbedtls_mpi_init( &s ); #if defined(MBEDTLS_ECDSA_DETERMINISTIC) - (void) f_rng; - (void) p_rng; - MBEDTLS_MPI_CHK( ecdsa_sign_det_restartable( &ctx->grp, &r, &s, &ctx->d, - hash, hlen, md_alg, rs_ctx ) ); + hash, hlen, md_alg, f_rng, + p_rng, rs_ctx ) ); #else (void) md_alg; @@ -668,8 +757,10 @@ int mbedtls_ecdsa_write_signature_restartable( mbedtls_ecdsa_context *ctx, MBEDTLS_MPI_CHK( mbedtls_ecdsa_sign( &ctx->grp, &r, &s, &ctx->d, hash, hlen, f_rng, p_rng ) ); #else + /* Use the same RNG for both blinding and ephemeral key generation */ MBEDTLS_MPI_CHK( ecdsa_sign_restartable( &ctx->grp, &r, &s, &ctx->d, - hash, hlen, f_rng, p_rng, rs_ctx ) ); + hash, hlen, f_rng, p_rng, f_rng, + p_rng, rs_ctx ) ); #endif /* MBEDTLS_ECDSA_SIGN_ALT */ #endif /* MBEDTLS_ECDSA_DETERMINISTIC */ diff --git a/thirdparty/mbedtls/library/ecjpake.c b/thirdparty/mbedtls/library/ecjpake.c index be941b14b1..1845c936ab 100644 --- a/thirdparty/mbedtls/library/ecjpake.c +++ b/thirdparty/mbedtls/library/ecjpake.c @@ -226,7 +226,7 @@ static int ecjpake_hash( const mbedtls_md_info_t *md_info, p += id_len; /* Compute hash */ - mbedtls_md( md_info, buf, p - buf, hash ); + MBEDTLS_MPI_CHK( mbedtls_md( md_info, buf, p - buf, hash ) ); /* Turn it into an integer mod n */ MBEDTLS_MPI_CHK( mbedtls_mpi_read_binary( h, hash, @@ -951,7 +951,7 @@ static const unsigned char ecjpake_test_pms[] = { 0xb4, 0x38, 0xf7, 0x19, 0xd3, 0xc4, 0xf3, 0x51 }; -/* Load my private keys and generate the correponding public keys */ +/* Load my private keys and generate the corresponding public keys */ static int ecjpake_test_load( mbedtls_ecjpake_context *ctx, const unsigned char *xm1, size_t len1, const unsigned char *xm2, size_t len2 ) diff --git a/thirdparty/mbedtls/library/entropy_poll.c b/thirdparty/mbedtls/library/entropy_poll.c index 4556f88a55..ba56b70f77 100644 --- a/thirdparty/mbedtls/library/entropy_poll.c +++ b/thirdparty/mbedtls/library/entropy_poll.c @@ -61,28 +61,43 @@ #define _WIN32_WINNT 0x0400 #endif #include <windows.h> -#include <wincrypt.h> +#include <bcrypt.h> +#if defined(_MSC_VER) && _MSC_VER <= 1600 +/* Visual Studio 2010 and earlier issue a warning when both <stdint.h> and + * <intsafe.h> are included, as they redefine a number of <TYPE>_MAX constants. + * These constants are guaranteed to be the same, though, so we suppress the + * warning when including intsafe.h. + */ +#pragma warning( push ) +#pragma warning( disable : 4005 ) +#endif +#include <intsafe.h> +#if defined(_MSC_VER) && _MSC_VER <= 1600 +#pragma warning( pop ) +#endif int mbedtls_platform_entropy_poll( void *data, unsigned char *output, size_t len, size_t *olen ) { - HCRYPTPROV provider; + ULONG len_as_ulong = 0; ((void) data); *olen = 0; - if( CryptAcquireContext( &provider, NULL, NULL, - PROV_RSA_FULL, CRYPT_VERIFYCONTEXT ) == FALSE ) + /* + * BCryptGenRandom takes ULONG for size, which is smaller than size_t on + * 64-bit Windows platforms. Ensure len's value can be safely converted into + * a ULONG. + */ + if ( FAILED( SizeTToULong( len, &len_as_ulong ) ) ) { return( MBEDTLS_ERR_ENTROPY_SOURCE_FAILED ); } - if( CryptGenRandom( provider, (DWORD) len, output ) == FALSE ) + if ( !BCRYPT_SUCCESS( BCryptGenRandom( NULL, output, len_as_ulong, BCRYPT_USE_SYSTEM_PREFERRED_RNG ) ) ) { - CryptReleaseContext( provider, 0 ); return( MBEDTLS_ERR_ENTROPY_SOURCE_FAILED ); } - CryptReleaseContext( provider, 0 ); *olen = len; return( 0 ); diff --git a/thirdparty/mbedtls/library/error.c b/thirdparty/mbedtls/library/error.c index 12312a0562..c596f0bcc5 100644 --- a/thirdparty/mbedtls/library/error.c +++ b/thirdparty/mbedtls/library/error.c @@ -567,7 +567,7 @@ void mbedtls_strerror( int ret, char *buf, size_t buflen ) if( use_ret == -(MBEDTLS_ERR_X509_BUFFER_TOO_SMALL) ) mbedtls_snprintf( buf, buflen, "X509 - Destination buffer is too small" ); if( use_ret == -(MBEDTLS_ERR_X509_FATAL_ERROR) ) - mbedtls_snprintf( buf, buflen, "X509 - A fatal error occured, eg the chain is too long or the vrfy callback failed" ); + mbedtls_snprintf( buf, buflen, "X509 - A fatal error occurred, eg the chain is too long or the vrfy callback failed" ); #endif /* MBEDTLS_X509_USE_C || MBEDTLS_X509_CREATE_C */ // END generated code diff --git a/thirdparty/mbedtls/library/havege.c b/thirdparty/mbedtls/library/havege.c index 54f897c6e7..c139e1db03 100644 --- a/thirdparty/mbedtls/library/havege.c +++ b/thirdparty/mbedtls/library/havege.c @@ -38,8 +38,19 @@ #include "mbedtls/timing.h" #include "mbedtls/platform_util.h" +#include <limits.h> #include <string.h> +/* If int isn't capable of storing 2^32 distinct values, the code of this + * module may cause a processor trap or a miscalculation. If int is more + * than 32 bits, the code may not calculate the intended values. */ +#if INT_MIN + 1 != -0x7fffffff +#error "The HAVEGE module requires int to be exactly 32 bits, with INT_MIN = -2^31." +#endif +#if UINT_MAX != 0xffffffff +#error "The HAVEGE module requires unsigned to be exactly 32 bits." +#endif + /* ------------------------------------------------------------------------ * On average, one iteration accesses two 8-word blocks in the havege WALK * table, and generates 16 words in the RES array. @@ -54,7 +65,7 @@ * ------------------------------------------------------------------------ */ -#define SWAP(X,Y) { int *T = (X); (X) = (Y); (Y) = T; } +#define SWAP(X,Y) { unsigned *T = (X); (X) = (Y); (Y) = T; } #define TST1_ENTER if( PTEST & 1 ) { PTEST ^= 3; PTEST >>= 1; #define TST2_ENTER if( PTEST & 1 ) { PTEST ^= 3; PTEST >>= 1; @@ -77,7 +88,7 @@ PTX = (PT1 >> 18) & 7; \ PT1 &= 0x1FFF; \ PT2 &= 0x1FFF; \ - CLK = (int) mbedtls_timing_hardclock(); \ + CLK = (unsigned) mbedtls_timing_hardclock(); \ \ i = 0; \ A = &WALK[PT1 ]; RES[i++] ^= *A; \ @@ -100,7 +111,7 @@ \ IN = (*A >> (5)) ^ (*A << (27)) ^ CLK; \ *A = (*B >> (6)) ^ (*B << (26)) ^ CLK; \ - *B = IN; CLK = (int) mbedtls_timing_hardclock(); \ + *B = IN; CLK = (unsigned) mbedtls_timing_hardclock(); \ *C = (*C >> (7)) ^ (*C << (25)) ^ CLK; \ *D = (*D >> (8)) ^ (*D << (24)) ^ CLK; \ \ @@ -151,19 +162,20 @@ PT1 ^= (PT2 ^ 0x10) & 0x10; \ \ for( n++, i = 0; i < 16; i++ ) \ - hs->pool[n % MBEDTLS_HAVEGE_COLLECT_SIZE] ^= RES[i]; + POOL[n % MBEDTLS_HAVEGE_COLLECT_SIZE] ^= RES[i]; /* * Entropy gathering function */ static void havege_fill( mbedtls_havege_state *hs ) { - int i, n = 0; - int U1, U2, *A, *B, *C, *D; - int PT1, PT2, *WALK, RES[16]; - int PTX, PTY, CLK, PTEST, IN; + unsigned i, n = 0; + unsigned U1, U2, *A, *B, *C, *D; + unsigned PT1, PT2, *WALK, *POOL, RES[16]; + unsigned PTX, PTY, CLK, PTEST, IN; - WALK = hs->WALK; + WALK = (unsigned *) hs->WALK; + POOL = (unsigned *) hs->pool; PT1 = hs->PT1; PT2 = hs->PT2; diff --git a/thirdparty/mbedtls/library/hmac_drbg.c b/thirdparty/mbedtls/library/hmac_drbg.c index c50330e7d8..50d88bd54b 100644 --- a/thirdparty/mbedtls/library/hmac_drbg.c +++ b/thirdparty/mbedtls/library/hmac_drbg.c @@ -149,20 +149,32 @@ int mbedtls_hmac_drbg_seed_buf( mbedtls_hmac_drbg_context *ctx, } /* - * HMAC_DRBG reseeding: 10.1.2.4 (arabic) + 9.2 (Roman) + * Internal function used both for seeding and reseeding the DRBG. + * Comments starting with arabic numbers refer to section 10.1.2.4 + * of SP800-90A, while roman numbers refer to section 9.2. */ -int mbedtls_hmac_drbg_reseed( mbedtls_hmac_drbg_context *ctx, - const unsigned char *additional, size_t len ) +static int hmac_drbg_reseed_core( mbedtls_hmac_drbg_context *ctx, + const unsigned char *additional, size_t len, + int use_nonce ) { unsigned char seed[MBEDTLS_HMAC_DRBG_MAX_SEED_INPUT]; - size_t seedlen; + size_t seedlen = 0; int ret; - /* III. Check input length */ - if( len > MBEDTLS_HMAC_DRBG_MAX_INPUT || - ctx->entropy_len + len > MBEDTLS_HMAC_DRBG_MAX_SEED_INPUT ) { - return( MBEDTLS_ERR_HMAC_DRBG_INPUT_TOO_BIG ); + size_t total_entropy_len; + + if( use_nonce == 0 ) + total_entropy_len = ctx->entropy_len; + else + total_entropy_len = ctx->entropy_len * 3 / 2; + + /* III. Check input length */ + if( len > MBEDTLS_HMAC_DRBG_MAX_INPUT || + total_entropy_len + len > MBEDTLS_HMAC_DRBG_MAX_SEED_INPUT ) + { + return( MBEDTLS_ERR_HMAC_DRBG_INPUT_TOO_BIG ); + } } memset( seed, 0, MBEDTLS_HMAC_DRBG_MAX_SEED_INPUT ); @@ -170,9 +182,32 @@ int mbedtls_hmac_drbg_reseed( mbedtls_hmac_drbg_context *ctx, /* IV. Gather entropy_len bytes of entropy for the seed */ if( ( ret = ctx->f_entropy( ctx->p_entropy, seed, ctx->entropy_len ) ) != 0 ) + { return( MBEDTLS_ERR_HMAC_DRBG_ENTROPY_SOURCE_FAILED ); + } + seedlen += ctx->entropy_len; + + /* For initial seeding, allow adding of nonce generated + * from the entropy source. See Sect 8.6.7 in SP800-90A. */ + if( use_nonce ) + { + /* Note: We don't merge the two calls to f_entropy() in order + * to avoid requesting too much entropy from f_entropy() + * at once. Specifically, if the underlying digest is not + * SHA-1, 3 / 2 * entropy_len is at least 36 Bytes, which + * is larger than the maximum of 32 Bytes that our own + * entropy source implementation can emit in a single + * call in configurations disabling SHA-512. */ + if( ( ret = ctx->f_entropy( ctx->p_entropy, + seed + seedlen, + ctx->entropy_len / 2 ) ) != 0 ) + { + return( MBEDTLS_ERR_HMAC_DRBG_ENTROPY_SOURCE_FAILED ); + } + + seedlen += ctx->entropy_len / 2; + } - seedlen = ctx->entropy_len; /* 1. Concatenate entropy and additional data if any */ if( additional != NULL && len != 0 ) @@ -195,7 +230,19 @@ exit: } /* + * HMAC_DRBG reseeding: 10.1.2.4 + 9.2 + */ +int mbedtls_hmac_drbg_reseed( mbedtls_hmac_drbg_context *ctx, + const unsigned char *additional, size_t len ) +{ + return( hmac_drbg_reseed_core( ctx, additional, len, 0 ) ); +} + +/* * HMAC_DRBG initialisation (10.1.2.3 + 9.1) + * + * The nonce is not passed as a separate parameter but extracted + * from the entropy source as suggested in 8.6.7. */ int mbedtls_hmac_drbg_seed( mbedtls_hmac_drbg_context *ctx, const mbedtls_md_info_t * md_info, @@ -205,7 +252,7 @@ int mbedtls_hmac_drbg_seed( mbedtls_hmac_drbg_context *ctx, size_t len ) { int ret; - size_t entropy_len, md_size; + size_t md_size; if( ( ret = mbedtls_md_setup( &ctx->md_ctx, md_info, 1 ) ) != 0 ) return( ret ); @@ -233,20 +280,15 @@ int mbedtls_hmac_drbg_seed( mbedtls_hmac_drbg_context *ctx, * * (This also matches the sizes used in the NIST test vectors.) */ - entropy_len = md_size <= 20 ? 16 : /* 160-bits hash -> 128 bits */ - md_size <= 28 ? 24 : /* 224-bits hash -> 192 bits */ - 32; /* better (256+) -> 256 bits */ - - /* - * For initialisation, use more entropy to emulate a nonce - * (Again, matches test vectors.) - */ - ctx->entropy_len = entropy_len * 3 / 2; + ctx->entropy_len = md_size <= 20 ? 16 : /* 160-bits hash -> 128 bits */ + md_size <= 28 ? 24 : /* 224-bits hash -> 192 bits */ + 32; /* better (256+) -> 256 bits */ - if( ( ret = mbedtls_hmac_drbg_reseed( ctx, custom, len ) ) != 0 ) + if( ( ret = hmac_drbg_reseed_core( ctx, custom, len, + 1 /* add nonce */ ) ) != 0 ) + { return( ret ); - - ctx->entropy_len = entropy_len; + } return( 0 ); } diff --git a/thirdparty/mbedtls/library/net_sockets.c b/thirdparty/mbedtls/library/net_sockets.c index 816b1303df..5d538bfd56 100644 --- a/thirdparty/mbedtls/library/net_sockets.c +++ b/thirdparty/mbedtls/library/net_sockets.c @@ -284,7 +284,7 @@ static int net_would_block( const mbedtls_net_context *ctx ) int err = errno; /* - * Never return 'WOULD BLOCK' on a non-blocking socket + * Never return 'WOULD BLOCK' on a blocking socket */ if( ( fcntl( ctx->fd, F_GETFL ) & O_NONBLOCK ) != O_NONBLOCK ) { diff --git a/thirdparty/mbedtls/library/pkwrite.c b/thirdparty/mbedtls/library/pkwrite.c index 8d1da2f757..03d14f2ff9 100644 --- a/thirdparty/mbedtls/library/pkwrite.c +++ b/thirdparty/mbedtls/library/pkwrite.c @@ -38,7 +38,9 @@ #include "mbedtls/rsa.h" #endif #if defined(MBEDTLS_ECP_C) +#include "mbedtls/bignum.h" #include "mbedtls/ecp.h" +#include "mbedtls/platform_util.h" #endif #if defined(MBEDTLS_ECDSA_C) #include "mbedtls/ecdsa.h" @@ -150,6 +152,26 @@ static int pk_write_ec_param( unsigned char **p, unsigned char *start, return( (int) len ); } + +/* + * privateKey OCTET STRING -- always of length ceil(log2(n)/8) + */ +static int pk_write_ec_private( unsigned char **p, unsigned char *start, + mbedtls_ecp_keypair *ec ) +{ + int ret; + size_t byte_length = ( ec->grp.pbits + 7 ) / 8; + unsigned char tmp[MBEDTLS_ECP_MAX_BYTES]; + + ret = mbedtls_mpi_write_binary( &ec->d, tmp, byte_length ); + if( ret != 0 ) + goto exit; + ret = mbedtls_asn1_write_octet_string( p, start, tmp, byte_length ); + +exit: + mbedtls_platform_zeroize( tmp, byte_length ); + return( ret ); +} #endif /* MBEDTLS_ECP_C */ int mbedtls_pk_write_pubkey( unsigned char **p, unsigned char *start, @@ -364,9 +386,8 @@ int mbedtls_pk_write_key_der( mbedtls_pk_context *key, unsigned char *buf, size_ MBEDTLS_ASN1_CONTEXT_SPECIFIC | MBEDTLS_ASN1_CONSTRUCTED | 0 ) ); len += par_len; - /* privateKey: write as MPI then fix tag */ - MBEDTLS_ASN1_CHK_ADD( len, mbedtls_asn1_write_mpi( &c, buf, &ec->d ) ); - *c = MBEDTLS_ASN1_OCTET_STRING; + /* privateKey */ + MBEDTLS_ASN1_CHK_ADD( len, pk_write_ec_private( &c, buf, ec ) ); /* version */ MBEDTLS_ASN1_CHK_ADD( len, mbedtls_asn1_write_int( &c, buf, 1 ) ); diff --git a/thirdparty/mbedtls/library/platform_util.c b/thirdparty/mbedtls/library/platform_util.c index 756e22679a..b1f745097c 100644 --- a/thirdparty/mbedtls/library/platform_util.c +++ b/thirdparty/mbedtls/library/platform_util.c @@ -72,7 +72,10 @@ static void * (* const volatile memset_func)( void *, int, size_t ) = memset; void mbedtls_platform_zeroize( void *buf, size_t len ) { - memset_func( buf, 0, len ); + MBEDTLS_INTERNAL_VALIDATE( len == 0 || buf != NULL ); + + if( len > 0 ) + memset_func( buf, 0, len ); } #endif /* MBEDTLS_PLATFORM_ZEROIZE_ALT */ diff --git a/thirdparty/mbedtls/library/ssl_srv.c b/thirdparty/mbedtls/library/ssl_srv.c index bc77f80203..5825970c43 100644 --- a/thirdparty/mbedtls/library/ssl_srv.c +++ b/thirdparty/mbedtls/library/ssl_srv.c @@ -1449,7 +1449,7 @@ read_record_header: */ /* - * Minimal length (with everything empty and extensions ommitted) is + * Minimal length (with everything empty and extensions omitted) is * 2 + 32 + 1 + 2 + 1 = 38 bytes. Check that first, so that we can * read at least up to session id length without worrying. */ diff --git a/thirdparty/mbedtls/library/ssl_tls.c b/thirdparty/mbedtls/library/ssl_tls.c index 38690fa664..b8f35fec5d 100644 --- a/thirdparty/mbedtls/library/ssl_tls.c +++ b/thirdparty/mbedtls/library/ssl_tls.c @@ -2606,7 +2606,7 @@ int mbedtls_ssl_fetch_input( mbedtls_ssl_context *ssl, size_t nb_want ) } /* - * A record can't be split accross datagrams. If we need to read but + * A record can't be split across datagrams. If we need to read but * are not at the beginning of a new record, the caller did something * wrong. */ @@ -9043,8 +9043,12 @@ static int ssl_preset_suiteb_hashes[] = { #if defined(MBEDTLS_ECP_C) static mbedtls_ecp_group_id ssl_preset_suiteb_curves[] = { +#if defined(MBEDTLS_ECP_DP_SECP256R1_ENABLED) MBEDTLS_ECP_DP_SECP256R1, +#endif +#if defined(MBEDTLS_ECP_DP_SECP384R1_ENABLED) MBEDTLS_ECP_DP_SECP384R1, +#endif MBEDTLS_ECP_DP_NONE }; #endif diff --git a/thirdparty/mbedtls/library/timing.c b/thirdparty/mbedtls/library/timing.c index 413d133fb6..009516a6e3 100644 --- a/thirdparty/mbedtls/library/timing.c +++ b/thirdparty/mbedtls/library/timing.c @@ -51,7 +51,6 @@ #if defined(_WIN32) && !defined(EFIX64) && !defined(EFI32) #include <windows.h> -#include <winbase.h> #include <process.h> struct _hr_time diff --git a/thirdparty/mbedtls/library/version_features.c b/thirdparty/mbedtls/library/version_features.c index 24143d052c..a99ee808d6 100644 --- a/thirdparty/mbedtls/library/version_features.c +++ b/thirdparty/mbedtls/library/version_features.c @@ -87,6 +87,9 @@ static const char *features[] = { #if defined(MBEDTLS_CHECK_PARAMS) "MBEDTLS_CHECK_PARAMS", #endif /* MBEDTLS_CHECK_PARAMS */ +#if defined(MBEDTLS_CHECK_PARAMS_ASSERT) + "MBEDTLS_CHECK_PARAMS_ASSERT", +#endif /* MBEDTLS_CHECK_PARAMS_ASSERT */ #if defined(MBEDTLS_TIMING_ALT) "MBEDTLS_TIMING_ALT", #endif /* MBEDTLS_TIMING_ALT */ diff --git a/thirdparty/mbedtls/library/x509.c b/thirdparty/mbedtls/library/x509.c index a562df7ca3..2e0b0e8f6c 100644 --- a/thirdparty/mbedtls/library/x509.c +++ b/thirdparty/mbedtls/library/x509.c @@ -123,7 +123,7 @@ int mbedtls_x509_get_alg_null( unsigned char **p, const unsigned char *end, } /* - * Parse an algorithm identifier with (optional) paramaters + * Parse an algorithm identifier with (optional) parameters */ int mbedtls_x509_get_alg( unsigned char **p, const unsigned char *end, mbedtls_x509_buf *alg, mbedtls_x509_buf *params ) diff --git a/thirdparty/mbedtls/library/x509_crt.c b/thirdparty/mbedtls/library/x509_crt.c index 97e1d72e3c..a3697f13f9 100644 --- a/thirdparty/mbedtls/library/x509_crt.c +++ b/thirdparty/mbedtls/library/x509_crt.c @@ -65,6 +65,19 @@ #if defined(_WIN32) && !defined(EFIX64) && !defined(EFI32) #include <windows.h> +#if defined(_MSC_VER) && _MSC_VER <= 1600 +/* Visual Studio 2010 and earlier issue a warning when both <stdint.h> and + * <intsafe.h> are included, as they redefine a number of <TYPE>_MAX constants. + * These constants are guaranteed to be the same, though, so we suppress the + * warning when including intsafe.h. + */ +#pragma warning( push ) +#pragma warning( disable : 4005 ) +#endif +#include <intsafe.h> +#if defined(_MSC_VER) && _MSC_VER <= 1600 +#pragma warning( pop ) +#endif #else #include <time.h> #endif @@ -1277,6 +1290,7 @@ int mbedtls_x509_crt_parse_path( mbedtls_x509_crt *chain, const char *path ) char filename[MAX_PATH]; char *p; size_t len = strlen( path ); + int lengthAsInt = 0; WIN32_FIND_DATAW file_data; HANDLE hFind; @@ -1291,7 +1305,18 @@ int mbedtls_x509_crt_parse_path( mbedtls_x509_crt *chain, const char *path ) p = filename + len; filename[len++] = '*'; - w_ret = MultiByteToWideChar( CP_ACP, 0, filename, (int)len, szDir, + if ( FAILED ( SizeTToInt( len, &lengthAsInt ) ) ) + return( MBEDTLS_ERR_X509_FILE_IO_ERROR ); + + /* + * Note this function uses the code page CP_ACP, and assumes the incoming + * string is encoded in ANSI, before translating it into Unicode. If the + * incoming string were changed to be UTF-8, then the length check needs to + * change to check the number of characters, not the number of bytes, in the + * incoming string are less than MAX_PATH to avoid a buffer overrun with + * MultiByteToWideChar(). + */ + w_ret = MultiByteToWideChar( CP_ACP, 0, filename, lengthAsInt, szDir, MAX_PATH - 3 ); if( w_ret == 0 ) return( MBEDTLS_ERR_X509_BAD_INPUT_DATA ); @@ -1308,8 +1333,11 @@ int mbedtls_x509_crt_parse_path( mbedtls_x509_crt *chain, const char *path ) if( file_data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY ) continue; + if ( FAILED( SizeTToInt( wcslen( file_data.cFileName ), &lengthAsInt ) ) ) + return( MBEDTLS_ERR_X509_FILE_IO_ERROR ); + w_ret = WideCharToMultiByte( CP_ACP, 0, file_data.cFileName, - lstrlenW( file_data.cFileName ), + lengthAsInt, p, (int) len - 1, NULL, NULL ); if( w_ret == 0 ) @@ -2087,15 +2115,13 @@ check_signature: continue; } - break; - } - - if( parent != NULL ) - { *r_parent = parent; *r_signature_is_good = signature_is_good; + + break; } - else + + if( parent == NULL ) { *r_parent = fallback_parent; *r_signature_is_good = fallback_signature_is_good; @@ -2236,7 +2262,7 @@ static int x509_crt_check_ee_locally_trusted( * Tests for (aspects of) this function should include at least: * - trusted EE * - EE -> trusted root - * - EE -> intermedate CA -> trusted root + * - EE -> intermediate CA -> trusted root * - if relevant: EE untrusted * - if relevant: EE -> intermediate, untrusted * with the aspect under test checked at each relevant level (EE, int, root). diff --git a/thirdparty/mbedtls/library/x509write_crt.c b/thirdparty/mbedtls/library/x509write_crt.c index 10497e752b..61d7ba44a0 100644 --- a/thirdparty/mbedtls/library/x509write_crt.c +++ b/thirdparty/mbedtls/library/x509write_crt.c @@ -45,6 +45,16 @@ #include "mbedtls/pem.h" #endif /* MBEDTLS_PEM_WRITE_C */ +/* + * For the currently used signature algorithms the buffer to store any signature + * must be at least of size MAX(MBEDTLS_ECDSA_MAX_LEN, MBEDTLS_MPI_MAX_SIZE) + */ +#if MBEDTLS_ECDSA_MAX_LEN > MBEDTLS_MPI_MAX_SIZE +#define SIGNATURE_MAX_SIZE MBEDTLS_ECDSA_MAX_LEN +#else +#define SIGNATURE_MAX_SIZE MBEDTLS_MPI_MAX_SIZE +#endif + void mbedtls_x509write_crt_init( mbedtls_x509write_cert *ctx ) { memset( ctx, 0, sizeof( mbedtls_x509write_cert ) ); @@ -334,7 +344,7 @@ int mbedtls_x509write_crt_der( mbedtls_x509write_cert *ctx, unsigned char *buf, size_t sig_oid_len = 0; unsigned char *c, *c2; unsigned char hash[64]; - unsigned char sig[MBEDTLS_MPI_MAX_SIZE]; + unsigned char sig[SIGNATURE_MAX_SIZE]; unsigned char tmp_buf[2048]; size_t sub_len = 0, pub_len = 0, sig_and_oid_len = 0, sig_len; size_t len = 0; diff --git a/thirdparty/mbedtls/library/x509write_csr.c b/thirdparty/mbedtls/library/x509write_csr.c index d70ba0ed92..b65a11c6aa 100644 --- a/thirdparty/mbedtls/library/x509write_csr.c +++ b/thirdparty/mbedtls/library/x509write_csr.c @@ -44,6 +44,16 @@ #include "mbedtls/pem.h" #endif +/* + * For the currently used signature algorithms the buffer to store any signature + * must be at least of size MAX(MBEDTLS_ECDSA_MAX_LEN, MBEDTLS_MPI_MAX_SIZE) + */ +#if MBEDTLS_ECDSA_MAX_LEN > MBEDTLS_MPI_MAX_SIZE +#define SIGNATURE_MAX_SIZE MBEDTLS_ECDSA_MAX_LEN +#else +#define SIGNATURE_MAX_SIZE MBEDTLS_MPI_MAX_SIZE +#endif + void mbedtls_x509write_csr_init( mbedtls_x509write_csr *ctx ) { memset( ctx, 0, sizeof( mbedtls_x509write_csr ) ); @@ -159,7 +169,7 @@ int mbedtls_x509write_csr_der( mbedtls_x509write_csr *ctx, unsigned char *buf, s size_t sig_oid_len = 0; unsigned char *c, *c2; unsigned char hash[64]; - unsigned char sig[MBEDTLS_MPI_MAX_SIZE]; + unsigned char sig[SIGNATURE_MAX_SIZE]; unsigned char tmp_buf[2048]; size_t pub_len = 0, sig_and_oid_len = 0, sig_len; size_t len = 0; diff --git a/thirdparty/mbedtls/1453.diff b/thirdparty/mbedtls/patches/1453.diff index b1c9c43ed2..b1c9c43ed2 100644 --- a/thirdparty/mbedtls/1453.diff +++ b/thirdparty/mbedtls/patches/1453.diff diff --git a/thirdparty/mbedtls/padlock.diff b/thirdparty/mbedtls/patches/padlock.diff index 6ace48891c..6ace48891c 100644 --- a/thirdparty/mbedtls/padlock.diff +++ b/thirdparty/mbedtls/patches/padlock.diff diff --git a/thirdparty/miniupnpc/LICENSE b/thirdparty/miniupnpc/LICENSE index 39e0345f8a..1460310752 100644 --- a/thirdparty/miniupnpc/LICENSE +++ b/thirdparty/miniupnpc/LICENSE @@ -24,4 +24,3 @@ INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - diff --git a/thirdparty/miniupnpc/miniupnpc/connecthostport.c b/thirdparty/miniupnpc/miniupnpc/connecthostport.c index a59dc82437..f3982e1a77 100644 --- a/thirdparty/miniupnpc/miniupnpc/connecthostport.c +++ b/thirdparty/miniupnpc/miniupnpc/connecthostport.c @@ -1,4 +1,4 @@ -/* $Id: connecthostport.c,v 1.21 2019/04/23 12:11:08 nanard Exp $ */ +/* $Id: connecthostport.c,v 1.22 2019/10/13 17:22:08 nanard Exp $ */ /* vim: tabstop=4 shiftwidth=4 noexpandtab * Project : miniupnp * Author : Thomas Bernard @@ -195,6 +195,10 @@ SOCKET connecthostport(const char * host, unsigned short port, { if(!ISINVALID(s)) closesocket(s); +#ifdef DEBUG + printf("ai_family=%d ai_socktype=%d ai_protocol=%d (PF_INET=%d, PF_INET6=%d)\n", + p->ai_family, p->ai_socktype, p->ai_protocol, PF_INET, PF_INET6); +#endif s = socket(p->ai_family, p->ai_socktype, p->ai_protocol); if(ISINVALID(s)) continue; diff --git a/thirdparty/miniupnpc/miniupnpc/miniupnpc.c b/thirdparty/miniupnpc/miniupnpc/miniupnpc.c index 3181d10eb6..95ab6cf56b 100644 --- a/thirdparty/miniupnpc/miniupnpc/miniupnpc.c +++ b/thirdparty/miniupnpc/miniupnpc/miniupnpc.c @@ -564,6 +564,7 @@ UPNP_GetValidIGD(struct UPNPDev * devlist, char * lanaddr, int lanaddrlen) { struct xml_desc { + char lanaddr[40]; char * xml; int size; int is_igd; @@ -573,7 +574,6 @@ UPNP_GetValidIGD(struct UPNPDev * devlist, int i; int state = -1; /* state 1 : IGD connected. State 2 : IGD. State 3 : anything */ char extIpAddr[16]; - char myLanAddr[40]; int status_code = -1; if(!devlist) @@ -596,7 +596,7 @@ UPNP_GetValidIGD(struct UPNPDev * devlist, /* we should choose an internet gateway device. * with st == urn:schemas-upnp-org:device:InternetGatewayDevice:1 */ desc[i].xml = miniwget_getaddr(dev->descURL, &(desc[i].size), - myLanAddr, sizeof(myLanAddr), + desc[i].lanaddr, sizeof(desc[i].lanaddr), dev->scope_id, &status_code); #ifdef DEBUG if(!desc[i].xml) @@ -613,8 +613,6 @@ UPNP_GetValidIGD(struct UPNPDev * devlist, "urn:schemas-upnp-org:service:WANCommonInterfaceConfig:")) { desc[i].is_igd = 1; - if(lanaddr) - strncpy(lanaddr, myLanAddr, lanaddrlen); } } } @@ -680,6 +678,8 @@ UPNP_GetValidIGD(struct UPNPDev * devlist, } state = 0; free_and_return: + if (lanaddr != NULL && state >= 1 && state <= 3 && i < ndev) + strncpy(lanaddr, desc[i].lanaddr, lanaddrlen); for(i = 0; i < ndev; i++) free(desc[i].xml); free(desc); @@ -713,4 +713,3 @@ UPNP_GetIGDFromUrl(const char * rootdescurl, return 0; } } - diff --git a/thirdparty/miniupnpc/miniupnpc/upnpc.c b/thirdparty/miniupnpc/miniupnpc/upnpc.c index 674c89beb0..4325658bee 100644 --- a/thirdparty/miniupnpc/miniupnpc/upnpc.c +++ b/thirdparty/miniupnpc/miniupnpc/upnpc.c @@ -250,6 +250,7 @@ static int SetRedirectAndTest(struct UPNPUrls * urls, const char * eport, const char * proto, const char * leaseDuration, + const char * remoteHost, const char * description, int addAny) { @@ -283,7 +284,7 @@ static int SetRedirectAndTest(struct UPNPUrls * urls, if (addAny) { r = UPNP_AddAnyPortMapping(urls->controlURL, data->first.servicetype, eport, iport, iaddr, description, - proto, 0, leaseDuration, reservedPort); + proto, remoteHost, leaseDuration, reservedPort); if(r==UPNPCOMMAND_SUCCESS) eport = reservedPort; else @@ -292,7 +293,7 @@ static int SetRedirectAndTest(struct UPNPUrls * urls, } else { r = UPNP_AddPortMapping(urls->controlURL, data->first.servicetype, eport, iport, iaddr, description, - proto, 0, leaseDuration); + proto, remoteHost, leaseDuration); if(r!=UPNPCOMMAND_SUCCESS) { printf("AddPortMapping(%s, %s, %s) failed with code %d (%s)\n", eport, iport, iaddr, r, strupnperror(r)); @@ -302,7 +303,7 @@ static int SetRedirectAndTest(struct UPNPUrls * urls, r = UPNP_GetSpecificPortMappingEntry(urls->controlURL, data->first.servicetype, - eport, proto, NULL/*remoteHost*/, + eport, proto, remoteHost, intClient, intPort, NULL/*desc*/, NULL/*enabled*/, duration); if(r!=UPNPCOMMAND_SUCCESS) { @@ -642,12 +643,12 @@ int main(int argc, char ** argv) || (command == 'U' && commandargc<2) || (command == 'D' && commandargc<1)) { - fprintf(stderr, "Usage :\t%s [options] -a ip port external_port protocol [duration]\n\t\tAdd port redirection\n", argv[0]); - fprintf(stderr, " \t%s [options] -d external_port protocol <remote host>\n\t\tDelete port redirection\n", argv[0]); + fprintf(stderr, "Usage :\t%s [options] -a ip port external_port protocol [duration] [remote host]\n\t\tAdd port redirection\n", argv[0]); + fprintf(stderr, " \t%s [options] -d external_port protocol [remote host]\n\t\tDelete port redirection\n", argv[0]); fprintf(stderr, " \t%s [options] -s\n\t\tGet Connection status\n", argv[0]); fprintf(stderr, " \t%s [options] -l\n\t\tList redirections\n", argv[0]); fprintf(stderr, " \t%s [options] -L\n\t\tList redirections (using GetListOfPortMappings (for IGD:2 only)\n", argv[0]); - fprintf(stderr, " \t%s [options] -n ip port external_port protocol [duration]\n\t\tAdd (any) port redirection allowing IGD to use alternative external_port (for IGD:2 only)\n", argv[0]); + fprintf(stderr, " \t%s [options] -n ip port external_port protocol [duration] [remote host]\n\t\tAdd (any) port redirection allowing IGD to use alternative external_port (for IGD:2 only)\n", argv[0]); fprintf(stderr, " \t%s [options] -N external_port_start external_port_end protocol [manage]\n\t\tDelete range of port redirections (for IGD:2 only)\n", argv[0]); fprintf(stderr, " \t%s [options] -r port1 [external_port1] protocol1 [port2 [external_port2] protocol2] [...]\n\t\tAdd all redirections to the current host\n", argv[0]); fprintf(stderr, " \t%s [options] -A remote_ip remote_port internal_ip internal_port protocol lease_time\n\t\tAdd Pinhole (for IGD:2 only)\n", argv[0]); @@ -734,7 +735,8 @@ int main(int argc, char ** argv) if (SetRedirectAndTest(&urls, &data, commandargv[0], commandargv[1], commandargv[2], commandargv[3], - (commandargc > 4)?commandargv[4]:"0", + (commandargc > 4)&is_int(commandargv[4])?commandargv[4]:"0", + (commandargc > 4)&!is_int(commandargv[4])?commandargv[4]:(commandargc > 5)?commandargv[5]:NULL, description, 0) < 0) retcode = 2; break; @@ -747,7 +749,8 @@ int main(int argc, char ** argv) if (SetRedirectAndTest(&urls, &data, commandargv[0], commandargv[1], commandargv[2], commandargv[3], - (commandargc > 4)?commandargv[4]:"0", + (commandargc > 4)&is_int(commandargv[4])?commandargv[4]:"0", + (commandargc > 4)&!is_int(commandargv[4])?commandargv[4]:(commandargc > 5)?commandargv[5]:NULL, description, 1) < 0) retcode = 2; break; @@ -775,7 +778,7 @@ int main(int argc, char ** argv) /* 2nd parameter is an integer : <port> <external_port> <protocol> */ if (SetRedirectAndTest(&urls, &data, lanaddr, commandargv[i], - commandargv[i+1], commandargv[i+2], "0", + commandargv[i+1], commandargv[i+2], "0", NULL, description, 0) < 0) retcode = 2; i+=3; /* 3 parameters parsed */ @@ -783,7 +786,7 @@ int main(int argc, char ** argv) /* 2nd parameter not an integer : <port> <protocol> */ if (SetRedirectAndTest(&urls, &data, lanaddr, commandargv[i], - commandargv[i], commandargv[i+1], "0", + commandargv[i], commandargv[i+1], "0", NULL, description, 0) < 0) retcode = 2; i+=2; /* 2 parameters parsed */ diff --git a/thirdparty/miniupnpc/miniupnpc/upnperrors.c b/thirdparty/miniupnpc/miniupnpc/upnperrors.c index 650af42557..4496e8622c 100644 --- a/thirdparty/miniupnpc/miniupnpc/upnperrors.c +++ b/thirdparty/miniupnpc/miniupnpc/upnperrors.c @@ -1,9 +1,10 @@ -/* $Id: upnperrors.c,v 1.9 2019/06/25 21:15:46 nanard Exp $ */ -/* Project : miniupnp +/* $Id: upnperrors.c,v 1.10 2019/08/24 08:49:53 nanard Exp $ */ +/* vim: tabstop=4 shiftwidth=4 noexpandtab + * Project : miniupnp * Author : Thomas BERNARD * copyright (c) 2007-2019 Thomas Bernard * All Right reserved. - * http://miniupnp.free.fr/ or http://miniupnp.tuxfamily.org/ + * http://miniupnp.free.fr/ or https://miniupnp.tuxfamily.org/ * This software is subjet to the conditions detailed in the * provided LICENCE file. */ #include <string.h> @@ -71,7 +72,7 @@ const char * strupnperror(int err) s = "ProtocolWildcardingNotAllowed"; break; case 708: - s = "WildcardNotPermittedInSrcIP"; + s = "InvalidLayer2Address"; break; case 709: s = "NoPacketSent"; diff --git a/thirdparty/misc/stb_vorbis.c b/thirdparty/misc/stb_vorbis.c index 71af404dae..4ab8880d5d 100644 --- a/thirdparty/misc/stb_vorbis.c +++ b/thirdparty/misc/stb_vorbis.c @@ -1,4 +1,4 @@ -// Ogg Vorbis audio decoder - v1.16 - public domain +// Ogg Vorbis audio decoder - v1.17 - public domain // http://nothings.org/stb_vorbis/ // // Original version written by Sean Barrett in 2007. @@ -30,9 +30,10 @@ // Tom Beaumont Ingo Leitgeb Nicolas Guillemot // Phillip Bennefall Rohit Thiago Goulart // manxorist@github saga musix github:infatum -// Timur Gagiev +// Timur Gagiev Maxwell Koo // // Partial history: +// 1.17 - 2019-07-08 - fix CVE-2019-13217..CVE-2019-13223 (by ForAllSecure) // 1.16 - 2019-03-04 - fix warnings // 1.15 - 2019-02-07 - explicit failure if Ogg Skeleton data is found // 1.14 - 2018-02-11 - delete bogus dealloca usage @@ -1202,8 +1203,10 @@ static int lookup1_values(int entries, int dim) int r = (int) floor(exp((float) log((float) entries) / dim)); if ((int) floor(pow((float) r+1, dim)) <= entries) // (int) cast for MinGW warning; ++r; // floor() to avoid _ftol() when non-CRT - assert(pow((float) r+1, dim) > entries); - assert((int) floor(pow((float) r, dim)) <= entries); // (int),floor() as above + if (pow((float) r+1, dim) <= entries) + return -1; + if ((int) floor(pow((float) r, dim)) > entries) + return -1; return r; } @@ -2013,7 +2016,7 @@ static __forceinline void draw_line(float *output, int x0, int y0, int x1, int y ady -= abs(base) * adx; if (x1 > n) x1 = n; if (x < x1) { - LINE_OP(output[x], inverse_db_table[y]); + LINE_OP(output[x], inverse_db_table[y&255]); for (++x; x < x1; ++x) { err += ady; if (err >= adx) { @@ -2021,7 +2024,7 @@ static __forceinline void draw_line(float *output, int x0, int y0, int x1, int y y += sy; } else y += base; - LINE_OP(output[x], inverse_db_table[y]); + LINE_OP(output[x], inverse_db_table[y&255]); } } } @@ -3048,7 +3051,6 @@ static float *get_window(vorb *f, int len) len <<= 1; if (len == f->blocksize_0) return f->window[0]; if (len == f->blocksize_1) return f->window[1]; - assert(0); return NULL; } @@ -3454,6 +3456,7 @@ static int vorbis_finish_frame(stb_vorbis *f, int len, int left, int right) if (f->previous_length) { int i,j, n = f->previous_length; float *w = get_window(f, n); + if (w == NULL) return 0; for (i=0; i < f->channels; ++i) { for (j=0; j < n; ++j) f->channel_buffers[i][left+j] = @@ -3695,6 +3698,7 @@ static int start_decoder(vorb *f) while (current_entry < c->entries) { int limit = c->entries - current_entry; int n = get_bits(f, ilog(limit)); + if (current_length >= 32) return error(f, VORBIS_invalid_setup); if (current_entry + n > (int) c->entries) { return error(f, VORBIS_invalid_setup); } memset(lengths + current_entry, current_length, n); current_entry += n; @@ -3798,7 +3802,9 @@ static int start_decoder(vorb *f) c->value_bits = get_bits(f, 4)+1; c->sequence_p = get_bits(f,1); if (c->lookup_type == 1) { - c->lookup_values = lookup1_values(c->entries, c->dimensions); + int values = lookup1_values(c->entries, c->dimensions); + if (values < 0) return error(f, VORBIS_invalid_setup); + c->lookup_values = (uint32) values; } else { c->lookup_values = c->entries * c->dimensions; } @@ -3934,6 +3940,9 @@ static int start_decoder(vorb *f) p[j].id = j; } qsort(p, g->values, sizeof(p[0]), point_compare); + for (j=0; j < g->values-1; ++j) + if (p[j].x == p[j+1].x) + return error(f, VORBIS_invalid_setup); for (j=0; j < g->values; ++j) g->sorted_order[j] = (uint8) p[j].id; // precompute the neighbors @@ -4020,6 +4029,7 @@ static int start_decoder(vorb *f) max_submaps = m->submaps; if (get_bits(f,1)) { m->coupling_steps = get_bits(f,8)+1; + if (m->coupling_steps > f->channels) return error(f, VORBIS_invalid_setup); for (k=0; k < m->coupling_steps; ++k) { m->chan[k].magnitude = get_bits(f, ilog(f->channels-1)); m->chan[k].angle = get_bits(f, ilog(f->channels-1)); @@ -5386,6 +5396,12 @@ int stb_vorbis_get_samples_float(stb_vorbis *f, int channels, float **buffer, in #endif // STB_VORBIS_NO_PULLDATA_API /* Version history + 1.17 - 2019-07-08 - fix CVE-2019-13217, -13218, -13219, -13220, -13221, -13222, -13223 + found with Mayhem by ForAllSecure + 1.16 - 2019-03-04 - fix warnings + 1.15 - 2019-02-07 - explicit failure if Ogg Skeleton data is found + 1.14 - 2018-02-11 - delete bogus dealloca usage + 1.13 - 2018-01-29 - fix truncation of last frame (hopefully) 1.12 - 2017-11-21 - limit residue begin/end to blocksize/2 to avoid large temp allocs in bad/corrupt files 1.11 - 2017-07-23 - fix MinGW compilation 1.10 - 2017-03-03 - more robust seeking; fix negative ilog(); clear error in open_memory diff --git a/thirdparty/nanosvg/nanosvg.h b/thirdparty/nanosvg/nanosvg.h index 8c8b061cd1..e5f6900614 100644 --- a/thirdparty/nanosvg/nanosvg.h +++ b/thirdparty/nanosvg/nanosvg.h @@ -1102,7 +1102,7 @@ static double nsvg__atof(const char* s) // Parse integer part if (nsvg__isdigit(*cur)) { // Parse digit sequence - intPart = (double)strtoll(cur, &end, 10); + intPart = strtoll(cur, &end, 10); if (cur != end) { res = (double)intPart; hasIntPart = 1; @@ -1130,7 +1130,7 @@ static double nsvg__atof(const char* s) // Parse optional exponent if (*cur == 'e' || *cur == 'E') { - int expPart = 0; + long expPart = 0; cur++; // skip 'E' expPart = strtol(cur, &end, 10); // Parse digit sequence with sign if (cur != end) { @@ -1168,7 +1168,7 @@ static const char* nsvg__parseNumber(const char* s, char* it, const int size) } } // exponent - if (*s == 'e' || *s == 'E') { + if ((*s == 'e' || *s == 'E') && (s[1] != 'm' && s[1] != 'x')) { if (i < last) it[i++] = *s; s++; if (*s == '-' || *s == '+') { diff --git a/thirdparty/opus/analysis.c b/thirdparty/opus/analysis.c index 663431a436..cb46dec582 100644 --- a/thirdparty/opus/analysis.c +++ b/thirdparty/opus/analysis.c @@ -29,20 +29,29 @@ #include "config.h" #endif +#define ANALYSIS_C + +#include <stdio.h> + +#include "mathops.h" #include "kiss_fft.h" #include "celt.h" #include "modes.h" #include "arch.h" #include "quant_bands.h" -#include <stdio.h> #include "analysis.h" #include "mlp.h" #include "stack_alloc.h" +#include "float_cast.h" #ifndef M_PI #define M_PI 3.141592653 #endif +#ifndef DISABLE_FLOAT_API + +#define TRANSITION_PENALTY 10 + static const float dct_table[128] = { 0.250000f, 0.250000f, 0.250000f, 0.250000f, 0.250000f, 0.250000f, 0.250000f, 0.250000f, 0.250000f, 0.250000f, 0.250000f, 0.250000f, 0.250000f, 0.250000f, 0.250000f, 0.250000f, @@ -96,52 +105,118 @@ static const float analysis_window[240] = { }; static const int tbands[NB_TBANDS+1] = { - 2, 4, 6, 8, 10, 12, 14, 16, 20, 24, 28, 32, 40, 48, 56, 68, 80, 96, 120 -}; - -static const int extra_bands[NB_TOT_BANDS+1] = { - 1, 2, 4, 6, 8, 10, 12, 14, 16, 20, 24, 28, 32, 40, 48, 56, 68, 80, 96, 120, 160, 200 + 4, 8, 12, 16, 20, 24, 28, 32, 40, 48, 56, 64, 80, 96, 112, 136, 160, 192, 240 }; -/*static const float tweight[NB_TBANDS+1] = { - .3, .4, .5, .6, .7, .8, .9, 1., 1., 1., 1., 1., 1., 1., .8, .7, .6, .5 -};*/ - #define NB_TONAL_SKIP_BANDS 9 -#define cA 0.43157974f -#define cB 0.67848403f -#define cC 0.08595542f -#define cE ((float)M_PI/2) -static OPUS_INLINE float fast_atan2f(float y, float x) { - float x2, y2; - /* Should avoid underflow on the values we'll get */ - if (ABS16(x)+ABS16(y)<1e-9f) +static opus_val32 silk_resampler_down2_hp( + opus_val32 *S, /* I/O State vector [ 2 ] */ + opus_val32 *out, /* O Output signal [ floor(len/2) ] */ + const opus_val32 *in, /* I Input signal [ len ] */ + int inLen /* I Number of input samples */ +) +{ + int k, len2 = inLen/2; + opus_val32 in32, out32, out32_hp, Y, X; + opus_val64 hp_ener = 0; + /* Internal variables and state are in Q10 format */ + for( k = 0; k < len2; k++ ) { + /* Convert to Q10 */ + in32 = in[ 2 * k ]; + + /* All-pass section for even input sample */ + Y = SUB32( in32, S[ 0 ] ); + X = MULT16_32_Q15(QCONST16(0.6074371f, 15), Y); + out32 = ADD32( S[ 0 ], X ); + S[ 0 ] = ADD32( in32, X ); + out32_hp = out32; + /* Convert to Q10 */ + in32 = in[ 2 * k + 1 ]; + + /* All-pass section for odd input sample, and add to output of previous section */ + Y = SUB32( in32, S[ 1 ] ); + X = MULT16_32_Q15(QCONST16(0.15063f, 15), Y); + out32 = ADD32( out32, S[ 1 ] ); + out32 = ADD32( out32, X ); + S[ 1 ] = ADD32( in32, X ); + + Y = SUB32( -in32, S[ 2 ] ); + X = MULT16_32_Q15(QCONST16(0.15063f, 15), Y); + out32_hp = ADD32( out32_hp, S[ 2 ] ); + out32_hp = ADD32( out32_hp, X ); + S[ 2 ] = ADD32( -in32, X ); + + hp_ener += out32_hp*(opus_val64)out32_hp; + /* Add, convert back to int16 and store to output */ + out[ k ] = HALF32(out32); + } +#ifdef FIXED_POINT + /* len2 can be up to 480, so we shift by 8 more to make it fit. */ + hp_ener = hp_ener >> (2*SIG_SHIFT + 8); +#endif + return (opus_val32)hp_ener; +} + +static opus_val32 downmix_and_resample(downmix_func downmix, const void *_x, opus_val32 *y, opus_val32 S[3], int subframe, int offset, int c1, int c2, int C, int Fs) +{ + VARDECL(opus_val32, tmp); + opus_val32 scale; + int j; + opus_val32 ret = 0; + SAVE_STACK; + + if (subframe==0) return 0; + if (Fs == 48000) { - x*=1e12f; - y*=1e12f; + subframe *= 2; + offset *= 2; + } else if (Fs == 16000) { + subframe = subframe*2/3; + offset = offset*2/3; } - x2 = x*x; - y2 = y*y; - if(x2<y2){ - float den = (y2 + cB*x2) * (y2 + cC*x2); - if (den!=0) - return -x*y*(y2 + cA*x2) / den + (y<0 ? -cE : cE); - else - return (y<0 ? -cE : cE); - }else{ - float den = (x2 + cB*y2) * (x2 + cC*y2); - if (den!=0) - return x*y*(x2 + cA*y2) / den + (y<0 ? -cE : cE) - (x*y<0 ? -cE : cE); - else - return (y<0 ? -cE : cE) - (x*y<0 ? -cE : cE); + ALLOC(tmp, subframe, opus_val32); + + downmix(_x, tmp, subframe, offset, c1, c2, C); +#ifdef FIXED_POINT + scale = (1<<SIG_SHIFT); +#else + scale = 1.f/32768; +#endif + if (c2==-2) + scale /= C; + else if (c2>-1) + scale /= 2; + for (j=0;j<subframe;j++) + tmp[j] *= scale; + if (Fs == 48000) + { + ret = silk_resampler_down2_hp(S, y, tmp, subframe); + } else if (Fs == 24000) { + OPUS_COPY(y, tmp, subframe); + } else if (Fs == 16000) { + VARDECL(opus_val32, tmp3x); + ALLOC(tmp3x, 3*subframe, opus_val32); + /* Don't do this at home! This resampler is horrible and it's only (barely) + usable for the purpose of the analysis because we don't care about all + the aliasing between 8 kHz and 12 kHz. */ + for (j=0;j<subframe;j++) + { + tmp3x[3*j] = tmp[j]; + tmp3x[3*j+1] = tmp[j]; + tmp3x[3*j+2] = tmp[j]; + } + silk_resampler_down2_hp(S, y, tmp3x, 3*subframe); } + RESTORE_STACK; + return ret; } -void tonality_analysis_init(TonalityAnalysisState *tonal) +void tonality_analysis_init(TonalityAnalysisState *tonal, opus_int32 Fs) { /* Initialize reusable fields. */ tonal->arch = opus_select_arch(); + tonal->Fs = Fs; /* Clear remaining fields. */ tonality_analysis_reset(tonal); } @@ -157,15 +232,34 @@ void tonality_get_info(TonalityAnalysisState *tonal, AnalysisInfo *info_out, int { int pos; int curr_lookahead; - float psum; + float tonality_max; + float tonality_avg; + int tonality_count; int i; + int pos0; + float prob_avg; + float prob_count; + float prob_min, prob_max; + float vad_prob; + int mpos, vpos; + int bandwidth_span; pos = tonal->read_pos; curr_lookahead = tonal->write_pos-tonal->read_pos; if (curr_lookahead<0) curr_lookahead += DETECT_SIZE; - if (len > 480 && pos != tonal->write_pos) + tonal->read_subframe += len/(tonal->Fs/400); + while (tonal->read_subframe>=8) + { + tonal->read_subframe -= 8; + tonal->read_pos++; + } + if (tonal->read_pos>=DETECT_SIZE) + tonal->read_pos-=DETECT_SIZE; + + /* On long frames, look at the second analysis window rather than the first. */ + if (len > tonal->Fs/50 && pos != tonal->write_pos) { pos++; if (pos==DETECT_SIZE) @@ -175,33 +269,178 @@ void tonality_get_info(TonalityAnalysisState *tonal, AnalysisInfo *info_out, int pos--; if (pos<0) pos = DETECT_SIZE-1; + pos0 = pos; OPUS_COPY(info_out, &tonal->info[pos], 1); - tonal->read_subframe += len/120; - while (tonal->read_subframe>=4) + if (!info_out->valid) + return; + tonality_max = tonality_avg = info_out->tonality; + tonality_count = 1; + /* Look at the neighbouring frames and pick largest bandwidth found (to be safe). */ + bandwidth_span = 6; + /* If possible, look ahead for a tone to compensate for the delay in the tone detector. */ + for (i=0;i<3;i++) { - tonal->read_subframe -= 4; - tonal->read_pos++; + pos++; + if (pos==DETECT_SIZE) + pos = 0; + if (pos == tonal->write_pos) + break; + tonality_max = MAX32(tonality_max, tonal->info[pos].tonality); + tonality_avg += tonal->info[pos].tonality; + tonality_count++; + info_out->bandwidth = IMAX(info_out->bandwidth, tonal->info[pos].bandwidth); + bandwidth_span--; } - if (tonal->read_pos>=DETECT_SIZE) - tonal->read_pos-=DETECT_SIZE; + pos = pos0; + /* Look back in time to see if any has a wider bandwidth than the current frame. */ + for (i=0;i<bandwidth_span;i++) + { + pos--; + if (pos < 0) + pos = DETECT_SIZE-1; + if (pos == tonal->write_pos) + break; + info_out->bandwidth = IMAX(info_out->bandwidth, tonal->info[pos].bandwidth); + } + info_out->tonality = MAX32(tonality_avg/tonality_count, tonality_max-.2f); + + mpos = vpos = pos0; + /* If we have enough look-ahead, compensate for the ~5-frame delay in the music prob and + ~1 frame delay in the VAD prob. */ + if (curr_lookahead > 15) + { + mpos += 5; + if (mpos>=DETECT_SIZE) + mpos -= DETECT_SIZE; + vpos += 1; + if (vpos>=DETECT_SIZE) + vpos -= DETECT_SIZE; + } + + /* The following calculations attempt to minimize a "badness function" + for the transition. When switching from speech to music, the badness + of switching at frame k is + b_k = S*v_k + \sum_{i=0}^{k-1} v_i*(p_i - T) + where + v_i is the activity probability (VAD) at frame i, + p_i is the music probability at frame i + T is the probability threshold for switching + S is the penalty for switching during active audio rather than silence + the current frame has index i=0 + + Rather than apply badness to directly decide when to switch, what we compute + instead is the threshold for which the optimal switching point is now. When + considering whether to switch now (frame 0) or at frame k, we have: + S*v_0 = S*v_k + \sum_{i=0}^{k-1} v_i*(p_i - T) + which gives us: + T = ( \sum_{i=0}^{k-1} v_i*p_i + S*(v_k-v_0) ) / ( \sum_{i=0}^{k-1} v_i ) + We take the min threshold across all positive values of k (up to the maximum + amount of lookahead we have) to give us the threshold for which the current + frame is the optimal switch point. + + The last step is that we need to consider whether we want to switch at all. + For that we use the average of the music probability over the entire window. + If the threshold is higher than that average we're not going to + switch, so we compute a min with the average as well. The result of all these + min operations is music_prob_min, which gives the threshold for switching to music + if we're currently encoding for speech. + + We do the exact opposite to compute music_prob_max which is used for switching + from music to speech. + */ + prob_min = 1.f; + prob_max = 0.f; + vad_prob = tonal->info[vpos].activity_probability; + prob_count = MAX16(.1f, vad_prob); + prob_avg = MAX16(.1f, vad_prob)*tonal->info[mpos].music_prob; + while (1) + { + float pos_vad; + mpos++; + if (mpos==DETECT_SIZE) + mpos = 0; + if (mpos == tonal->write_pos) + break; + vpos++; + if (vpos==DETECT_SIZE) + vpos = 0; + if (vpos == tonal->write_pos) + break; + pos_vad = tonal->info[vpos].activity_probability; + prob_min = MIN16((prob_avg - TRANSITION_PENALTY*(vad_prob - pos_vad))/prob_count, prob_min); + prob_max = MAX16((prob_avg + TRANSITION_PENALTY*(vad_prob - pos_vad))/prob_count, prob_max); + prob_count += MAX16(.1f, pos_vad); + prob_avg += MAX16(.1f, pos_vad)*tonal->info[mpos].music_prob; + } + info_out->music_prob = prob_avg/prob_count; + prob_min = MIN16(prob_avg/prob_count, prob_min); + prob_max = MAX16(prob_avg/prob_count, prob_max); + prob_min = MAX16(prob_min, 0.f); + prob_max = MIN16(prob_max, 1.f); + + /* If we don't have enough look-ahead, do our best to make a decent decision. */ + if (curr_lookahead < 10) + { + float pmin, pmax; + pmin = prob_min; + pmax = prob_max; + pos = pos0; + /* Look for min/max in the past. */ + for (i=0;i<IMIN(tonal->count-1, 15);i++) + { + pos--; + if (pos < 0) + pos = DETECT_SIZE-1; + pmin = MIN16(pmin, tonal->info[pos].music_prob); + pmax = MAX16(pmax, tonal->info[pos].music_prob); + } + /* Bias against switching on active audio. */ + pmin = MAX16(0.f, pmin - .1f*vad_prob); + pmax = MIN16(1.f, pmax + .1f*vad_prob); + prob_min += (1.f-.1f*curr_lookahead)*(pmin - prob_min); + prob_max += (1.f-.1f*curr_lookahead)*(pmax - prob_max); + } + info_out->music_prob_min = prob_min; + info_out->music_prob_max = prob_max; - /* Compensate for the delay in the features themselves. - FIXME: Need a better estimate the 10 I just made up */ - curr_lookahead = IMAX(curr_lookahead-10, 0); - - psum=0; - /* Summing the probability of transition patterns that involve music at - time (DETECT_SIZE-curr_lookahead-1) */ - for (i=0;i<DETECT_SIZE-curr_lookahead;i++) - psum += tonal->pmusic[i]; - for (;i<DETECT_SIZE;i++) - psum += tonal->pspeech[i]; - psum = psum*tonal->music_confidence + (1-psum)*tonal->speech_confidence; - /*printf("%f %f %f\n", psum, info_out->music_prob, info_out->tonality);*/ - - info_out->music_prob = psum; + /* printf("%f %f %f %f %f\n", prob_min, prob_max, prob_avg/prob_count, vad_prob, info_out->music_prob); */ } +static const float std_feature_bias[9] = { + 5.684947f, 3.475288f, 1.770634f, 1.599784f, 3.773215f, + 2.163313f, 1.260756f, 1.116868f, 1.918795f +}; + +#define LEAKAGE_OFFSET 2.5f +#define LEAKAGE_SLOPE 2.f + +#ifdef FIXED_POINT +/* For fixed-point, the input is +/-2^15 shifted up by SIG_SHIFT, so we need to + compensate for that in the energy. */ +#define SCALE_COMPENS (1.f/((opus_int32)1<<(15+SIG_SHIFT))) +#define SCALE_ENER(e) ((SCALE_COMPENS*SCALE_COMPENS)*(e)) +#else +#define SCALE_ENER(e) (e) +#endif + +#ifdef FIXED_POINT +static int is_digital_silence32(const opus_val32* pcm, int frame_size, int channels, int lsb_depth) +{ + int silence = 0; + opus_val32 sample_max = 0; +#ifdef MLP_TRAINING + return 0; +#endif + sample_max = celt_maxabs32(pcm, frame_size*channels); + + silence = (sample_max == 0); + (void)lsb_depth; + return silence; +} +#else +#define is_digital_silence32(pcm, frame_size, channels, lsb_depth) is_digital_silence(pcm, frame_size, channels, lsb_depth) +#endif + static void tonality_analysis(TonalityAnalysisState *tonal, const CELTMode *celt_mode, const void *x, int len, int offset, int c1, int c2, int C, int lsb_depth, downmix_func downmix) { int i, b; @@ -230,24 +469,50 @@ static void tonality_analysis(TonalityAnalysisState *tonal, const CELTMode *celt float alpha, alphaE, alphaE2; float frame_loudness; float bandwidth_mask; + int is_masked[NB_TBANDS+1]; int bandwidth=0; float maxE = 0; float noise_floor; int remaining; AnalysisInfo *info; + float hp_ener; + float tonality2[240]; + float midE[8]; + float spec_variability=0; + float band_log2[NB_TBANDS+1]; + float leakage_from[NB_TBANDS+1]; + float leakage_to[NB_TBANDS+1]; + float layer_out[MAX_NEURONS]; + float below_max_pitch; + float above_max_pitch; + int is_silence; SAVE_STACK; - tonal->last_transition++; - alpha = 1.f/IMIN(20, 1+tonal->count); - alphaE = 1.f/IMIN(50, 1+tonal->count); - alphaE2 = 1.f/IMIN(1000, 1+tonal->count); + if (!tonal->initialized) + { + tonal->mem_fill = 240; + tonal->initialized = 1; + } + alpha = 1.f/IMIN(10, 1+tonal->count); + alphaE = 1.f/IMIN(25, 1+tonal->count); + /* Noise floor related decay for bandwidth detection: -2.2 dB/second */ + alphaE2 = 1.f/IMIN(100, 1+tonal->count); + if (tonal->count <= 1) alphaE2 = 1; + + if (tonal->Fs == 48000) + { + /* len and offset are now at 24 kHz. */ + len/= 2; + offset /= 2; + } else if (tonal->Fs == 16000) { + len = 3*len/2; + offset = 3*offset/2; + } - if (tonal->count<4) - tonal->music_prob = .5; kfft = celt_mode->mdct.kfft[0]; - if (tonal->count==0) - tonal->mem_fill = 240; - downmix(x, &tonal->inmem[tonal->mem_fill], IMIN(len, ANALYSIS_BUF_SIZE-tonal->mem_fill), offset, c1, c2, C); + tonal->hp_ener_accum += (float)downmix_and_resample(downmix, x, + &tonal->inmem[tonal->mem_fill], tonal->downmix_state, + IMIN(len, ANALYSIS_BUF_SIZE-tonal->mem_fill), offset, c1, c2, C, tonal->Fs); if (tonal->mem_fill+len < ANALYSIS_BUF_SIZE) { tonal->mem_fill += len; @@ -255,10 +520,13 @@ static void tonality_analysis(TonalityAnalysisState *tonal, const CELTMode *celt RESTORE_STACK; return; } + hp_ener = tonal->hp_ener_accum; info = &tonal->info[tonal->write_pos++]; if (tonal->write_pos>=DETECT_SIZE) tonal->write_pos-=DETECT_SIZE; + is_silence = is_digital_silence32(tonal->inmem, ANALYSIS_BUF_SIZE, 1, lsb_depth); + ALLOC(in, 480, kiss_fft_cpx); ALLOC(out, 480, kiss_fft_cpx); ALLOC(tonality, 240, float); @@ -273,8 +541,20 @@ static void tonality_analysis(TonalityAnalysisState *tonal, const CELTMode *celt } OPUS_MOVE(tonal->inmem, tonal->inmem+ANALYSIS_BUF_SIZE-240, 240); remaining = len - (ANALYSIS_BUF_SIZE-tonal->mem_fill); - downmix(x, &tonal->inmem[240], remaining, offset+ANALYSIS_BUF_SIZE-tonal->mem_fill, c1, c2, C); + tonal->hp_ener_accum = (float)downmix_and_resample(downmix, x, + &tonal->inmem[240], tonal->downmix_state, remaining, + offset+ANALYSIS_BUF_SIZE-tonal->mem_fill, c1, c2, C, tonal->Fs); tonal->mem_fill = 240 + remaining; + if (is_silence) + { + /* On silence, copy the previous analysis. */ + int prev_pos = tonal->write_pos-2; + if (prev_pos < 0) + prev_pos += DETECT_SIZE; + OPUS_COPY(info, &tonal->info[prev_pos], 1); + RESTORE_STACK; + return; + } opus_fft(kfft, in, out, tonal->arch); #ifndef FIXED_POINT /* If there's any NaN on the input, the entire output will be NaN, so we only need to check one value. */ @@ -305,24 +585,31 @@ static void tonality_analysis(TonalityAnalysisState *tonal, const CELTMode *celt d_angle2 = angle2 - angle; d2_angle2 = d_angle2 - d_angle; - mod1 = d2_angle - (float)floor(.5+d2_angle); + mod1 = d2_angle - (float)float2int(d2_angle); noisiness[i] = ABS16(mod1); mod1 *= mod1; mod1 *= mod1; - mod2 = d2_angle2 - (float)floor(.5+d2_angle2); + mod2 = d2_angle2 - (float)float2int(d2_angle2); noisiness[i] += ABS16(mod2); mod2 *= mod2; mod2 *= mod2; - avg_mod = .25f*(d2A[i]+2.f*mod1+mod2); + avg_mod = .25f*(d2A[i]+mod1+2*mod2); + /* This introduces an extra delay of 2 frames in the detection. */ tonality[i] = 1.f/(1.f+40.f*16.f*pi4*avg_mod)-.015f; + /* No delay on this detection, but it's less reliable. */ + tonality2[i] = 1.f/(1.f+40.f*16.f*pi4*mod2)-.015f; A[i] = angle2; dA[i] = d_angle2; d2A[i] = mod2; } - + for (i=2;i<N2-1;i++) + { + float tt = MIN32(tonality2[i], MAX32(tonality2[i-1], tonality2[i+1])); + tonality[i] = .9f*MAX32(tonality[i], tt-.1f); + } frame_tonality = 0; max_frame_tonality = 0; /*tw_sum = 0;*/ @@ -339,6 +626,22 @@ static void tonality_analysis(TonalityAnalysisState *tonal, const CELTMode *celt } relativeE = 0; frame_loudness = 0; + /* The energy of the very first band is special because of DC. */ + { + float E = 0; + float X1r, X2r; + X1r = 2*(float)out[0].r; + X2r = 2*(float)out[0].i; + E = X1r*X1r + X2r*X2r; + for (i=1;i<4;i++) + { + float binE = out[i].r*(float)out[i].r + out[N-i].r*(float)out[N-i].r + + out[i].i*(float)out[i].i + out[N-i].i*(float)out[N-i].i; + E += binE; + } + E = SCALE_ENER(E); + band_log2[0] = .5f*1.442695f*(float)log(E+1e-10f); + } for (b=0;b<NB_TBANDS;b++) { float E=0, tE=0, nE=0; @@ -348,12 +651,9 @@ static void tonality_analysis(TonalityAnalysisState *tonal, const CELTMode *celt { float binE = out[i].r*(float)out[i].r + out[N-i].r*(float)out[N-i].r + out[i].i*(float)out[i].i + out[N-i].i*(float)out[N-i].i; -#ifdef FIXED_POINT - /* FIXME: It's probably best to change the BFCC filter initial state instead */ - binE *= 5.55e-17f; -#endif + binE = SCALE_ENER(binE); E += binE; - tE += binE*tonality[i]; + tE += binE*MAX32(0, tonality[i]); nE += binE*2.f*(.5f-noisiness[i]); } #ifndef FIXED_POINT @@ -371,14 +671,27 @@ static void tonality_analysis(TonalityAnalysisState *tonal, const CELTMode *celt frame_loudness += (float)sqrt(E+1e-10f); logE[b] = (float)log(E+1e-10f); - tonal->lowE[b] = MIN32(logE[b], tonal->lowE[b]+.01f); - tonal->highE[b] = MAX32(logE[b], tonal->highE[b]-.1f); - if (tonal->highE[b] < tonal->lowE[b]+1.f) + band_log2[b+1] = .5f*1.442695f*(float)log(E+1e-10f); + tonal->logE[tonal->E_count][b] = logE[b]; + if (tonal->count==0) + tonal->highE[b] = tonal->lowE[b] = logE[b]; + if (tonal->highE[b] > tonal->lowE[b] + 7.5) { - tonal->highE[b]+=.5f; - tonal->lowE[b]-=.5f; + if (tonal->highE[b] - logE[b] > logE[b] - tonal->lowE[b]) + tonal->highE[b] -= .01f; + else + tonal->lowE[b] += .01f; } - relativeE += (logE[b]-tonal->lowE[b])/(1e-15f+tonal->highE[b]-tonal->lowE[b]); + if (logE[b] > tonal->highE[b]) + { + tonal->highE[b] = logE[b]; + tonal->lowE[b] = MAX32(tonal->highE[b]-15, tonal->lowE[b]); + } else if (logE[b] < tonal->lowE[b]) + { + tonal->lowE[b] = logE[b]; + tonal->highE[b] = MIN32(tonal->lowE[b]+15, tonal->highE[b]); + } + relativeE += (logE[b]-tonal->lowE[b])/(1e-5f + (tonal->highE[b]-tonal->lowE[b])); L1=L2=0; for (i=0;i<NB_FRAMES;i++) @@ -410,45 +723,135 @@ static void tonality_analysis(TonalityAnalysisState *tonal, const CELTMode *celt tonal->prev_band_tonality[b] = band_tonality[b]; } + leakage_from[0] = band_log2[0]; + leakage_to[0] = band_log2[0] - LEAKAGE_OFFSET; + for (b=1;b<NB_TBANDS+1;b++) + { + float leak_slope = LEAKAGE_SLOPE*(tbands[b]-tbands[b-1])/4; + leakage_from[b] = MIN16(leakage_from[b-1]+leak_slope, band_log2[b]); + leakage_to[b] = MAX16(leakage_to[b-1]-leak_slope, band_log2[b]-LEAKAGE_OFFSET); + } + for (b=NB_TBANDS-2;b>=0;b--) + { + float leak_slope = LEAKAGE_SLOPE*(tbands[b+1]-tbands[b])/4; + leakage_from[b] = MIN16(leakage_from[b+1]+leak_slope, leakage_from[b]); + leakage_to[b] = MAX16(leakage_to[b+1]-leak_slope, leakage_to[b]); + } + celt_assert(NB_TBANDS+1 <= LEAK_BANDS); + for (b=0;b<NB_TBANDS+1;b++) + { + /* leak_boost[] is made up of two terms. The first, based on leakage_to[], + represents the boost needed to overcome the amount of analysis leakage + cause in a weaker band b by louder neighbouring bands. + The second, based on leakage_from[], applies to a loud band b for + which the quantization noise causes synthesis leakage to the weaker + neighbouring bands. */ + float boost = MAX16(0, leakage_to[b] - band_log2[b]) + + MAX16(0, band_log2[b] - (leakage_from[b]+LEAKAGE_OFFSET)); + info->leak_boost[b] = IMIN(255, (int)floor(.5 + 64.f*boost)); + } + for (;b<LEAK_BANDS;b++) info->leak_boost[b] = 0; + + for (i=0;i<NB_FRAMES;i++) + { + int j; + float mindist = 1e15f; + for (j=0;j<NB_FRAMES;j++) + { + int k; + float dist=0; + for (k=0;k<NB_TBANDS;k++) + { + float tmp; + tmp = tonal->logE[i][k] - tonal->logE[j][k]; + dist += tmp*tmp; + } + if (j!=i) + mindist = MIN32(mindist, dist); + } + spec_variability += mindist; + } + spec_variability = (float)sqrt(spec_variability/NB_FRAMES/NB_TBANDS); bandwidth_mask = 0; bandwidth = 0; maxE = 0; noise_floor = 5.7e-4f/(1<<(IMAX(0,lsb_depth-8))); -#ifdef FIXED_POINT - noise_floor *= 1<<(15+SIG_SHIFT); -#endif noise_floor *= noise_floor; - for (b=0;b<NB_TOT_BANDS;b++) + below_max_pitch=0; + above_max_pitch=0; + for (b=0;b<NB_TBANDS;b++) { float E=0; + float Em; int band_start, band_end; /* Keep a margin of 300 Hz for aliasing */ - band_start = extra_bands[b]; - band_end = extra_bands[b+1]; + band_start = tbands[b]; + band_end = tbands[b+1]; for (i=band_start;i<band_end;i++) { float binE = out[i].r*(float)out[i].r + out[N-i].r*(float)out[N-i].r + out[i].i*(float)out[i].i + out[N-i].i*(float)out[N-i].i; E += binE; } + E = SCALE_ENER(E); maxE = MAX32(maxE, E); + if (band_start < 64) + { + below_max_pitch += E; + } else { + above_max_pitch += E; + } tonal->meanE[b] = MAX32((1-alphaE2)*tonal->meanE[b], E); - E = MAX32(E, tonal->meanE[b]); - /* Use a simple follower with 13 dB/Bark slope for spreading function */ - bandwidth_mask = MAX32(.05f*bandwidth_mask, E); + Em = MAX32(E, tonal->meanE[b]); /* Consider the band "active" only if all these conditions are met: - 1) less than 10 dB below the simple follower - 2) less than 90 dB below the peak band (maximal masking possible considering + 1) less than 90 dB below the peak band (maximal masking possible considering both the ATH and the loudness-dependent slope of the spreading function) - 3) above the PCM quantization noise floor + 2) above the PCM quantization noise floor + We use b+1 because the first CELT band isn't included in tbands[] */ - if (E>.1*bandwidth_mask && E*1e9f > maxE && E > noise_floor*(band_end-band_start)) - bandwidth = b; + if (E*1e9f > maxE && (Em > 3*noise_floor*(band_end-band_start) || E > noise_floor*(band_end-band_start))) + bandwidth = b+1; + /* Check if the band is masked (see below). */ + is_masked[b] = E < (tonal->prev_bandwidth >= b+1 ? .01f : .05f)*bandwidth_mask; + /* Use a simple follower with 13 dB/Bark slope for spreading function. */ + bandwidth_mask = MAX32(.05f*bandwidth_mask, E); } + /* Special case for the last two bands, for which we don't have spectrum but only + the energy above 12 kHz. The difficulty here is that the high-pass we use + leaks some LF energy, so we need to increase the threshold without accidentally cutting + off the band. */ + if (tonal->Fs == 48000) { + float noise_ratio; + float Em; + float E = hp_ener*(1.f/(60*60)); + noise_ratio = tonal->prev_bandwidth==20 ? 10.f : 30.f; + +#ifdef FIXED_POINT + /* silk_resampler_down2_hp() shifted right by an extra 8 bits. */ + E *= 256.f*(1.f/Q15ONE)*(1.f/Q15ONE); +#endif + above_max_pitch += E; + tonal->meanE[b] = MAX32((1-alphaE2)*tonal->meanE[b], E); + Em = MAX32(E, tonal->meanE[b]); + if (Em > 3*noise_ratio*noise_floor*160 || E > noise_ratio*noise_floor*160) + bandwidth = 20; + /* Check if the band is masked (see below). */ + is_masked[b] = E < (tonal->prev_bandwidth == 20 ? .01f : .05f)*bandwidth_mask; + } + if (above_max_pitch > below_max_pitch) + info->max_pitch_ratio = below_max_pitch/above_max_pitch; + else + info->max_pitch_ratio = 1; + /* In some cases, resampling aliasing can create a small amount of energy in the first band + being cut. So if the last band is masked, we don't include it. */ + if (bandwidth == 20 && is_masked[NB_TBANDS]) + bandwidth-=2; + else if (bandwidth > 0 && bandwidth <= NB_TBANDS && is_masked[bandwidth-1]) + bandwidth--; if (tonal->count<=2) bandwidth = 20; frame_loudness = 20*(float)log10(frame_loudness); - tonal->Etracker = MAX32(tonal->Etracker-.03f, frame_loudness); + tonal->Etracker = MAX32(tonal->Etracker-.003f, frame_loudness); tonal->lowECount *= (1-alphaE); if (frame_loudness < tonal->Etracker-30) tonal->lowECount += alphaE; @@ -460,11 +863,18 @@ static void tonality_analysis(TonalityAnalysisState *tonal, const CELTMode *celt sum += dct_table[i*16+b]*logE[b]; BFCC[i] = sum; } + for (i=0;i<8;i++) + { + float sum=0; + for (b=0;b<16;b++) + sum += dct_table[i*16+b]*.5f*(tonal->highE[b]+tonal->lowE[b]); + midE[i] = sum; + } frame_stationarity /= NB_TBANDS; relativeE /= NB_TBANDS; if (tonal->count<10) - relativeE = .5; + relativeE = .5f; frame_noisiness /= NB_TBANDS; #if 1 info->activity = frame_noisiness + (1-frame_noisiness)*relativeE; @@ -479,7 +889,7 @@ static void tonality_analysis(TonalityAnalysisState *tonal, const CELTMode *celt info->tonality_slope = slope; tonal->E_count = (tonal->E_count+1)%NB_FRAMES; - tonal->count++; + tonal->count = IMIN(tonal->count+1, ANALYSIS_COUNT_MAX); info->tonality = frame_tonality; for (i=0;i<4;i++) @@ -498,6 +908,8 @@ static void tonality_analysis(TonalityAnalysisState *tonal, const CELTMode *celt for (i=0;i<9;i++) tonal->std[i] = (1-alpha)*tonal->std[i] + alpha*features[i]*features[i]; } + for (i=0;i<4;i++) + features[i] = BFCC[i]-midE[i]; for (i=0;i<8;i++) { @@ -507,136 +919,31 @@ static void tonality_analysis(TonalityAnalysisState *tonal, const CELTMode *celt tonal->mem[i] = BFCC[i]; } for (i=0;i<9;i++) - features[11+i] = (float)sqrt(tonal->std[i]); - features[20] = info->tonality; - features[21] = info->activity; - features[22] = frame_stationarity; - features[23] = info->tonality_slope; - features[24] = tonal->lowECount; - -#ifndef DISABLE_FLOAT_API - mlp_process(&net, features, frame_probs); - frame_probs[0] = .5f*(frame_probs[0]+1); - /* Curve fitting between the MLP probability and the actual probability */ - frame_probs[0] = .01f + 1.21f*frame_probs[0]*frame_probs[0] - .23f*(float)pow(frame_probs[0], 10); - /* Probability of active audio (as opposed to silence) */ - frame_probs[1] = .5f*frame_probs[1]+.5f; - /* Consider that silence has a 50-50 probability. */ - frame_probs[0] = frame_probs[1]*frame_probs[0] + (1-frame_probs[1])*.5f; - - /*printf("%f %f ", frame_probs[0], frame_probs[1]);*/ - { - /* Probability of state transition */ - float tau; - /* Represents independence of the MLP probabilities, where - beta=1 means fully independent. */ - float beta; - /* Denormalized probability of speech (p0) and music (p1) after update */ - float p0, p1; - /* Probabilities for "all speech" and "all music" */ - float s0, m0; - /* Probability sum for renormalisation */ - float psum; - /* Instantaneous probability of speech and music, with beta pre-applied. */ - float speech0; - float music0; - float p, q; - - /* One transition every 3 minutes of active audio */ - tau = .00005f*frame_probs[1]; - /* Adapt beta based on how "unexpected" the new prob is */ - p = MAX16(.05f,MIN16(.95f,frame_probs[0])); - q = MAX16(.05f,MIN16(.95f,tonal->music_prob)); - beta = .01f+.05f*ABS16(p-q)/(p*(1-q)+q*(1-p)); - /* p0 and p1 are the probabilities of speech and music at this frame - using only information from previous frame and applying the - state transition model */ - p0 = (1-tonal->music_prob)*(1-tau) + tonal->music_prob *tau; - p1 = tonal->music_prob *(1-tau) + (1-tonal->music_prob)*tau; - /* We apply the current probability with exponent beta to work around - the fact that the probability estimates aren't independent. */ - p0 *= (float)pow(1-frame_probs[0], beta); - p1 *= (float)pow(frame_probs[0], beta); - /* Normalise the probabilities to get the Marokv probability of music. */ - tonal->music_prob = p1/(p0+p1); - info->music_prob = tonal->music_prob; - - /* This chunk of code deals with delayed decision. */ - psum=1e-20f; - /* Instantaneous probability of speech and music, with beta pre-applied. */ - speech0 = (float)pow(1-frame_probs[0], beta); - music0 = (float)pow(frame_probs[0], beta); - if (tonal->count==1) - { - tonal->pspeech[0]=.5; - tonal->pmusic [0]=.5; - } - /* Updated probability of having only speech (s0) or only music (m0), - before considering the new observation. */ - s0 = tonal->pspeech[0] + tonal->pspeech[1]; - m0 = tonal->pmusic [0] + tonal->pmusic [1]; - /* Updates s0 and m0 with instantaneous probability. */ - tonal->pspeech[0] = s0*(1-tau)*speech0; - tonal->pmusic [0] = m0*(1-tau)*music0; - /* Propagate the transition probabilities */ - for (i=1;i<DETECT_SIZE-1;i++) - { - tonal->pspeech[i] = tonal->pspeech[i+1]*speech0; - tonal->pmusic [i] = tonal->pmusic [i+1]*music0; - } - /* Probability that the latest frame is speech, when all the previous ones were music. */ - tonal->pspeech[DETECT_SIZE-1] = m0*tau*speech0; - /* Probability that the latest frame is music, when all the previous ones were speech. */ - tonal->pmusic [DETECT_SIZE-1] = s0*tau*music0; - - /* Renormalise probabilities to 1 */ - for (i=0;i<DETECT_SIZE;i++) - psum += tonal->pspeech[i] + tonal->pmusic[i]; - psum = 1.f/psum; - for (i=0;i<DETECT_SIZE;i++) - { - tonal->pspeech[i] *= psum; - tonal->pmusic [i] *= psum; - } - psum = tonal->pmusic[0]; - for (i=1;i<DETECT_SIZE;i++) - psum += tonal->pspeech[i]; - - /* Estimate our confidence in the speech/music decisions */ - if (frame_probs[1]>.75) - { - if (tonal->music_prob>.9) - { - float adapt; - adapt = 1.f/(++tonal->music_confidence_count); - tonal->music_confidence_count = IMIN(tonal->music_confidence_count, 500); - tonal->music_confidence += adapt*MAX16(-.2f,frame_probs[0]-tonal->music_confidence); - } - if (tonal->music_prob<.1) - { - float adapt; - adapt = 1.f/(++tonal->speech_confidence_count); - tonal->speech_confidence_count = IMIN(tonal->speech_confidence_count, 500); - tonal->speech_confidence += adapt*MIN16(.2f,frame_probs[0]-tonal->speech_confidence); - } - } else { - if (tonal->music_confidence_count==0) - tonal->music_confidence = .9f; - if (tonal->speech_confidence_count==0) - tonal->speech_confidence = .1f; - } - } - if (tonal->last_music != (tonal->music_prob>.5f)) - tonal->last_transition=0; - tonal->last_music = tonal->music_prob>.5f; -#else - info->music_prob = 0; -#endif - /*for (i=0;i<25;i++) + features[11+i] = (float)sqrt(tonal->std[i]) - std_feature_bias[i]; + features[18] = spec_variability - 0.78f; + features[20] = info->tonality - 0.154723f; + features[21] = info->activity - 0.724643f; + features[22] = frame_stationarity - 0.743717f; + features[23] = info->tonality_slope + 0.069216f; + features[24] = tonal->lowECount - 0.067930f; + + compute_dense(&layer0, layer_out, features); + compute_gru(&layer1, tonal->rnn_state, layer_out); + compute_dense(&layer2, frame_probs, tonal->rnn_state); + + /* Probability of speech or music vs noise */ + info->activity_probability = frame_probs[1]; + info->music_prob = frame_probs[0]; + + /*printf("%f %f %f\n", frame_probs[0], frame_probs[1], info->music_prob);*/ +#ifdef MLP_TRAINING + for (i=0;i<25;i++) printf("%f ", features[i]); - printf("\n");*/ + printf("\n"); +#endif info->bandwidth = bandwidth; + tonal->prev_bandwidth = bandwidth; /*printf("%d %d\n", info->bandwidth, info->opus_bandwidth);*/ info->noisiness = frame_noisiness; info->valid = 1; @@ -650,23 +957,25 @@ void run_analysis(TonalityAnalysisState *analysis, const CELTMode *celt_mode, co int offset; int pcm_len; + analysis_frame_size -= analysis_frame_size&1; if (analysis_pcm != NULL) { /* Avoid overflow/wrap-around of the analysis buffer */ - analysis_frame_size = IMIN((DETECT_SIZE-5)*Fs/100, analysis_frame_size); + analysis_frame_size = IMIN((DETECT_SIZE-5)*Fs/50, analysis_frame_size); pcm_len = analysis_frame_size - analysis->analysis_offset; offset = analysis->analysis_offset; - do { - tonality_analysis(analysis, celt_mode, analysis_pcm, IMIN(480, pcm_len), offset, c1, c2, C, lsb_depth, downmix); - offset += 480; - pcm_len -= 480; - } while (pcm_len>0); + while (pcm_len>0) { + tonality_analysis(analysis, celt_mode, analysis_pcm, IMIN(Fs/50, pcm_len), offset, c1, c2, C, lsb_depth, downmix); + offset += Fs/50; + pcm_len -= Fs/50; + } analysis->analysis_offset = analysis_frame_size; analysis->analysis_offset -= frame_size; } - analysis_info->valid = 0; tonality_get_info(analysis, analysis_info, frame_size); } + +#endif /* DISABLE_FLOAT_API */ diff --git a/thirdparty/opus/analysis.h b/thirdparty/opus/analysis.h index 9eae56a525..0b66555f21 100644 --- a/thirdparty/opus/analysis.h +++ b/thirdparty/opus/analysis.h @@ -30,16 +30,24 @@ #include "celt.h" #include "opus_private.h" +#include "mlp.h" #define NB_FRAMES 8 #define NB_TBANDS 18 -#define NB_TOT_BANDS 21 -#define ANALYSIS_BUF_SIZE 720 /* 15 ms at 48 kHz */ +#define ANALYSIS_BUF_SIZE 720 /* 30 ms at 24 kHz */ -#define DETECT_SIZE 200 +/* At that point we can stop counting frames because it no longer matters. */ +#define ANALYSIS_COUNT_MAX 10000 + +#define DETECT_SIZE 100 + +/* Uncomment this to print the MLP features on stdout. */ +/*#define MLP_TRAINING*/ typedef struct { int arch; + int application; + opus_int32 Fs; #define TONALITY_ANALYSIS_RESET_START angle float angle[240]; float d_angle[240]; @@ -48,35 +56,27 @@ typedef struct { int mem_fill; /* number of usable samples in the buffer */ float prev_band_tonality[NB_TBANDS]; float prev_tonality; + int prev_bandwidth; float E[NB_FRAMES][NB_TBANDS]; + float logE[NB_FRAMES][NB_TBANDS]; float lowE[NB_TBANDS]; float highE[NB_TBANDS]; - float meanE[NB_TOT_BANDS]; + float meanE[NB_TBANDS+1]; float mem[32]; float cmean[8]; float std[9]; - float music_prob; float Etracker; float lowECount; int E_count; - int last_music; - int last_transition; int count; - float subframe_mem[3]; int analysis_offset; - /** Probability of having speech for time i to DETECT_SIZE-1 (and music before). - pspeech[0] is the probability that all frames in the window are speech. */ - float pspeech[DETECT_SIZE]; - /** Probability of having music for time i to DETECT_SIZE-1 (and speech before). - pmusic[0] is the probability that all frames in the window are music. */ - float pmusic[DETECT_SIZE]; - float speech_confidence; - float music_confidence; - int speech_confidence_count; - int music_confidence_count; int write_pos; int read_pos; int read_subframe; + float hp_ener_accum; + int initialized; + float rnn_state[MAX_NEURONS]; + opus_val32 downmix_state[3]; AnalysisInfo info[DETECT_SIZE]; } TonalityAnalysisState; @@ -86,7 +86,7 @@ typedef struct { * not be repeated every analysis step. No allocated memory is retained * by the state struct, so no cleanup call is required. */ -void tonality_analysis_init(TonalityAnalysisState *analysis); +void tonality_analysis_init(TonalityAnalysisState *analysis, opus_int32 Fs); /** Reset a TonalityAnalysisState stuct. * diff --git a/thirdparty/opus/celt/_kiss_fft_guts.h b/thirdparty/opus/celt/_kiss_fft_guts.h index 5e3d58fd66..17392b3e90 100644 --- a/thirdparty/opus/celt/_kiss_fft_guts.h +++ b/thirdparty/opus/celt/_kiss_fft_guts.h @@ -58,12 +58,12 @@ # define S_MUL(a,b) MULT16_32_Q15(b, a) # define C_MUL(m,a,b) \ - do{ (m).r = SUB32(S_MUL((a).r,(b).r) , S_MUL((a).i,(b).i)); \ - (m).i = ADD32(S_MUL((a).r,(b).i) , S_MUL((a).i,(b).r)); }while(0) + do{ (m).r = SUB32_ovflw(S_MUL((a).r,(b).r) , S_MUL((a).i,(b).i)); \ + (m).i = ADD32_ovflw(S_MUL((a).r,(b).i) , S_MUL((a).i,(b).r)); }while(0) # define C_MULC(m,a,b) \ - do{ (m).r = ADD32(S_MUL((a).r,(b).r) , S_MUL((a).i,(b).i)); \ - (m).i = SUB32(S_MUL((a).i,(b).r) , S_MUL((a).r,(b).i)); }while(0) + do{ (m).r = ADD32_ovflw(S_MUL((a).r,(b).r) , S_MUL((a).i,(b).i)); \ + (m).i = SUB32_ovflw(S_MUL((a).i,(b).r) , S_MUL((a).r,(b).i)); }while(0) # define C_MULBYSCALAR( c, s ) \ do{ (c).r = S_MUL( (c).r , s ) ;\ @@ -77,17 +77,17 @@ DIVSCALAR( (c).i , div); }while (0) #define C_ADD( res, a,b)\ - do {(res).r=ADD32((a).r,(b).r); (res).i=ADD32((a).i,(b).i); \ + do {(res).r=ADD32_ovflw((a).r,(b).r); (res).i=ADD32_ovflw((a).i,(b).i); \ }while(0) #define C_SUB( res, a,b)\ - do {(res).r=SUB32((a).r,(b).r); (res).i=SUB32((a).i,(b).i); \ + do {(res).r=SUB32_ovflw((a).r,(b).r); (res).i=SUB32_ovflw((a).i,(b).i); \ }while(0) #define C_ADDTO( res , a)\ - do {(res).r = ADD32((res).r, (a).r); (res).i = ADD32((res).i,(a).i);\ + do {(res).r = ADD32_ovflw((res).r, (a).r); (res).i = ADD32_ovflw((res).i,(a).i);\ }while(0) #define C_SUBFROM( res , a)\ - do {(res).r = ADD32((res).r,(a).r); (res).i = SUB32((res).i,(a).i); \ + do {(res).r = ADD32_ovflw((res).r,(a).r); (res).i = SUB32_ovflw((res).i,(a).i); \ }while(0) #if defined(OPUS_ARM_INLINE_ASM) diff --git a/thirdparty/opus/celt/arch.h b/thirdparty/opus/celt/arch.h index 8ceab5fe10..08b07db598 100644 --- a/thirdparty/opus/celt/arch.h +++ b/thirdparty/opus/celt/arch.h @@ -46,25 +46,50 @@ # endif # endif +#if OPUS_GNUC_PREREQ(3, 0) +#define opus_likely(x) (__builtin_expect(!!(x), 1)) +#define opus_unlikely(x) (__builtin_expect(!!(x), 0)) +#else +#define opus_likely(x) (!!(x)) +#define opus_unlikely(x) (!!(x)) +#endif + #define CELT_SIG_SCALE 32768.f -#define celt_fatal(str) _celt_fatal(str, __FILE__, __LINE__); -#ifdef ENABLE_ASSERTIONS +#define CELT_FATAL(str) celt_fatal(str, __FILE__, __LINE__); + +#if defined(ENABLE_ASSERTIONS) || defined(ENABLE_HARDENING) +#ifdef __GNUC__ +__attribute__((noreturn)) +#endif +void celt_fatal(const char *str, const char *file, int line); + +#if defined(CELT_C) && !defined(OVERRIDE_celt_fatal) #include <stdio.h> #include <stdlib.h> #ifdef __GNUC__ __attribute__((noreturn)) #endif -static OPUS_INLINE void _celt_fatal(const char *str, const char *file, int line) +void celt_fatal(const char *str, const char *file, int line) { fprintf (stderr, "Fatal (internal) error in %s, line %d: %s\n", file, line, str); abort(); } -#define celt_assert(cond) {if (!(cond)) {celt_fatal("assertion failed: " #cond);}} -#define celt_assert2(cond, message) {if (!(cond)) {celt_fatal("assertion failed: " #cond "\n" message);}} +#endif + +#define celt_assert(cond) {if (!(cond)) {CELT_FATAL("assertion failed: " #cond);}} +#define celt_assert2(cond, message) {if (!(cond)) {CELT_FATAL("assertion failed: " #cond "\n" message);}} +#define MUST_SUCCEED(call) celt_assert((call) == OPUS_OK) #else #define celt_assert(cond) #define celt_assert2(cond, message) +#define MUST_SUCCEED(call) do {if((call) != OPUS_OK) {RESTORE_STACK; return OPUS_INTERNAL_ERROR;} } while (0) +#endif + +#if defined(ENABLE_ASSERTIONS) +#define celt_sig_assert(cond) {if (!(cond)) {CELT_FATAL("signal assertion failed: " #cond);}} +#else +#define celt_sig_assert(cond) #endif #define IMUL32(a,b) ((a)*(b)) @@ -93,14 +118,20 @@ static OPUS_INLINE void _celt_fatal(const char *str, const char *file, int line) typedef opus_int16 opus_val16; typedef opus_int32 opus_val32; +typedef opus_int64 opus_val64; typedef opus_val32 celt_sig; typedef opus_val16 celt_norm; typedef opus_val32 celt_ener; +#define celt_isnan(x) 0 + #define Q15ONE 32767 #define SIG_SHIFT 12 +/* Safe saturation value for 32-bit signals. Should be less than + 2^31*(1-0.85) to avoid blowing up on DC at deemphasis.*/ +#define SIG_SAT (300000000) #define NORM_SCALING 16384 @@ -129,7 +160,7 @@ static OPUS_INLINE opus_int16 SAT16(opus_int32 x) { #ifdef OPUS_ARM_PRESUME_AARCH64_NEON_INTR #include "arm/fixed_arm64.h" -#elif OPUS_ARM_INLINE_EDSP +#elif defined (OPUS_ARM_INLINE_EDSP) #include "arm/fixed_armv5e.h" #elif defined (OPUS_ARM_INLINE_ASM) #include "arm/fixed_armv4.h" @@ -147,6 +178,7 @@ static OPUS_INLINE opus_int16 SAT16(opus_int32 x) { typedef float opus_val16; typedef float opus_val32; +typedef float opus_val64; typedef float celt_sig; typedef float celt_norm; @@ -186,6 +218,7 @@ static OPUS_INLINE int celt_isnan(float x) #define NEG16(x) (-(x)) #define NEG32(x) (-(x)) +#define NEG32_ovflw(x) (-(x)) #define EXTRACT16(x) (x) #define EXTEND32(x) (x) #define SHR16(a,shift) (a) @@ -202,6 +235,7 @@ static OPUS_INLINE int celt_isnan(float x) #define SATURATE16(x) (x) #define ROUND16(a,shift) (a) +#define SROUND16(a,shift) (a) #define HALF16(x) (.5f*(x)) #define HALF32(x) (.5f*(x)) @@ -209,6 +243,8 @@ static OPUS_INLINE int celt_isnan(float x) #define SUB16(a,b) ((a)-(b)) #define ADD32(a,b) ((a)+(b)) #define SUB32(a,b) ((a)-(b)) +#define ADD32_ovflw(a,b) ((a)+(b)) +#define SUB32_ovflw(a,b) ((a)-(b)) #define MULT16_16_16(a,b) ((a)*(b)) #define MULT16_16(a,b) ((opus_val32)(a)*(opus_val32)(b)) #define MAC16_16(c,a,b) ((c)+(opus_val32)(a)*(opus_val32)(b)) @@ -243,9 +279,9 @@ static OPUS_INLINE int celt_isnan(float x) #ifndef GLOBAL_STACK_SIZE #ifdef FIXED_POINT -#define GLOBAL_STACK_SIZE 100000 +#define GLOBAL_STACK_SIZE 120000 #else -#define GLOBAL_STACK_SIZE 100000 +#define GLOBAL_STACK_SIZE 120000 #endif #endif diff --git a/thirdparty/opus/celt/arm/arm2gnu.pl b/thirdparty/opus/celt/arm/arm2gnu.pl index 6c922ac819..a2895f7445 100755 --- a/thirdparty/opus/celt/arm/arm2gnu.pl +++ b/thirdparty/opus/celt/arm/arm2gnu.pl @@ -164,11 +164,11 @@ while (<>) { $prefix = ""; if ($proc) { - $prefix = $prefix.sprintf("\t.type\t%s, %%function; ",$proc) unless ($apple); + $prefix = $prefix.sprintf("\t.type\t%s, %%function", $proc) unless ($apple); # Make sure we $prefix isn't empty here (for the $apple case). # We handle mangling the label here, make sure it doesn't match # the label handling below (if $prefix would be empty). - $prefix = "; "; + $prefix = $prefix."; "; push(@proc_stack, $proc); s/^[A-Za-z_\.]\w+/$symprefix$&:/; } diff --git a/thirdparty/opus/celt/arm/arm_celt_map.c b/thirdparty/opus/celt/arm/arm_celt_map.c index 4d4d069a86..ca988b66f5 100644 --- a/thirdparty/opus/celt/arm/arm_celt_map.c +++ b/thirdparty/opus/celt/arm/arm_celt_map.c @@ -35,12 +35,29 @@ #if defined(OPUS_HAVE_RTCD) +# if defined(OPUS_ARM_MAY_HAVE_NEON_INTR) && !defined(OPUS_ARM_PRESUME_NEON_INTR) +opus_val32 (*const CELT_INNER_PROD_IMPL[OPUS_ARCHMASK+1])(const opus_val16 *x, const opus_val16 *y, int N) = { + celt_inner_prod_c, /* ARMv4 */ + celt_inner_prod_c, /* EDSP */ + celt_inner_prod_c, /* Media */ + celt_inner_prod_neon /* NEON */ +}; + +void (*const DUAL_INNER_PROD_IMPL[OPUS_ARCHMASK+1])(const opus_val16 *x, const opus_val16 *y01, const opus_val16 *y02, + int N, opus_val32 *xy1, opus_val32 *xy2) = { + dual_inner_prod_c, /* ARMv4 */ + dual_inner_prod_c, /* EDSP */ + dual_inner_prod_c, /* Media */ + dual_inner_prod_neon /* NEON */ +}; +# endif + # if defined(FIXED_POINT) # if ((defined(OPUS_ARM_MAY_HAVE_NEON) && !defined(OPUS_ARM_PRESUME_NEON)) || \ (defined(OPUS_ARM_MAY_HAVE_MEDIA) && !defined(OPUS_ARM_PRESUME_MEDIA)) || \ (defined(OPUS_ARM_MAY_HAVE_EDSP) && !defined(OPUS_ARM_PRESUME_EDSP))) opus_val32 (*const CELT_PITCH_XCORR_IMPL[OPUS_ARCHMASK+1])(const opus_val16 *, - const opus_val16 *, opus_val32 *, int , int) = { + const opus_val16 *, opus_val32 *, int, int, int) = { celt_pitch_xcorr_c, /* ARMv4 */ MAY_HAVE_EDSP(celt_pitch_xcorr), /* EDSP */ MAY_HAVE_MEDIA(celt_pitch_xcorr), /* Media */ @@ -51,7 +68,7 @@ opus_val32 (*const CELT_PITCH_XCORR_IMPL[OPUS_ARCHMASK+1])(const opus_val16 *, # else /* !FIXED_POINT */ # if defined(OPUS_ARM_MAY_HAVE_NEON_INTR) && !defined(OPUS_ARM_PRESUME_NEON_INTR) void (*const CELT_PITCH_XCORR_IMPL[OPUS_ARCHMASK+1])(const opus_val16 *, - const opus_val16 *, opus_val32 *, int, int) = { + const opus_val16 *, opus_val32 *, int, int, int) = { celt_pitch_xcorr_c, /* ARMv4 */ celt_pitch_xcorr_c, /* EDSP */ celt_pitch_xcorr_c, /* Media */ diff --git a/thirdparty/opus/celt/tests/test_unit_types.c b/thirdparty/opus/celt/arm/armopts.s index 67a0fb8ed3..fb9196072a 100644 --- a/thirdparty/opus/celt/tests/test_unit_types.c +++ b/thirdparty/opus/celt/arm/armopts.s @@ -1,5 +1,4 @@ -/* Copyright (c) 2008-2011 Xiph.Org Foundation - Written by Jean-Marc Valin */ +/* Copyright (C) 2013 Mozilla Corporation */ /* Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions @@ -25,26 +24,14 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif +; Set the following to 1 if we have EDSP instructions +; (LDRD/STRD, etc., ARMv5E and later). +OPUS_ARM_MAY_HAVE_EDSP * -#include "opus_types.h" -#include <stdio.h> +; Set the following to 1 if we have ARMv6 media instructions. +OPUS_ARM_MAY_HAVE_MEDIA * -int main(void) -{ - opus_int16 i = 1; - i <<= 14; - if (i>>14 != 1) - { - fprintf(stderr, "opus_int16 isn't 16 bits\n"); - return 1; - } - if (sizeof(opus_int16)*2 != sizeof(opus_int32)) - { - fprintf(stderr, "16*2 != 32\n"); - return 1; - } - return 0; -} +; Set the following to 1 if we have NEON (some ARMv7) +OPUS_ARM_MAY_HAVE_NEON * + +END diff --git a/thirdparty/opus/celt/arm/celt_ne10_fft.c b/thirdparty/opus/celt/arm/celt_fft_ne10.c index 42d96a7117..ea5fd7808b 100644 --- a/thirdparty/opus/celt/arm/celt_ne10_fft.c +++ b/thirdparty/opus/celt/arm/celt_fft_ne10.c @@ -1,7 +1,7 @@ /* Copyright (c) 2015 Xiph.Org Foundation Written by Viswanath Puttagunta */ /** - @file celt_ne10_fft.c + @file celt_fft_ne10.c @brief ARM Neon optimizations for fft using NE10 library */ @@ -36,7 +36,6 @@ #endif #endif -#include <NE10_init.h> #include <NE10_dsp.h> #include "os_support.h" #include "kiss_fft.h" diff --git a/thirdparty/opus/celt/arm/celt_ne10_mdct.c b/thirdparty/opus/celt/arm/celt_mdct_ne10.c index 293c3efd7a..3531d02d10 100644 --- a/thirdparty/opus/celt/arm/celt_ne10_mdct.c +++ b/thirdparty/opus/celt/arm/celt_mdct_ne10.c @@ -1,7 +1,7 @@ /* Copyright (c) 2015 Xiph.Org Foundation Written by Viswanath Puttagunta */ /** - @file celt_ne10_mdct.c + @file celt_mdct_ne10.c @brief ARM Neon optimizations for mdct using NE10 library */ diff --git a/thirdparty/opus/celt/arm/celt_neon_intr.c b/thirdparty/opus/celt/arm/celt_neon_intr.c index 47bbe3dc22..effda769d0 100644 --- a/thirdparty/opus/celt/arm/celt_neon_intr.c +++ b/thirdparty/opus/celt/arm/celt_neon_intr.c @@ -191,121 +191,21 @@ static void xcorr_kernel_neon_float(const float32_t *x, const float32_t *y, vst1q_f32(sum, SUMM); } -/* - * Function: xcorr_kernel_neon_float_process1 - * --------------------------------- - * Computes single correlation values and stores in *sum - */ -static void xcorr_kernel_neon_float_process1(const float32_t *x, - const float32_t *y, float32_t *sum, int len) { - float32x4_t XX[4]; - float32x4_t YY[4]; - float32x2_t XX_2; - float32x2_t YY_2; - float32x4_t SUMM; - float32x2_t SUMM_2[2]; - const float32_t *xi = x; - const float32_t *yi = y; - - SUMM = vdupq_n_f32(0); - - /* Work on 16 values per iteration */ - while (len >= 16) { - XX[0] = vld1q_f32(xi); - xi += 4; - XX[1] = vld1q_f32(xi); - xi += 4; - XX[2] = vld1q_f32(xi); - xi += 4; - XX[3] = vld1q_f32(xi); - xi += 4; - - YY[0] = vld1q_f32(yi); - yi += 4; - YY[1] = vld1q_f32(yi); - yi += 4; - YY[2] = vld1q_f32(yi); - yi += 4; - YY[3] = vld1q_f32(yi); - yi += 4; - - SUMM = vmlaq_f32(SUMM, YY[0], XX[0]); - SUMM = vmlaq_f32(SUMM, YY[1], XX[1]); - SUMM = vmlaq_f32(SUMM, YY[2], XX[2]); - SUMM = vmlaq_f32(SUMM, YY[3], XX[3]); - len -= 16; - } - - /* Work on 8 values */ - if (len >= 8) { - XX[0] = vld1q_f32(xi); - xi += 4; - XX[1] = vld1q_f32(xi); - xi += 4; - - YY[0] = vld1q_f32(yi); - yi += 4; - YY[1] = vld1q_f32(yi); - yi += 4; - - SUMM = vmlaq_f32(SUMM, YY[0], XX[0]); - SUMM = vmlaq_f32(SUMM, YY[1], XX[1]); - len -= 8; - } - - /* Work on 4 values */ - if (len >= 4) { - XX[0] = vld1q_f32(xi); - xi += 4; - YY[0] = vld1q_f32(yi); - yi += 4; - SUMM = vmlaq_f32(SUMM, YY[0], XX[0]); - len -= 4; - } - - /* Start accumulating results */ - SUMM_2[0] = vget_low_f32(SUMM); - if (len >= 2) { - /* While at it, consume 2 more values if available */ - XX_2 = vld1_f32(xi); - xi += 2; - YY_2 = vld1_f32(yi); - yi += 2; - SUMM_2[0] = vmla_f32(SUMM_2[0], YY_2, XX_2); - len -= 2; - } - SUMM_2[1] = vget_high_f32(SUMM); - SUMM_2[0] = vadd_f32(SUMM_2[0], SUMM_2[1]); - SUMM_2[0] = vpadd_f32(SUMM_2[0], SUMM_2[0]); - /* Ok, now we have result accumulated in SUMM_2[0].0 */ - - if (len > 0) { - /* Case when you have one value left */ - XX_2 = vld1_dup_f32(xi); - YY_2 = vld1_dup_f32(yi); - SUMM_2[0] = vmla_f32(SUMM_2[0], XX_2, YY_2); - } - - vst1_lane_f32(sum, SUMM_2[0], 0); -} - void celt_pitch_xcorr_float_neon(const opus_val16 *_x, const opus_val16 *_y, - opus_val32 *xcorr, int len, int max_pitch) { + opus_val32 *xcorr, int len, int max_pitch, int arch) { int i; + (void)arch; celt_assert(max_pitch > 0); - celt_assert((((unsigned char *)_x-(unsigned char *)NULL)&3)==0); + celt_sig_assert((((unsigned char *)_x-(unsigned char *)NULL)&3)==0); for (i = 0; i < (max_pitch-3); i += 4) { xcorr_kernel_neon_float((const float32_t *)_x, (const float32_t *)_y+i, (float32_t *)xcorr+i, len); } - /* In case max_pitch isn't multiple of 4 - * compute single correlation value per iteration - */ + /* In case max_pitch isn't a multiple of 4, do non-unrolled version. */ for (; i < max_pitch; i++) { - xcorr_kernel_neon_float_process1((const float32_t *)_x, - (const float32_t *)_y+i, (float32_t *)xcorr+i, len); + xcorr[i] = celt_inner_prod_neon(_x, _y+i, len); } } #endif diff --git a/thirdparty/opus/celt/arm/celt_pitch_xcorr_arm-gnu.S b/thirdparty/opus/celt/arm/celt_pitch_xcorr_arm-gnu.S index 5b2ee55a10..10668e54a5 100644 --- a/thirdparty/opus/celt/arm/celt_pitch_xcorr_arm-gnu.S +++ b/thirdparty/opus/celt/arm/celt_pitch_xcorr_arm-gnu.S @@ -44,7 +44,7 @@ .if OPUS_ARM_MAY_HAVE_NEON @ Compute sum[k]=sum(x[j]*y[j+k],j=0...len-1), k=0...3 -; xcorr_kernel_neon: @ PROC + .type xcorr_kernel_neon, %function; xcorr_kernel_neon: @ PROC xcorr_kernel_neon_start: @ input: @ r3 = int len @@ -156,8 +156,8 @@ xcorr_kernel_neon_process1: .size xcorr_kernel_neon, .-xcorr_kernel_neon @ ENDP @ opus_val32 celt_pitch_xcorr_neon(opus_val16 *_x, opus_val16 *_y, -@ opus_val32 *xcorr, int len, int max_pitch) -; celt_pitch_xcorr_neon: @ PROC +@ opus_val32 *xcorr, int len, int max_pitch, int arch) + .type celt_pitch_xcorr_neon, %function; celt_pitch_xcorr_neon: @ PROC @ input: @ r0 = opus_val16 *_x @ r1 = opus_val16 *_y @@ -171,6 +171,8 @@ xcorr_kernel_neon_process1: @ r6 = int max_pitch @ r12 = int j @ q15 = int maxcorr[4] (q15 is not used by xcorr_kernel_neon()) + @ ignored: + @ int arch STMFD sp!, {r4-r6, lr} LDR r6, [sp, #16] VMOV.S32 q15, #1 @@ -260,7 +262,7 @@ celt_pitch_xcorr_neon_done: @ This will get used on ARMv7 devices without NEON, so it has been optimized @ to take advantage of dual-issuing where possible. -; xcorr_kernel_edsp: @ PROC + .type xcorr_kernel_edsp, %function; xcorr_kernel_edsp: @ PROC xcorr_kernel_edsp_start: @ input: @ r3 = int len @@ -344,7 +346,7 @@ xcorr_kernel_edsp_done: LDMFD sp!, {r2,r4,r5,pc} .size xcorr_kernel_edsp, .-xcorr_kernel_edsp @ ENDP -; celt_pitch_xcorr_edsp: @ PROC + .type celt_pitch_xcorr_edsp, %function; celt_pitch_xcorr_edsp: @ PROC @ input: @ r0 = opus_val16 *_x (must be 32-bit aligned) @ r1 = opus_val16 *_y (only needs to be 16-bit aligned) @@ -361,6 +363,8 @@ xcorr_kernel_edsp_done: @ r9 = opus_val32 sum3 @ r1 = int max_pitch @ r12 = int j + @ ignored: + @ int arch STMFD sp!, {r4-r11, lr} MOV r5, r1 LDR r1, [sp, #36] diff --git a/thirdparty/opus/celt/arm/celt_pitch_xcorr_arm.s b/thirdparty/opus/celt/arm/celt_pitch_xcorr_arm.s index f96e0a88bb..6e873afc37 100644 --- a/thirdparty/opus/celt/arm/celt_pitch_xcorr_arm.s +++ b/thirdparty/opus/celt/arm/celt_pitch_xcorr_arm.s @@ -153,7 +153,7 @@ xcorr_kernel_neon_process1 ENDP ; opus_val32 celt_pitch_xcorr_neon(opus_val16 *_x, opus_val16 *_y, -; opus_val32 *xcorr, int len, int max_pitch) +; opus_val32 *xcorr, int len, int max_pitch, int arch) celt_pitch_xcorr_neon PROC ; input: ; r0 = opus_val16 *_x @@ -168,6 +168,8 @@ celt_pitch_xcorr_neon PROC ; r6 = int max_pitch ; r12 = int j ; q15 = int maxcorr[4] (q15 is not used by xcorr_kernel_neon()) + ; ignored: + ; int arch STMFD sp!, {r4-r6, lr} LDR r6, [sp, #16] VMOV.S32 q15, #1 @@ -358,6 +360,8 @@ celt_pitch_xcorr_edsp PROC ; r9 = opus_val32 sum3 ; r1 = int max_pitch ; r12 = int j + ; ignored: + ; int arch STMFD sp!, {r4-r11, lr} MOV r5, r1 LDR r1, [sp, #36] diff --git a/thirdparty/opus/celt/arm/fft_arm.h b/thirdparty/opus/celt/arm/fft_arm.h index 0cb55d8e22..0b78175f3a 100644 --- a/thirdparty/opus/celt/arm/fft_arm.h +++ b/thirdparty/opus/celt/arm/fft_arm.h @@ -34,7 +34,6 @@ #if !defined(FFT_ARM_H) #define FFT_ARM_H -#include "config.h" #include "kiss_fft.h" #if defined(HAVE_ARM_NE10) diff --git a/thirdparty/opus/celt/arm/fixed_armv4.h b/thirdparty/opus/celt/arm/fixed_armv4.h index efb3b1896a..d84888a772 100644 --- a/thirdparty/opus/celt/arm/fixed_armv4.h +++ b/thirdparty/opus/celt/arm/fixed_armv4.h @@ -37,7 +37,7 @@ static OPUS_INLINE opus_val32 MULT16_32_Q16_armv4(opus_val16 a, opus_val32 b) "#MULT16_32_Q16\n\t" "smull %0, %1, %2, %3\n\t" : "=&r"(rd_lo), "=&r"(rd_hi) - : "%r"(b),"r"(a<<16) + : "%r"(b),"r"(SHL32(a,16)) ); return rd_hi; } @@ -54,10 +54,10 @@ static OPUS_INLINE opus_val32 MULT16_32_Q15_armv4(opus_val16 a, opus_val32 b) "#MULT16_32_Q15\n\t" "smull %0, %1, %2, %3\n\t" : "=&r"(rd_lo), "=&r"(rd_hi) - : "%r"(b), "r"(a<<16) + : "%r"(b), "r"(SHL32(a,16)) ); /*We intentionally don't OR in the high bit of rd_lo for speed.*/ - return rd_hi<<1; + return SHL32(rd_hi,1); } #define MULT16_32_Q15(a, b) (MULT16_32_Q15_armv4(a, b)) diff --git a/thirdparty/opus/celt/arm/fixed_armv5e.h b/thirdparty/opus/celt/arm/fixed_armv5e.h index 36a6321101..6bf73cbace 100644 --- a/thirdparty/opus/celt/arm/fixed_armv5e.h +++ b/thirdparty/opus/celt/arm/fixed_armv5e.h @@ -59,7 +59,7 @@ static OPUS_INLINE opus_val32 MULT16_32_Q15_armv5e(opus_val16 a, opus_val32 b) : "=r"(res) : "r"(b), "r"(a) ); - return res<<1; + return SHL32(res,1); } #define MULT16_32_Q15(a, b) (MULT16_32_Q15_armv5e(a, b)) @@ -76,7 +76,7 @@ static OPUS_INLINE opus_val32 MAC16_32_Q15_armv5e(opus_val32 c, opus_val16 a, "#MAC16_32_Q15\n\t" "smlawb %0, %1, %2, %3;\n" : "=r"(res) - : "r"(b<<1), "r"(a), "r"(c) + : "r"(SHL32(b,1)), "r"(a), "r"(c) ); return res; } diff --git a/thirdparty/opus/celt/arm/mdct_arm.h b/thirdparty/opus/celt/arm/mdct_arm.h index 49cbb44576..14200bac4b 100644 --- a/thirdparty/opus/celt/arm/mdct_arm.h +++ b/thirdparty/opus/celt/arm/mdct_arm.h @@ -33,7 +33,6 @@ #if !defined(MDCT_ARM_H) #define MDCT_ARM_H -#include "config.h" #include "mdct.h" #if defined(HAVE_ARM_NE10) diff --git a/thirdparty/opus/celt/arm/pitch_arm.h b/thirdparty/opus/celt/arm/pitch_arm.h index 14331169ee..bed8b04eac 100644 --- a/thirdparty/opus/celt/arm/pitch_arm.h +++ b/thirdparty/opus/celt/arm/pitch_arm.h @@ -30,11 +30,47 @@ # include "armcpu.h" +# if defined(OPUS_ARM_MAY_HAVE_NEON_INTR) +opus_val32 celt_inner_prod_neon(const opus_val16 *x, const opus_val16 *y, int N); +void dual_inner_prod_neon(const opus_val16 *x, const opus_val16 *y01, + const opus_val16 *y02, int N, opus_val32 *xy1, opus_val32 *xy2); + +# if !defined(OPUS_HAVE_RTCD) && defined(OPUS_ARM_PRESUME_NEON) +# define OVERRIDE_CELT_INNER_PROD (1) +# define OVERRIDE_DUAL_INNER_PROD (1) +# define celt_inner_prod(x, y, N, arch) ((void)(arch), PRESUME_NEON(celt_inner_prod)(x, y, N)) +# define dual_inner_prod(x, y01, y02, N, xy1, xy2, arch) ((void)(arch), PRESUME_NEON(dual_inner_prod)(x, y01, y02, N, xy1, xy2)) +# endif +# endif + +# if !defined(OVERRIDE_CELT_INNER_PROD) +# if defined(OPUS_HAVE_RTCD) && (defined(OPUS_ARM_MAY_HAVE_NEON_INTR) && !defined(OPUS_ARM_PRESUME_NEON_INTR)) +extern opus_val32 (*const CELT_INNER_PROD_IMPL[OPUS_ARCHMASK+1])(const opus_val16 *x, const opus_val16 *y, int N); +# define OVERRIDE_CELT_INNER_PROD (1) +# define celt_inner_prod(x, y, N, arch) ((*CELT_INNER_PROD_IMPL[(arch)&OPUS_ARCHMASK])(x, y, N)) +# elif defined(OPUS_ARM_PRESUME_NEON_INTR) +# define OVERRIDE_CELT_INNER_PROD (1) +# define celt_inner_prod(x, y, N, arch) ((void)(arch), celt_inner_prod_neon(x, y, N)) +# endif +# endif + +# if !defined(OVERRIDE_DUAL_INNER_PROD) +# if defined(OPUS_HAVE_RTCD) && (defined(OPUS_ARM_MAY_HAVE_NEON_INTR) && !defined(OPUS_ARM_PRESUME_NEON_INTR)) +extern void (*const DUAL_INNER_PROD_IMPL[OPUS_ARCHMASK+1])(const opus_val16 *x, + const opus_val16 *y01, const opus_val16 *y02, int N, opus_val32 *xy1, opus_val32 *xy2); +# define OVERRIDE_DUAL_INNER_PROD (1) +# define dual_inner_prod(x, y01, y02, N, xy1, xy2, arch) ((*DUAL_INNER_PROD_IMPL[(arch)&OPUS_ARCHMASK])(x, y01, y02, N, xy1, xy2)) +# elif defined(OPUS_ARM_PRESUME_NEON_INTR) +# define OVERRIDE_DUAL_INNER_PROD (1) +# define dual_inner_prod(x, y01, y02, N, xy1, xy2, arch) ((void)(arch), dual_inner_prod_neon(x, y01, y02, N, xy1, xy2)) +# endif +# endif + # if defined(FIXED_POINT) # if defined(OPUS_ARM_MAY_HAVE_NEON) opus_val32 celt_pitch_xcorr_neon(const opus_val16 *_x, const opus_val16 *_y, - opus_val32 *xcorr, int len, int max_pitch); + opus_val32 *xcorr, int len, int max_pitch, int arch); # endif # if defined(OPUS_ARM_MAY_HAVE_MEDIA) @@ -43,7 +79,7 @@ opus_val32 celt_pitch_xcorr_neon(const opus_val16 *_x, const opus_val16 *_y, # if defined(OPUS_ARM_MAY_HAVE_EDSP) opus_val32 celt_pitch_xcorr_edsp(const opus_val16 *_x, const opus_val16 *_y, - opus_val32 *xcorr, int len, int max_pitch); + opus_val32 *xcorr, int len, int max_pitch, int arch); # endif # if defined(OPUS_HAVE_RTCD) && \ @@ -52,18 +88,17 @@ opus_val32 celt_pitch_xcorr_edsp(const opus_val16 *_x, const opus_val16 *_y, (defined(OPUS_ARM_MAY_HAVE_EDSP) && !defined(OPUS_ARM_PRESUME_EDSP))) extern opus_val32 (*const CELT_PITCH_XCORR_IMPL[OPUS_ARCHMASK+1])(const opus_val16 *, - const opus_val16 *, opus_val32 *, int, int); + const opus_val16 *, opus_val32 *, int, int, int); # define OVERRIDE_PITCH_XCORR (1) # define celt_pitch_xcorr(_x, _y, xcorr, len, max_pitch, arch) \ ((*CELT_PITCH_XCORR_IMPL[(arch)&OPUS_ARCHMASK])(_x, _y, \ - xcorr, len, max_pitch)) + xcorr, len, max_pitch, arch)) # elif defined(OPUS_ARM_PRESUME_EDSP) || \ defined(OPUS_ARM_PRESUME_MEDIA) || \ defined(OPUS_ARM_PRESUME_NEON) # define OVERRIDE_PITCH_XCORR (1) -# define celt_pitch_xcorr(_x, _y, xcorr, len, max_pitch, arch) \ - ((void)(arch),PRESUME_NEON(celt_pitch_xcorr)(_x, _y, xcorr, len, max_pitch)) +# define celt_pitch_xcorr (PRESUME_NEON(celt_pitch_xcorr)) # endif @@ -99,25 +134,24 @@ extern void (*const XCORR_KERNEL_IMPL[OPUS_ARCHMASK + 1])( /* Float case */ #if defined(OPUS_ARM_MAY_HAVE_NEON_INTR) void celt_pitch_xcorr_float_neon(const opus_val16 *_x, const opus_val16 *_y, - opus_val32 *xcorr, int len, int max_pitch); + opus_val32 *xcorr, int len, int max_pitch, int arch); #endif # if defined(OPUS_HAVE_RTCD) && \ (defined(OPUS_ARM_MAY_HAVE_NEON_INTR) && !defined(OPUS_ARM_PRESUME_NEON_INTR)) extern void (*const CELT_PITCH_XCORR_IMPL[OPUS_ARCHMASK+1])(const opus_val16 *, - const opus_val16 *, opus_val32 *, int, int); + const opus_val16 *, opus_val32 *, int, int, int); # define OVERRIDE_PITCH_XCORR (1) # define celt_pitch_xcorr(_x, _y, xcorr, len, max_pitch, arch) \ ((*CELT_PITCH_XCORR_IMPL[(arch)&OPUS_ARCHMASK])(_x, _y, \ - xcorr, len, max_pitch)) + xcorr, len, max_pitch, arch)) # elif defined(OPUS_ARM_PRESUME_NEON_INTR) # define OVERRIDE_PITCH_XCORR (1) -# define celt_pitch_xcorr(_x, _y, xcorr, len, max_pitch, arch) \ - ((void)(arch),celt_pitch_xcorr_float_neon(_x, _y, xcorr, len, max_pitch)) +# define celt_pitch_xcorr celt_pitch_xcorr_float_neon # endif diff --git a/thirdparty/opus/celt/arm/pitch_neon_intr.c b/thirdparty/opus/celt/arm/pitch_neon_intr.c new file mode 100644 index 0000000000..1ac38c433a --- /dev/null +++ b/thirdparty/opus/celt/arm/pitch_neon_intr.c @@ -0,0 +1,290 @@ +/*********************************************************************** +Copyright (c) 2017 Google Inc. +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +- Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. +- Neither the name of Internet Society, IETF or IETF Trust, nor the +names of specific contributors, may be used to endorse or promote +products derived from this software without specific prior written +permission. +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. +***********************************************************************/ + +#ifdef HAVE_CONFIG_H +#include "config.h" +#endif + +#include <arm_neon.h> +#include "pitch.h" + +#ifdef FIXED_POINT + +opus_val32 celt_inner_prod_neon(const opus_val16 *x, const opus_val16 *y, int N) +{ + int i; + opus_val32 xy; + int16x8_t x_s16x8, y_s16x8; + int32x4_t xy_s32x4 = vdupq_n_s32(0); + int64x2_t xy_s64x2; + int64x1_t xy_s64x1; + + for (i = 0; i < N - 7; i += 8) { + x_s16x8 = vld1q_s16(&x[i]); + y_s16x8 = vld1q_s16(&y[i]); + xy_s32x4 = vmlal_s16(xy_s32x4, vget_low_s16 (x_s16x8), vget_low_s16 (y_s16x8)); + xy_s32x4 = vmlal_s16(xy_s32x4, vget_high_s16(x_s16x8), vget_high_s16(y_s16x8)); + } + + if (N - i >= 4) { + const int16x4_t x_s16x4 = vld1_s16(&x[i]); + const int16x4_t y_s16x4 = vld1_s16(&y[i]); + xy_s32x4 = vmlal_s16(xy_s32x4, x_s16x4, y_s16x4); + i += 4; + } + + xy_s64x2 = vpaddlq_s32(xy_s32x4); + xy_s64x1 = vadd_s64(vget_low_s64(xy_s64x2), vget_high_s64(xy_s64x2)); + xy = vget_lane_s32(vreinterpret_s32_s64(xy_s64x1), 0); + + for (; i < N; i++) { + xy = MAC16_16(xy, x[i], y[i]); + } + +#ifdef OPUS_CHECK_ASM + celt_assert(celt_inner_prod_c(x, y, N) == xy); +#endif + + return xy; +} + +void dual_inner_prod_neon(const opus_val16 *x, const opus_val16 *y01, const opus_val16 *y02, + int N, opus_val32 *xy1, opus_val32 *xy2) +{ + int i; + opus_val32 xy01, xy02; + int16x8_t x_s16x8, y01_s16x8, y02_s16x8; + int32x4_t xy01_s32x4 = vdupq_n_s32(0); + int32x4_t xy02_s32x4 = vdupq_n_s32(0); + int64x2_t xy01_s64x2, xy02_s64x2; + int64x1_t xy01_s64x1, xy02_s64x1; + + for (i = 0; i < N - 7; i += 8) { + x_s16x8 = vld1q_s16(&x[i]); + y01_s16x8 = vld1q_s16(&y01[i]); + y02_s16x8 = vld1q_s16(&y02[i]); + xy01_s32x4 = vmlal_s16(xy01_s32x4, vget_low_s16 (x_s16x8), vget_low_s16 (y01_s16x8)); + xy02_s32x4 = vmlal_s16(xy02_s32x4, vget_low_s16 (x_s16x8), vget_low_s16 (y02_s16x8)); + xy01_s32x4 = vmlal_s16(xy01_s32x4, vget_high_s16(x_s16x8), vget_high_s16(y01_s16x8)); + xy02_s32x4 = vmlal_s16(xy02_s32x4, vget_high_s16(x_s16x8), vget_high_s16(y02_s16x8)); + } + + if (N - i >= 4) { + const int16x4_t x_s16x4 = vld1_s16(&x[i]); + const int16x4_t y01_s16x4 = vld1_s16(&y01[i]); + const int16x4_t y02_s16x4 = vld1_s16(&y02[i]); + xy01_s32x4 = vmlal_s16(xy01_s32x4, x_s16x4, y01_s16x4); + xy02_s32x4 = vmlal_s16(xy02_s32x4, x_s16x4, y02_s16x4); + i += 4; + } + + xy01_s64x2 = vpaddlq_s32(xy01_s32x4); + xy02_s64x2 = vpaddlq_s32(xy02_s32x4); + xy01_s64x1 = vadd_s64(vget_low_s64(xy01_s64x2), vget_high_s64(xy01_s64x2)); + xy02_s64x1 = vadd_s64(vget_low_s64(xy02_s64x2), vget_high_s64(xy02_s64x2)); + xy01 = vget_lane_s32(vreinterpret_s32_s64(xy01_s64x1), 0); + xy02 = vget_lane_s32(vreinterpret_s32_s64(xy02_s64x1), 0); + + for (; i < N; i++) { + xy01 = MAC16_16(xy01, x[i], y01[i]); + xy02 = MAC16_16(xy02, x[i], y02[i]); + } + *xy1 = xy01; + *xy2 = xy02; + +#ifdef OPUS_CHECK_ASM + { + opus_val32 xy1_c, xy2_c; + dual_inner_prod_c(x, y01, y02, N, &xy1_c, &xy2_c); + celt_assert(xy1_c == *xy1); + celt_assert(xy2_c == *xy2); + } +#endif +} + +#else /* !FIXED_POINT */ + +/* ========================================================================== */ + +#ifdef OPUS_CHECK_ASM + +/* This part of code simulates floating-point NEON operations. */ + +/* celt_inner_prod_neon_float_c_simulation() simulates the floating-point */ +/* operations of celt_inner_prod_neon(), and both functions should have bit */ +/* exact output. */ +static opus_val32 celt_inner_prod_neon_float_c_simulation(const opus_val16 *x, const opus_val16 *y, int N) +{ + int i; + opus_val32 xy, xy0 = 0, xy1 = 0, xy2 = 0, xy3 = 0; + for (i = 0; i < N - 3; i += 4) { + xy0 = MAC16_16(xy0, x[i + 0], y[i + 0]); + xy1 = MAC16_16(xy1, x[i + 1], y[i + 1]); + xy2 = MAC16_16(xy2, x[i + 2], y[i + 2]); + xy3 = MAC16_16(xy3, x[i + 3], y[i + 3]); + } + xy0 += xy2; + xy1 += xy3; + xy = xy0 + xy1; + for (; i < N; i++) { + xy = MAC16_16(xy, x[i], y[i]); + } + return xy; +} + +/* dual_inner_prod_neon_float_c_simulation() simulates the floating-point */ +/* operations of dual_inner_prod_neon(), and both functions should have bit */ +/* exact output. */ +static void dual_inner_prod_neon_float_c_simulation(const opus_val16 *x, const opus_val16 *y01, const opus_val16 *y02, + int N, opus_val32 *xy1, opus_val32 *xy2) +{ + int i; + opus_val32 xy01, xy02, xy01_0 = 0, xy01_1 = 0, xy01_2 = 0, xy01_3 = 0, xy02_0 = 0, xy02_1 = 0, xy02_2 = 0, xy02_3 = 0; + for (i = 0; i < N - 3; i += 4) { + xy01_0 = MAC16_16(xy01_0, x[i + 0], y01[i + 0]); + xy01_1 = MAC16_16(xy01_1, x[i + 1], y01[i + 1]); + xy01_2 = MAC16_16(xy01_2, x[i + 2], y01[i + 2]); + xy01_3 = MAC16_16(xy01_3, x[i + 3], y01[i + 3]); + xy02_0 = MAC16_16(xy02_0, x[i + 0], y02[i + 0]); + xy02_1 = MAC16_16(xy02_1, x[i + 1], y02[i + 1]); + xy02_2 = MAC16_16(xy02_2, x[i + 2], y02[i + 2]); + xy02_3 = MAC16_16(xy02_3, x[i + 3], y02[i + 3]); + } + xy01_0 += xy01_2; + xy02_0 += xy02_2; + xy01_1 += xy01_3; + xy02_1 += xy02_3; + xy01 = xy01_0 + xy01_1; + xy02 = xy02_0 + xy02_1; + for (; i < N; i++) { + xy01 = MAC16_16(xy01, x[i], y01[i]); + xy02 = MAC16_16(xy02, x[i], y02[i]); + } + *xy1 = xy01; + *xy2 = xy02; +} + +#endif /* OPUS_CHECK_ASM */ + +/* ========================================================================== */ + +opus_val32 celt_inner_prod_neon(const opus_val16 *x, const opus_val16 *y, int N) +{ + int i; + opus_val32 xy; + float32x4_t xy_f32x4 = vdupq_n_f32(0); + float32x2_t xy_f32x2; + + for (i = 0; i < N - 7; i += 8) { + float32x4_t x_f32x4, y_f32x4; + x_f32x4 = vld1q_f32(&x[i]); + y_f32x4 = vld1q_f32(&y[i]); + xy_f32x4 = vmlaq_f32(xy_f32x4, x_f32x4, y_f32x4); + x_f32x4 = vld1q_f32(&x[i + 4]); + y_f32x4 = vld1q_f32(&y[i + 4]); + xy_f32x4 = vmlaq_f32(xy_f32x4, x_f32x4, y_f32x4); + } + + if (N - i >= 4) { + const float32x4_t x_f32x4 = vld1q_f32(&x[i]); + const float32x4_t y_f32x4 = vld1q_f32(&y[i]); + xy_f32x4 = vmlaq_f32(xy_f32x4, x_f32x4, y_f32x4); + i += 4; + } + + xy_f32x2 = vadd_f32(vget_low_f32(xy_f32x4), vget_high_f32(xy_f32x4)); + xy_f32x2 = vpadd_f32(xy_f32x2, xy_f32x2); + xy = vget_lane_f32(xy_f32x2, 0); + + for (; i < N; i++) { + xy = MAC16_16(xy, x[i], y[i]); + } + +#ifdef OPUS_CHECK_ASM + celt_assert(ABS32(celt_inner_prod_neon_float_c_simulation(x, y, N) - xy) <= VERY_SMALL); +#endif + + return xy; +} + +void dual_inner_prod_neon(const opus_val16 *x, const opus_val16 *y01, const opus_val16 *y02, + int N, opus_val32 *xy1, opus_val32 *xy2) +{ + int i; + opus_val32 xy01, xy02; + float32x4_t xy01_f32x4 = vdupq_n_f32(0); + float32x4_t xy02_f32x4 = vdupq_n_f32(0); + float32x2_t xy01_f32x2, xy02_f32x2; + + for (i = 0; i < N - 7; i += 8) { + float32x4_t x_f32x4, y01_f32x4, y02_f32x4; + x_f32x4 = vld1q_f32(&x[i]); + y01_f32x4 = vld1q_f32(&y01[i]); + y02_f32x4 = vld1q_f32(&y02[i]); + xy01_f32x4 = vmlaq_f32(xy01_f32x4, x_f32x4, y01_f32x4); + xy02_f32x4 = vmlaq_f32(xy02_f32x4, x_f32x4, y02_f32x4); + x_f32x4 = vld1q_f32(&x[i + 4]); + y01_f32x4 = vld1q_f32(&y01[i + 4]); + y02_f32x4 = vld1q_f32(&y02[i + 4]); + xy01_f32x4 = vmlaq_f32(xy01_f32x4, x_f32x4, y01_f32x4); + xy02_f32x4 = vmlaq_f32(xy02_f32x4, x_f32x4, y02_f32x4); + } + + if (N - i >= 4) { + const float32x4_t x_f32x4 = vld1q_f32(&x[i]); + const float32x4_t y01_f32x4 = vld1q_f32(&y01[i]); + const float32x4_t y02_f32x4 = vld1q_f32(&y02[i]); + xy01_f32x4 = vmlaq_f32(xy01_f32x4, x_f32x4, y01_f32x4); + xy02_f32x4 = vmlaq_f32(xy02_f32x4, x_f32x4, y02_f32x4); + i += 4; + } + + xy01_f32x2 = vadd_f32(vget_low_f32(xy01_f32x4), vget_high_f32(xy01_f32x4)); + xy02_f32x2 = vadd_f32(vget_low_f32(xy02_f32x4), vget_high_f32(xy02_f32x4)); + xy01_f32x2 = vpadd_f32(xy01_f32x2, xy01_f32x2); + xy02_f32x2 = vpadd_f32(xy02_f32x2, xy02_f32x2); + xy01 = vget_lane_f32(xy01_f32x2, 0); + xy02 = vget_lane_f32(xy02_f32x2, 0); + + for (; i < N; i++) { + xy01 = MAC16_16(xy01, x[i], y01[i]); + xy02 = MAC16_16(xy02, x[i], y02[i]); + } + *xy1 = xy01; + *xy2 = xy02; + +#ifdef OPUS_CHECK_ASM + { + opus_val32 xy1_c, xy2_c; + dual_inner_prod_neon_float_c_simulation(x, y01, y02, N, &xy1_c, &xy2_c); + celt_assert(ABS32(xy1_c - *xy1) <= VERY_SMALL); + celt_assert(ABS32(xy2_c - *xy2) <= VERY_SMALL); + } +#endif +} + +#endif /* FIXED_POINT */ diff --git a/thirdparty/opus/celt/bands.c b/thirdparty/opus/celt/bands.c index 87eaa6c031..2702963c37 100644 --- a/thirdparty/opus/celt/bands.c +++ b/thirdparty/opus/celt/bands.c @@ -65,19 +65,19 @@ opus_uint32 celt_lcg_rand(opus_uint32 seed) /* This is a cos() approximation designed to be bit-exact on any platform. Bit exactness with this approximation is important because it has an impact on the bit allocation */ -static opus_int16 bitexact_cos(opus_int16 x) +opus_int16 bitexact_cos(opus_int16 x) { opus_int32 tmp; opus_int16 x2; tmp = (4096+((opus_int32)(x)*(x)))>>13; - celt_assert(tmp<=32767); + celt_sig_assert(tmp<=32767); x2 = tmp; x2 = (32767-x2) + FRAC_MUL16(x2, (-7651 + FRAC_MUL16(x2, (8277 + FRAC_MUL16(-626, x2))))); - celt_assert(x2<=32766); + celt_sig_assert(x2<=32766); return 1+x2; } -static int bitexact_log2tan(int isin,int icos) +int bitexact_log2tan(int isin,int icos) { int lc; int ls; @@ -92,10 +92,11 @@ static int bitexact_log2tan(int isin,int icos) #ifdef FIXED_POINT /* Compute the amplitude (sqrt energy) in each of the bands */ -void compute_band_energies(const CELTMode *m, const celt_sig *X, celt_ener *bandE, int end, int C, int LM) +void compute_band_energies(const CELTMode *m, const celt_sig *X, celt_ener *bandE, int end, int C, int LM, int arch) { int i, c, N; const opus_int16 *eBands = m->eBands; + (void)arch; N = m->shortMdctSize<<LM; c=0; do { for (i=0;i<end;i++) @@ -155,7 +156,7 @@ void normalise_bands(const CELTMode *m, const celt_sig * OPUS_RESTRICT freq, cel #else /* FIXED_POINT */ /* Compute the amplitude (sqrt energy) in each of the bands */ -void compute_band_energies(const CELTMode *m, const celt_sig *X, celt_ener *bandE, int end, int C, int LM) +void compute_band_energies(const CELTMode *m, const celt_sig *X, celt_ener *bandE, int end, int C, int LM, int arch) { int i, c, N; const opus_int16 *eBands = m->eBands; @@ -164,7 +165,7 @@ void compute_band_energies(const CELTMode *m, const celt_sig *X, celt_ener *band for (i=0;i<end;i++) { opus_val32 sum; - sum = 1e-27f + celt_inner_prod_c(&X[c*N+(eBands[i]<<LM)], &X[c*N+(eBands[i]<<LM)], (eBands[i+1]-eBands[i])<<LM); + sum = 1e-27f + celt_inner_prod(&X[c*N+(eBands[i]<<LM)], &X[c*N+(eBands[i]<<LM)], (eBands[i+1]-eBands[i])<<LM, arch); bandE[i+c*m->nbEBands] = celt_sqrt(sum); /*printf ("%f ", bandE[i+c*m->nbEBands]);*/ } @@ -224,9 +225,9 @@ void denormalise_bands(const CELTMode *m, const celt_norm * OPUS_RESTRICT X, #endif j=M*eBands[i]; band_end = M*eBands[i+1]; - lg = ADD16(bandLogE[i], SHL16((opus_val16)eMeans[i],6)); + lg = SATURATE16(ADD32(bandLogE[i], SHL32((opus_val32)eMeans[i],6))); #ifndef FIXED_POINT - g = celt_exp2(lg); + g = celt_exp2(MIN32(32.f, lg)); #else /* Handle the integer part of the log energy */ shift = 16-(lg>>DB_SHIFT); @@ -241,12 +242,12 @@ void denormalise_bands(const CELTMode *m, const celt_norm * OPUS_RESTRICT X, /* Handle extreme gains with negative shift. */ if (shift<0) { - /* For shift < -2 we'd be likely to overflow, so we're capping - the gain here. This shouldn't happen unless the bitstream is - already corrupted. */ - if (shift < -2) + /* For shift <= -2 and g > 16384 we'd be likely to overflow, so we're + capping the gain here, which is equivalent to a cap of 18 on lg. + This shouldn't trigger unless the bitstream is already corrupted. */ + if (shift <= -2) { - g = 32767; + g = 16384; shift = -2; } do { @@ -281,7 +282,7 @@ void anti_collapse(const CELTMode *m, celt_norm *X_, unsigned char *collapse_mas N0 = m->eBands[i+1]-m->eBands[i]; /* depth in 1/8 bits */ - celt_assert(pulses[i]>=0); + celt_sig_assert(pulses[i]>=0); depth = celt_udiv(1+pulses[i], (m->eBands[i+1]-m->eBands[i]))>>LM; #ifdef FIXED_POINT @@ -360,6 +361,30 @@ void anti_collapse(const CELTMode *m, celt_norm *X_, unsigned char *collapse_mas } } +/* Compute the weights to use for optimizing normalized distortion across + channels. We use the amplitude to weight square distortion, which means + that we use the square root of the value we would have been using if we + wanted to minimize the MSE in the non-normalized domain. This roughly + corresponds to some quick-and-dirty perceptual experiments I ran to + measure inter-aural masking (there doesn't seem to be any published data + on the topic). */ +static void compute_channel_weights(celt_ener Ex, celt_ener Ey, opus_val16 w[2]) +{ + celt_ener minE; +#ifdef FIXED_POINT + int shift; +#endif + minE = MIN32(Ex, Ey); + /* Adjustment to make the weights a bit more conservative. */ + Ex = ADD32(Ex, minE/3); + Ey = ADD32(Ey, minE/3); +#ifdef FIXED_POINT + shift = celt_ilog2(EPSILON+MAX32(Ex, Ey))-14; +#endif + w[0] = VSHR32(Ex, shift); + w[1] = VSHR32(Ey, shift); +} + static void intensity_stereo(const CELTMode *m, celt_norm * OPUS_RESTRICT X, const celt_norm * OPUS_RESTRICT Y, const celt_ener *bandE, int bandID, int N) { int i = bandID; @@ -453,7 +478,7 @@ static void stereo_merge(celt_norm * OPUS_RESTRICT X, celt_norm * OPUS_RESTRICT /* Decide whether we should spread the pulses in the current frame */ int spreading_decision(const CELTMode *m, const celt_norm *X, int *average, int last_decision, int *hf_average, int *tapset_decision, int update_hf, - int end, int C, int M) + int end, int C, int M, const int *spread_weight) { int i, c, N0; int sum = 0, nbBands=0; @@ -494,8 +519,8 @@ int spreading_decision(const CELTMode *m, const celt_norm *X, int *average, if (i>m->nbEBands-4) hf_sum += celt_udiv(32*(tcount[1]+tcount[0]), N); tmp = (2*tcount[2] >= N) + (2*tcount[1] >= N) + (2*tcount[0] >= N); - sum += tmp*256; - nbBands++; + sum += tmp*spread_weight[i]; + nbBands+=spread_weight[i]; } } while (++c<C); @@ -519,7 +544,7 @@ int spreading_decision(const CELTMode *m, const celt_norm *X, int *average, /*printf("%d %d %d\n", hf_sum, *hf_average, *tapset_decision);*/ celt_assert(nbBands>0); /* end has to be non-zero */ celt_assert(sum>=0); - sum = celt_udiv(sum, nbBands); + sum = celt_udiv((opus_int32)sum<<8, nbBands); /* Recursive averaging */ sum = (sum+*average)>>1; *average = sum; @@ -647,6 +672,7 @@ static int compute_qn(int N, int b, int offset, int pulse_cap, int stereo) struct band_ctx { int encode; + int resynth; const CELTMode *m; int i; int intensity; @@ -657,6 +683,9 @@ struct band_ctx { const celt_ener *bandE; opus_uint32 seed; int arch; + int theta_round; + int disable_inv; + int avoid_split_noise; }; struct split_ctx { @@ -714,8 +743,35 @@ static void compute_theta(struct band_ctx *ctx, struct split_ctx *sctx, if (qn!=1) { if (encode) - itheta = (itheta*(opus_int32)qn+8192)>>14; - + { + if (!stereo || ctx->theta_round == 0) + { + itheta = (itheta*(opus_int32)qn+8192)>>14; + if (!stereo && ctx->avoid_split_noise && itheta > 0 && itheta < qn) + { + /* Check if the selected value of theta will cause the bit allocation + to inject noise on one side. If so, make sure the energy of that side + is zero. */ + int unquantized = celt_udiv((opus_int32)itheta*16384, qn); + imid = bitexact_cos((opus_int16)unquantized); + iside = bitexact_cos((opus_int16)(16384-unquantized)); + delta = FRAC_MUL16((N-1)<<7,bitexact_log2tan(iside,imid)); + if (delta > *b) + itheta = qn; + else if (delta < -*b) + itheta = 0; + } + } else { + int down; + /* Bias quantization towards itheta=0 and itheta=16384. */ + int bias = itheta > 8192 ? 32767/qn : -32767/qn; + down = IMIN(qn-1, IMAX(0, (itheta*(opus_int32)qn + bias)>>14)); + if (ctx->theta_round < 0) + itheta = down; + else + itheta = down+1; + } + } /* Entropy coding of the angle. We use a uniform pdf for the time split, a step for stereo, and a triangular one for the rest. */ if (stereo && N>2) @@ -793,7 +849,7 @@ static void compute_theta(struct band_ctx *ctx, struct split_ctx *sctx, } else if (stereo) { if (encode) { - inv = itheta > 8192; + inv = itheta > 8192 && !ctx->disable_inv; if (inv) { int j; @@ -810,6 +866,9 @@ static void compute_theta(struct band_ctx *ctx, struct split_ctx *sctx, inv = ec_dec_bit_logp(ec, 2); } else inv = 0; + /* inv flag override to avoid problems with downmixing. */ + if (ctx->disable_inv) + inv = 0; itheta = 0; } qalloc = ec_tell_frac(ec) - tell; @@ -845,11 +904,6 @@ static void compute_theta(struct band_ctx *ctx, struct split_ctx *sctx, static unsigned quant_band_n1(struct band_ctx *ctx, celt_norm *X, celt_norm *Y, int b, celt_norm *lowband_out) { -#ifdef RESYNTH - int resynth = 1; -#else - int resynth = !ctx->encode; -#endif int c; int stereo; celt_norm *x = X; @@ -874,7 +928,7 @@ static unsigned quant_band_n1(struct band_ctx *ctx, celt_norm *X, celt_norm *Y, ctx->remaining_bits -= 1<<BITRES; b-=1<<BITRES; } - if (resynth) + if (ctx->resynth) x[0] = sign ? -NORM_SCALING : NORM_SCALING; x = Y; } while (++c<1+stereo); @@ -899,11 +953,6 @@ static unsigned quant_partition(struct band_ctx *ctx, celt_norm *X, int B0=B; opus_val16 mid=0, side=0; unsigned cm=0; -#ifdef RESYNTH - int resynth = 1; -#else - int resynth = !ctx->encode; -#endif celt_norm *Y=NULL; int encode; const CELTMode *m; @@ -935,8 +984,7 @@ static unsigned quant_partition(struct band_ctx *ctx, celt_norm *X, fill = (fill&1)|(fill<<1); B = (B+1)>>1; - compute_theta(ctx, &sctx, X, Y, N, &b, B, B0, - LM, 0, &fill); + compute_theta(ctx, &sctx, X, Y, N, &b, B, B0, LM, 0, &fill); imid = sctx.imid; iside = sctx.iside; delta = sctx.delta; @@ -970,24 +1018,20 @@ static unsigned quant_partition(struct band_ctx *ctx, celt_norm *X, rebalance = ctx->remaining_bits; if (mbits >= sbits) { - cm = quant_partition(ctx, X, N, mbits, B, - lowband, LM, + cm = quant_partition(ctx, X, N, mbits, B, lowband, LM, MULT16_16_P15(gain,mid), fill); rebalance = mbits - (rebalance-ctx->remaining_bits); if (rebalance > 3<<BITRES && itheta!=0) sbits += rebalance - (3<<BITRES); - cm |= quant_partition(ctx, Y, N, sbits, B, - next_lowband2, LM, + cm |= quant_partition(ctx, Y, N, sbits, B, next_lowband2, LM, MULT16_16_P15(gain,side), fill>>B)<<(B0>>1); } else { - cm = quant_partition(ctx, Y, N, sbits, B, - next_lowband2, LM, + cm = quant_partition(ctx, Y, N, sbits, B, next_lowband2, LM, MULT16_16_P15(gain,side), fill>>B)<<(B0>>1); rebalance = sbits - (rebalance-ctx->remaining_bits); if (rebalance > 3<<BITRES && itheta!=16384) mbits += rebalance - (3<<BITRES); - cm |= quant_partition(ctx, X, N, mbits, B, - lowband, LM, + cm |= quant_partition(ctx, X, N, mbits, B, lowband, LM, MULT16_16_P15(gain,mid), fill); } } else { @@ -1012,18 +1056,14 @@ static unsigned quant_partition(struct band_ctx *ctx, celt_norm *X, /* Finally do the actual quantization */ if (encode) { - cm = alg_quant(X, N, K, spread, B, ec -#ifdef RESYNTH - , gain -#endif - ); + cm = alg_quant(X, N, K, spread, B, ec, gain, ctx->resynth, ctx->arch); } else { cm = alg_unquant(X, N, K, spread, B, ec, gain); } } else { /* If there's no pulse, fill the band anyway */ int j; - if (resynth) + if (ctx->resynth) { unsigned cm_mask; /* B can be as large as 16, so this shift might overflow an int on a @@ -1080,11 +1120,6 @@ static unsigned quant_band(struct band_ctx *ctx, celt_norm *X, int recombine=0; int longBlocks; unsigned cm=0; -#ifdef RESYNTH - int resynth = 1; -#else - int resynth = !ctx->encode; -#endif int k; int encode; int tf_change; @@ -1151,11 +1186,10 @@ static unsigned quant_band(struct band_ctx *ctx, celt_norm *X, deinterleave_hadamard(lowband, N_B>>recombine, B0<<recombine, longBlocks); } - cm = quant_partition(ctx, X, N, b, B, lowband, - LM, gain, fill); + cm = quant_partition(ctx, X, N, b, B, lowband, LM, gain, fill); /* This code is used by the decoder and by the resynthesis-enabled encoder */ - if (resynth) + if (ctx->resynth) { /* Undo the sample reorganization going from time order to frequency order */ if (B0>1) @@ -1208,11 +1242,6 @@ static unsigned quant_band_stereo(struct band_ctx *ctx, celt_norm *X, celt_norm int inv = 0; opus_val16 mid=0, side=0; unsigned cm=0; -#ifdef RESYNTH - int resynth = 1; -#else - int resynth = !ctx->encode; -#endif int mbits, sbits, delta; int itheta; int qalloc; @@ -1232,8 +1261,7 @@ static unsigned quant_band_stereo(struct band_ctx *ctx, celt_norm *X, celt_norm orig_fill = fill; - compute_theta(ctx, &sctx, X, Y, N, &b, B, B, - LM, 1, &fill); + compute_theta(ctx, &sctx, X, Y, N, &b, B, B, LM, 1, &fill); inv = sctx.inv; imid = sctx.imid; iside = sctx.iside; @@ -1281,13 +1309,13 @@ static unsigned quant_band_stereo(struct band_ctx *ctx, celt_norm *X, celt_norm sign = 1-2*sign; /* We use orig_fill here because we want to fold the side, but if itheta==16384, we'll have cleared the low bits of fill. */ - cm = quant_band(ctx, x2, N, mbits, B, lowband, - LM, lowband_out, Q15ONE, lowband_scratch, orig_fill); + cm = quant_band(ctx, x2, N, mbits, B, lowband, LM, lowband_out, Q15ONE, + lowband_scratch, orig_fill); /* We don't split N=2 bands, so cm is either 1 or 0 (for a fold-collapse), and there's no need to worry about mixing with the other channel. */ y2[0] = -sign*x2[1]; y2[1] = sign*x2[0]; - if (resynth) + if (ctx->resynth) { celt_norm tmp; X[0] = MULT16_16_Q15(mid, X[0]); @@ -1314,38 +1342,32 @@ static unsigned quant_band_stereo(struct band_ctx *ctx, celt_norm *X, celt_norm { /* In stereo mode, we do not apply a scaling to the mid because we need the normalized mid for folding later. */ - cm = quant_band(ctx, X, N, mbits, B, - lowband, LM, lowband_out, - Q15ONE, lowband_scratch, fill); + cm = quant_band(ctx, X, N, mbits, B, lowband, LM, lowband_out, Q15ONE, + lowband_scratch, fill); rebalance = mbits - (rebalance-ctx->remaining_bits); if (rebalance > 3<<BITRES && itheta!=0) sbits += rebalance - (3<<BITRES); /* For a stereo split, the high bits of fill are always zero, so no folding will be done to the side. */ - cm |= quant_band(ctx, Y, N, sbits, B, - NULL, LM, NULL, - side, NULL, fill>>B); + cm |= quant_band(ctx, Y, N, sbits, B, NULL, LM, NULL, side, NULL, fill>>B); } else { /* For a stereo split, the high bits of fill are always zero, so no folding will be done to the side. */ - cm = quant_band(ctx, Y, N, sbits, B, - NULL, LM, NULL, - side, NULL, fill>>B); + cm = quant_band(ctx, Y, N, sbits, B, NULL, LM, NULL, side, NULL, fill>>B); rebalance = sbits - (rebalance-ctx->remaining_bits); if (rebalance > 3<<BITRES && itheta!=16384) mbits += rebalance - (3<<BITRES); /* In stereo mode, we do not apply a scaling to the mid because we need the normalized mid for folding later. */ - cm |= quant_band(ctx, X, N, mbits, B, - lowband, LM, lowband_out, - Q15ONE, lowband_scratch, fill); + cm |= quant_band(ctx, X, N, mbits, B, lowband, LM, lowband_out, Q15ONE, + lowband_scratch, fill); } } /* This code is used by the decoder and by the resynthesis-enabled encoder */ - if (resynth) + if (ctx->resynth) { if (N!=2) stereo_merge(X, Y, mid, N, ctx->arch); @@ -1359,19 +1381,38 @@ static unsigned quant_band_stereo(struct band_ctx *ctx, celt_norm *X, celt_norm return cm; } +static void special_hybrid_folding(const CELTMode *m, celt_norm *norm, celt_norm *norm2, int start, int M, int dual_stereo) +{ + int n1, n2; + const opus_int16 * OPUS_RESTRICT eBands = m->eBands; + n1 = M*(eBands[start+1]-eBands[start]); + n2 = M*(eBands[start+2]-eBands[start+1]); + /* Duplicate enough of the first band folding data to be able to fold the second band. + Copies no data for CELT-only mode. */ + OPUS_COPY(&norm[n1], &norm[2*n1 - n2], n2-n1); + if (dual_stereo) + OPUS_COPY(&norm2[n1], &norm2[2*n1 - n2], n2-n1); +} void quant_all_bands(int encode, const CELTMode *m, int start, int end, celt_norm *X_, celt_norm *Y_, unsigned char *collapse_masks, const celt_ener *bandE, int *pulses, int shortBlocks, int spread, int dual_stereo, int intensity, int *tf_res, opus_int32 total_bits, opus_int32 balance, ec_ctx *ec, int LM, int codedBands, - opus_uint32 *seed, int arch) + opus_uint32 *seed, int complexity, int arch, int disable_inv) { int i; opus_int32 remaining_bits; const opus_int16 * OPUS_RESTRICT eBands = m->eBands; celt_norm * OPUS_RESTRICT norm, * OPUS_RESTRICT norm2; VARDECL(celt_norm, _norm); + VARDECL(celt_norm, _lowband_scratch); + VARDECL(celt_norm, X_save); + VARDECL(celt_norm, Y_save); + VARDECL(celt_norm, X_save2); + VARDECL(celt_norm, Y_save2); + VARDECL(celt_norm, norm_save2); + int resynth_alloc; celt_norm *lowband_scratch; int B; int M; @@ -1379,10 +1420,11 @@ void quant_all_bands(int encode, const CELTMode *m, int start, int end, int update_lowband = 1; int C = Y_ != NULL ? 2 : 1; int norm_offset; + int theta_rdo = encode && Y_!=NULL && !dual_stereo && complexity>=8; #ifdef RESYNTH int resynth = 1; #else - int resynth = !encode; + int resynth = !encode || theta_rdo; #endif struct band_ctx ctx; SAVE_STACK; @@ -1395,9 +1437,24 @@ void quant_all_bands(int encode, const CELTMode *m, int start, int end, ALLOC(_norm, C*(M*eBands[m->nbEBands-1]-norm_offset), celt_norm); norm = _norm; norm2 = norm + M*eBands[m->nbEBands-1]-norm_offset; - /* We can use the last band as scratch space because we don't need that - scratch space for the last band. */ - lowband_scratch = X_+M*eBands[m->nbEBands-1]; + + /* For decoding, we can use the last band as scratch space because we don't need that + scratch space for the last band and we don't care about the data there until we're + decoding the last band. */ + if (encode && resynth) + resynth_alloc = M*(eBands[m->nbEBands]-eBands[m->nbEBands-1]); + else + resynth_alloc = ALLOC_NONE; + ALLOC(_lowband_scratch, resynth_alloc, celt_norm); + if (encode && resynth) + lowband_scratch = _lowband_scratch; + else + lowband_scratch = X_+M*eBands[m->nbEBands-1]; + ALLOC(X_save, resynth_alloc, celt_norm); + ALLOC(Y_save, resynth_alloc, celt_norm); + ALLOC(X_save2, resynth_alloc, celt_norm); + ALLOC(Y_save2, resynth_alloc, celt_norm); + ALLOC(norm_save2, resynth_alloc, celt_norm); lowband_offset = 0; ctx.bandE = bandE; @@ -1408,6 +1465,11 @@ void quant_all_bands(int encode, const CELTMode *m, int start, int end, ctx.seed = *seed; ctx.spread = spread; ctx.arch = arch; + ctx.disable_inv = disable_inv; + ctx.resynth = resynth; + ctx.theta_round = 0; + /* Avoid injecting noise in the first band on transients. */ + ctx.avoid_split_noise = B > 1; for (i=start;i<end;i++) { opus_int32 tell; @@ -1430,6 +1492,7 @@ void quant_all_bands(int encode, const CELTMode *m, int start, int end, else Y = NULL; N = M*eBands[i+1]-M*eBands[i]; + celt_assert(N > 0); tell = ec_tell_frac(ec); /* Compute how many bits we want to allocate to this band */ @@ -1445,8 +1508,15 @@ void quant_all_bands(int encode, const CELTMode *m, int start, int end, b = 0; } +#ifndef DISABLE_UPDATE_DRAFT + if (resynth && (M*eBands[i]-N >= M*eBands[start] || i==start+1) && (update_lowband || lowband_offset==0)) + lowband_offset = i; + if (i == start+1) + special_hybrid_folding(m, norm, norm2, start, M, dual_stereo); +#else if (resynth && M*eBands[i]-N >= M*eBands[start] && (update_lowband || lowband_offset==0)) lowband_offset = i; +#endif tf_change = tf_res[i]; ctx.tf_change = tf_change; @@ -1457,7 +1527,7 @@ void quant_all_bands(int encode, const CELTMode *m, int start, int end, Y = norm; lowband_scratch = NULL; } - if (i==end-1) + if (last && !theta_rdo) lowband_scratch = NULL; /* Get a conservative estimate of the collapse_mask's for the bands we're @@ -1472,7 +1542,11 @@ void quant_all_bands(int encode, const CELTMode *m, int start, int end, fold_start = lowband_offset; while(M*eBands[--fold_start] > effective_lowband+norm_offset); fold_end = lowband_offset-1; +#ifndef DISABLE_UPDATE_DRAFT + while(++fold_end < i && M*eBands[fold_end] < effective_lowband+norm_offset+N); +#else while(M*eBands[++fold_end] < effective_lowband+norm_offset+N); +#endif x_cm = y_cm = 0; fold_i = fold_start; do { x_cm |= collapse_masks[fold_i*C+0]; @@ -1505,13 +1579,79 @@ void quant_all_bands(int encode, const CELTMode *m, int start, int end, } else { if (Y!=NULL) { - x_cm = quant_band_stereo(&ctx, X, Y, N, b, B, - effective_lowband != -1 ? norm+effective_lowband : NULL, LM, - last?NULL:norm+M*eBands[i]-norm_offset, lowband_scratch, x_cm|y_cm); + if (theta_rdo && i < intensity) + { + ec_ctx ec_save, ec_save2; + struct band_ctx ctx_save, ctx_save2; + opus_val32 dist0, dist1; + unsigned cm, cm2; + int nstart_bytes, nend_bytes, save_bytes; + unsigned char *bytes_buf; + unsigned char bytes_save[1275]; + opus_val16 w[2]; + compute_channel_weights(bandE[i], bandE[i+m->nbEBands], w); + /* Make a copy. */ + cm = x_cm|y_cm; + ec_save = *ec; + ctx_save = ctx; + OPUS_COPY(X_save, X, N); + OPUS_COPY(Y_save, Y, N); + /* Encode and round down. */ + ctx.theta_round = -1; + x_cm = quant_band_stereo(&ctx, X, Y, N, b, B, + effective_lowband != -1 ? norm+effective_lowband : NULL, LM, + last?NULL:norm+M*eBands[i]-norm_offset, lowband_scratch, cm); + dist0 = MULT16_32_Q15(w[0], celt_inner_prod(X_save, X, N, arch)) + MULT16_32_Q15(w[1], celt_inner_prod(Y_save, Y, N, arch)); + + /* Save first result. */ + cm2 = x_cm; + ec_save2 = *ec; + ctx_save2 = ctx; + OPUS_COPY(X_save2, X, N); + OPUS_COPY(Y_save2, Y, N); + if (!last) + OPUS_COPY(norm_save2, norm+M*eBands[i]-norm_offset, N); + nstart_bytes = ec_save.offs; + nend_bytes = ec_save.storage; + bytes_buf = ec_save.buf+nstart_bytes; + save_bytes = nend_bytes-nstart_bytes; + OPUS_COPY(bytes_save, bytes_buf, save_bytes); + + /* Restore */ + *ec = ec_save; + ctx = ctx_save; + OPUS_COPY(X, X_save, N); + OPUS_COPY(Y, Y_save, N); +#ifndef DISABLE_UPDATE_DRAFT + if (i == start+1) + special_hybrid_folding(m, norm, norm2, start, M, dual_stereo); +#endif + /* Encode and round up. */ + ctx.theta_round = 1; + x_cm = quant_band_stereo(&ctx, X, Y, N, b, B, + effective_lowband != -1 ? norm+effective_lowband : NULL, LM, + last?NULL:norm+M*eBands[i]-norm_offset, lowband_scratch, cm); + dist1 = MULT16_32_Q15(w[0], celt_inner_prod(X_save, X, N, arch)) + MULT16_32_Q15(w[1], celt_inner_prod(Y_save, Y, N, arch)); + if (dist0 >= dist1) { + x_cm = cm2; + *ec = ec_save2; + ctx = ctx_save2; + OPUS_COPY(X, X_save2, N); + OPUS_COPY(Y, Y_save2, N); + if (!last) + OPUS_COPY(norm+M*eBands[i]-norm_offset, norm_save2, N); + OPUS_COPY(bytes_buf, bytes_save, save_bytes); + } + } else { + ctx.theta_round = 0; + x_cm = quant_band_stereo(&ctx, X, Y, N, b, B, + effective_lowband != -1 ? norm+effective_lowband : NULL, LM, + last?NULL:norm+M*eBands[i]-norm_offset, lowband_scratch, x_cm|y_cm); + } } else { x_cm = quant_band(&ctx, X, N, b, B, effective_lowband != -1 ? norm+effective_lowband : NULL, LM, - last?NULL:norm+M*eBands[i]-norm_offset, Q15ONE, lowband_scratch, x_cm|y_cm); + last?NULL:norm+M*eBands[i]-norm_offset, Q15ONE, lowband_scratch, x_cm|y_cm); } y_cm = x_cm; } @@ -1521,6 +1661,9 @@ void quant_all_bands(int encode, const CELTMode *m, int start, int end, /* Update the folding position only as long as we have 1 bit/sample depth. */ update_lowband = b>(N<<BITRES); + /* We only need to avoid noise on a split for the first band. After that, we + have folding. */ + ctx.avoid_split_noise = 0; } *seed = ctx.seed; diff --git a/thirdparty/opus/celt/bands.h b/thirdparty/opus/celt/bands.h index e8bef4bad0..422b32cf75 100644 --- a/thirdparty/opus/celt/bands.h +++ b/thirdparty/opus/celt/bands.h @@ -36,12 +36,15 @@ #include "entdec.h" #include "rate.h" +opus_int16 bitexact_cos(opus_int16 x); +int bitexact_log2tan(int isin,int icos); + /** Compute the amplitude (sqrt energy) in each of the bands * @param m Mode data * @param X Spectrum * @param bandE Square root of the energy for each band (returned) */ -void compute_band_energies(const CELTMode *m, const celt_sig *X, celt_ener *bandE, int end, int C, int LM); +void compute_band_energies(const CELTMode *m, const celt_sig *X, celt_ener *bandE, int end, int C, int LM, int arch); /*void compute_noise_energies(const CELTMode *m, const celt_sig *X, const opus_val16 *tonality, celt_ener *bandE);*/ @@ -69,7 +72,7 @@ void denormalise_bands(const CELTMode *m, const celt_norm * OPUS_RESTRICT X, int spreading_decision(const CELTMode *m, const celt_norm *X, int *average, int last_decision, int *hf_average, int *tapset_decision, int update_hf, - int end, int C, int M); + int end, int C, int M, const int *spread_weight); #ifdef MEASURE_NORM_MSE void measure_norm_mse(const CELTMode *m, float *X, float *X0, float *bandE, float *bandE0, int M, int N, int C); @@ -105,7 +108,7 @@ void quant_all_bands(int encode, const CELTMode *m, int start, int end, const celt_ener *bandE, int *pulses, int shortBlocks, int spread, int dual_stereo, int intensity, int *tf_res, opus_int32 total_bits, opus_int32 balance, ec_ctx *ec, int M, int codedBands, opus_uint32 *seed, - int arch); + int complexity, int arch, int disable_inv); void anti_collapse(const CELTMode *m, celt_norm *X_, unsigned char *collapse_masks, int LM, int C, int size, int start, diff --git a/thirdparty/opus/celt/celt.c b/thirdparty/opus/celt/celt.c index b121c51a1f..9ce234695c 100644 --- a/thirdparty/opus/celt/celt.c +++ b/thirdparty/opus/celt/celt.c @@ -111,26 +111,31 @@ void comb_filter_const_c(opus_val32 *y, opus_val32 *x, int T, int N, t = MAC16_32_Q16(x[i], g10, x2); t = MAC16_32_Q16(t, g11, ADD32(x1,x3)); t = MAC16_32_Q16(t, g12, ADD32(x0,x4)); + t = SATURATE(t, SIG_SAT); y[i] = t; x4=SHL32(x[i-T+3],1); t = MAC16_32_Q16(x[i+1], g10, x1); t = MAC16_32_Q16(t, g11, ADD32(x0,x2)); t = MAC16_32_Q16(t, g12, ADD32(x4,x3)); + t = SATURATE(t, SIG_SAT); y[i+1] = t; x3=SHL32(x[i-T+4],1); t = MAC16_32_Q16(x[i+2], g10, x0); t = MAC16_32_Q16(t, g11, ADD32(x4,x1)); t = MAC16_32_Q16(t, g12, ADD32(x3,x2)); + t = SATURATE(t, SIG_SAT); y[i+2] = t; x2=SHL32(x[i-T+5],1); t = MAC16_32_Q16(x[i+3], g10, x4); t = MAC16_32_Q16(t, g11, ADD32(x3,x0)); t = MAC16_32_Q16(t, g12, ADD32(x2,x1)); + t = SATURATE(t, SIG_SAT); y[i+3] = t; x1=SHL32(x[i-T+6],1); t = MAC16_32_Q16(x[i+4], g10, x3); t = MAC16_32_Q16(t, g11, ADD32(x2,x4)); t = MAC16_32_Q16(t, g12, ADD32(x1,x0)); + t = SATURATE(t, SIG_SAT); y[i+4] = t; } #ifdef CUSTOM_MODES @@ -141,6 +146,7 @@ void comb_filter_const_c(opus_val32 *y, opus_val32 *x, int T, int N, t = MAC16_32_Q16(x[i], g10, x2); t = MAC16_32_Q16(t, g11, ADD32(x1,x3)); t = MAC16_32_Q16(t, g12, ADD32(x0,x4)); + t = SATURATE(t, SIG_SAT); y[i] = t; x4=x3; x3=x2; @@ -169,6 +175,7 @@ void comb_filter_const_c(opus_val32 *y, opus_val32 *x, int T, int N, + MULT16_32_Q15(g10,x2) + MULT16_32_Q15(g11,ADD32(x1,x3)) + MULT16_32_Q15(g12,ADD32(x0,x4)); + y[i] = SATURATE(y[i], SIG_SAT); x4=x3; x3=x2; x2=x1; @@ -200,6 +207,10 @@ void comb_filter(opus_val32 *y, opus_val32 *x, int T0, int T1, int N, OPUS_MOVE(y, x, N); return; } + /* When the gain is zero, T0 and/or T1 is set to zero. We need + to have then be at least 2 to avoid processing garbage data. */ + T0 = IMAX(T0, COMBFILTER_MINPERIOD); + T1 = IMAX(T1, COMBFILTER_MINPERIOD); g00 = MULT16_16_P15(g0, gains[tapset0][0]); g01 = MULT16_16_P15(g0, gains[tapset0][1]); g02 = MULT16_16_P15(g0, gains[tapset0][2]); @@ -225,6 +236,7 @@ void comb_filter(opus_val32 *y, opus_val32 *x, int T0, int T1, int N, + MULT16_32_Q15(MULT16_16_Q15(f,g10),x2) + MULT16_32_Q15(MULT16_16_Q15(f,g11),ADD32(x1,x3)) + MULT16_32_Q15(MULT16_16_Q15(f,g12),ADD32(x0,x4)); + y[i] = SATURATE(y[i], SIG_SAT); x4=x3; x3=x2; x2=x1; @@ -244,11 +256,16 @@ void comb_filter(opus_val32 *y, opus_val32 *x, int T0, int T1, int N, } #endif /* OVERRIDE_comb_filter */ +/* TF change table. Positive values mean better frequency resolution (longer + effective window), whereas negative values mean better time resolution + (shorter effective window). The second index is computed as: + 4*isTransient + 2*tf_select + per_band_flag */ const signed char tf_select_table[4][8] = { - {0, -1, 0, -1, 0,-1, 0,-1}, - {0, -1, 0, -2, 1, 0, 1,-1}, - {0, -2, 0, -3, 2, 0, 1,-1}, - {0, -2, 0, -3, 3, 0, 1,-1}, + /*isTransient=0 isTransient=1 */ + {0, -1, 0, -1, 0,-1, 0,-1}, /* 2.5 ms */ + {0, -1, 0, -2, 1, 0, 1,-1}, /* 5 ms */ + {0, -2, 0, -3, 2, 0, 1,-1}, /* 10 ms */ + {0, -2, 0, -3, 3, 0, 1,-1}, /* 20 ms */ }; diff --git a/thirdparty/opus/celt/celt.h b/thirdparty/opus/celt/celt.h index d1f7eb690d..24b6b2b520 100644 --- a/thirdparty/opus/celt/celt.h +++ b/thirdparty/opus/celt/celt.h @@ -50,6 +50,8 @@ extern "C" { #define CELTDecoder OpusCustomDecoder #define CELTMode OpusCustomMode +#define LEAK_BANDS 19 + typedef struct { int valid; float tonality; @@ -57,17 +59,27 @@ typedef struct { float noisiness; float activity; float music_prob; - int bandwidth; -}AnalysisInfo; + float music_prob_min; + float music_prob_max; + int bandwidth; + float activity_probability; + float max_pitch_ratio; + /* Store as Q6 char to save space. */ + unsigned char leak_boost[LEAK_BANDS]; +} AnalysisInfo; + +typedef struct { + int signalType; + int offset; +} SILKInfo; #define __celt_check_mode_ptr_ptr(ptr) ((ptr) + ((ptr) - (const CELTMode**)(ptr))) #define __celt_check_analysis_ptr(ptr) ((ptr) + ((ptr) - (const AnalysisInfo*)(ptr))) -/* Encoder/decoder Requests */ +#define __celt_check_silkinfo_ptr(ptr) ((ptr) + ((ptr) - (const SILKInfo*)(ptr))) -/* Expose this option again when variable framesize actually works */ -#define OPUS_FRAMESIZE_VARIABLE 5010 /**< Optimize the frame size dynamically */ +/* Encoder/decoder Requests */ #define CELT_SET_PREDICTION_REQUEST 10002 @@ -116,6 +128,9 @@ typedef struct { #define OPUS_SET_ENERGY_MASK_REQUEST 10026 #define OPUS_SET_ENERGY_MASK(x) OPUS_SET_ENERGY_MASK_REQUEST, __opus_check_val16_ptr(x) +#define CELT_SET_SILK_INFO_REQUEST 10028 +#define CELT_SET_SILK_INFO(x) CELT_SET_SILK_INFO_REQUEST, __celt_check_silkinfo_ptr(x) + /* Encoder stuff */ int celt_encoder_get_size(int channels); @@ -194,6 +209,13 @@ static OPUS_INLINE int fromOpus(unsigned char c) extern const signed char tf_select_table[4][8]; +#if defined(ENABLE_HARDENING) || defined(ENABLE_ASSERTIONS) +void validate_celt_decoder(CELTDecoder *st); +#define VALIDATE_CELT_DECODER(st) validate_celt_decoder(st) +#else +#define VALIDATE_CELT_DECODER(st) +#endif + int resampling_factor(opus_int32 rate); void celt_preemphasis(const opus_val16 * OPUS_RESTRICT pcmp, celt_sig * OPUS_RESTRICT inp, diff --git a/thirdparty/opus/celt/celt_decoder.c b/thirdparty/opus/celt/celt_decoder.c index b978bb34d1..e6efce9358 100644 --- a/thirdparty/opus/celt/celt_decoder.c +++ b/thirdparty/opus/celt/celt_decoder.c @@ -51,6 +51,14 @@ #include "celt_lpc.h" #include "vq.h" +/* The maximum pitch lag to allow in the pitch-based PLC. It's possible to save + CPU time in the PLC pitch search by making this smaller than MAX_PERIOD. The + current value corresponds to a pitch of 66.67 Hz. */ +#define PLC_PITCH_LAG_MAX (720) +/* The minimum pitch lag to allow in the pitch-based PLC. This corresponds to a + pitch of 480 Hz. */ +#define PLC_PITCH_LAG_MIN (100) + #if defined(SMALL_FOOTPRINT) && defined(FIXED_POINT) #define NORM_ALIASING_HACK #endif @@ -73,6 +81,7 @@ struct OpusCustomDecoder { int downsample; int start, end; int signalling; + int disable_inv; int arch; /* Everything beyond this point gets cleared on a reset */ @@ -100,6 +109,38 @@ struct OpusCustomDecoder { /* opus_val16 backgroundLogE[], Size = 2*mode->nbEBands */ }; +#if defined(ENABLE_HARDENING) || defined(ENABLE_ASSERTIONS) +/* Make basic checks on the CELT state to ensure we don't end + up writing all over memory. */ +void validate_celt_decoder(CELTDecoder *st) +{ +#ifndef CUSTOM_MODES + celt_assert(st->mode == opus_custom_mode_create(48000, 960, NULL)); + celt_assert(st->overlap == 120); +#endif + celt_assert(st->channels == 1 || st->channels == 2); + celt_assert(st->stream_channels == 1 || st->stream_channels == 2); + celt_assert(st->downsample > 0); + celt_assert(st->start == 0 || st->start == 17); + celt_assert(st->start < st->end); + celt_assert(st->end <= 21); +#ifdef OPUS_ARCHMASK + celt_assert(st->arch >= 0); + celt_assert(st->arch <= OPUS_ARCHMASK); +#endif + celt_assert(st->last_pitch_index <= PLC_PITCH_LAG_MAX); + celt_assert(st->last_pitch_index >= PLC_PITCH_LAG_MIN || st->last_pitch_index == 0); + celt_assert(st->postfilter_period < MAX_PERIOD); + celt_assert(st->postfilter_period >= COMBFILTER_MINPERIOD || st->postfilter_period == 0); + celt_assert(st->postfilter_period_old < MAX_PERIOD); + celt_assert(st->postfilter_period_old >= COMBFILTER_MINPERIOD || st->postfilter_period_old == 0); + celt_assert(st->postfilter_tapset <= 2); + celt_assert(st->postfilter_tapset >= 0); + celt_assert(st->postfilter_tapset_old <= 2); + celt_assert(st->postfilter_tapset_old >= 0); +} +#endif + int celt_decoder_get_size(int channels) { const CELTMode *mode = opus_custom_mode_create(48000, 960, NULL); @@ -163,6 +204,11 @@ OPUS_CUSTOM_NOSTATIC int opus_custom_decoder_init(CELTDecoder *st, const CELTMod st->start = 0; st->end = st->mode->effEBands; st->signalling = 1; +#ifndef DISABLE_UPDATE_DRAFT + st->disable_inv = channels == 1; +#else + st->disable_inv = 0; +#endif st->arch = opus_select_arch(); opus_custom_decoder_ctl(st, OPUS_RESET_STATE); @@ -177,6 +223,36 @@ void opus_custom_decoder_destroy(CELTDecoder *st) } #endif /* CUSTOM_MODES */ +#ifndef CUSTOM_MODES +/* Special case for stereo with no downsampling and no accumulation. This is + quite common and we can make it faster by processing both channels in the + same loop, reducing overhead due to the dependency loop in the IIR filter. */ +static void deemphasis_stereo_simple(celt_sig *in[], opus_val16 *pcm, int N, const opus_val16 coef0, + celt_sig *mem) +{ + celt_sig * OPUS_RESTRICT x0; + celt_sig * OPUS_RESTRICT x1; + celt_sig m0, m1; + int j; + x0=in[0]; + x1=in[1]; + m0 = mem[0]; + m1 = mem[1]; + for (j=0;j<N;j++) + { + celt_sig tmp0, tmp1; + /* Add VERY_SMALL to x[] first to reduce dependency chain. */ + tmp0 = x0[j] + VERY_SMALL + m0; + tmp1 = x1[j] + VERY_SMALL + m1; + m0 = MULT16_32_Q15(coef0, tmp0); + m1 = MULT16_32_Q15(coef0, tmp1); + pcm[2*j ] = SCALEOUT(SIG2WORD16(tmp0)); + pcm[2*j+1] = SCALEOUT(SIG2WORD16(tmp1)); + } + mem[0] = m0; + mem[1] = m1; +} +#endif #ifndef RESYNTH static @@ -190,6 +266,14 @@ void deemphasis(celt_sig *in[], opus_val16 *pcm, int N, int C, int downsample, c opus_val16 coef0; VARDECL(celt_sig, scratch); SAVE_STACK; +#ifndef CUSTOM_MODES + /* Short version for common case. */ + if (downsample == 1 && C == 2 && !accum) + { + deemphasis_stereo_simple(in, pcm, N, coef[0], mem); + return; + } +#endif #ifndef FIXED_POINT (void)accum; celt_assert(accum==0); @@ -225,7 +309,7 @@ void deemphasis(celt_sig *in[], opus_val16 *pcm, int N, int C, int downsample, c /* Shortcut for the standard (non-custom modes) case */ for (j=0;j<N;j++) { - celt_sig tmp = x[j] + m + VERY_SMALL; + celt_sig tmp = x[j] + VERY_SMALL + m; m = MULT16_32_Q15(coef0, tmp); scratch[j] = tmp; } @@ -246,7 +330,7 @@ void deemphasis(celt_sig *in[], opus_val16 *pcm, int N, int C, int downsample, c { for (j=0;j<N;j++) { - celt_sig tmp = x[j] + m + VERY_SMALL; + celt_sig tmp = x[j] + VERY_SMALL + m; m = MULT16_32_Q15(coef0, tmp); y[j*C] = SCALEOUT(SIG2WORD16(tmp)); } @@ -333,7 +417,7 @@ void celt_synthesis(const CELTMode *mode, celt_norm *X, celt_sig * out_syn[], denormalise_bands(mode, X+N, freq2, oldBandE+nbEBands, start, effEnd, M, downsample, silence); for (i=0;i<N;i++) - freq[i] = HALF32(ADD32(freq[i],freq2[i])); + freq[i] = ADD32(HALF32(freq[i]), HALF32(freq2[i])); for (b=0;b<B;b++) clt_mdct_backward(&mode->mdct, &freq[b], out_syn[0]+NB*b, mode->window, overlap, shift, B, arch); } else { @@ -345,6 +429,12 @@ void celt_synthesis(const CELTMode *mode, celt_norm *X, celt_sig * out_syn[], clt_mdct_backward(&mode->mdct, &freq[b], out_syn[c]+NB*b, mode->window, overlap, shift, B, arch); } while (++c<CC); } + /* Saturate IMDCT output so that we can't overflow in the pitch postfilter + or in the */ + c=0; do { + for (i=0;i<N;i++) + out_syn[c][i] = SATURATE(out_syn[c][i], SIG_SAT); + } while (++c<CC); RESTORE_STACK; } @@ -387,14 +477,6 @@ static void tf_decode(int start, int end, int isTransient, int *tf_res, int LM, } } -/* The maximum pitch lag to allow in the pitch-based PLC. It's possible to save - CPU time in the PLC pitch search by making this smaller than MAX_PERIOD. The - current value corresponds to a pitch of 66.67 Hz. */ -#define PLC_PITCH_LAG_MAX (720) -/* The minimum pitch lag to allow in the pitch-based PLC. This corresponds to a - pitch of 480 Hz. */ -#define PLC_PITCH_LAG_MIN (100) - static int celt_plc_pitch_search(celt_sig *decode_mem[2], int C, int arch) { int pitch_index; @@ -504,12 +586,15 @@ static void celt_decode_lost(CELTDecoder * OPUS_RESTRICT st, int N, int LM) celt_synthesis(mode, X, out_syn, oldBandE, start, effEnd, C, C, 0, LM, st->downsample, 0, st->arch); } else { + int exc_length; /* Pitch-based PLC */ const opus_val16 *window; + opus_val16 *exc; opus_val16 fade = Q15ONE; int pitch_index; VARDECL(opus_val32, etmp); - VARDECL(opus_val16, exc); + VARDECL(opus_val16, _exc); + VARDECL(opus_val16, fir_tmp); if (loss_count == 0) { @@ -519,8 +604,14 @@ static void celt_decode_lost(CELTDecoder * OPUS_RESTRICT st, int N, int LM) fade = QCONST16(.8f,15); } + /* We want the excitation for 2 pitch periods in order to look for a + decaying signal, but we can't get more than MAX_PERIOD. */ + exc_length = IMIN(2*pitch_index, MAX_PERIOD); + ALLOC(etmp, overlap, opus_val32); - ALLOC(exc, MAX_PERIOD, opus_val16); + ALLOC(_exc, MAX_PERIOD+LPC_ORDER, opus_val16); + ALLOC(fir_tmp, exc_length, opus_val16); + exc = _exc+LPC_ORDER; window = mode->window; c=0; do { opus_val16 decay; @@ -529,13 +620,11 @@ static void celt_decode_lost(CELTDecoder * OPUS_RESTRICT st, int N, int LM) celt_sig *buf; int extrapolation_offset; int extrapolation_len; - int exc_length; int j; buf = decode_mem[c]; - for (i=0;i<MAX_PERIOD;i++) { - exc[i] = ROUND16(buf[DECODE_BUFFER_SIZE-MAX_PERIOD+i], SIG_SHIFT); - } + for (i=0;i<MAX_PERIOD+LPC_ORDER;i++) + exc[i-LPC_ORDER] = ROUND16(buf[DECODE_BUFFER_SIZE-MAX_PERIOD-LPC_ORDER+i], SIG_SHIFT); if (loss_count == 0) { @@ -561,22 +650,32 @@ static void celt_decode_lost(CELTDecoder * OPUS_RESTRICT st, int N, int LM) #endif } _celt_lpc(lpc+c*LPC_ORDER, ac, LPC_ORDER); +#ifdef FIXED_POINT + /* For fixed-point, apply bandwidth expansion until we can guarantee that + no overflow can happen in the IIR filter. This means: + 32768*sum(abs(filter)) < 2^31 */ + while (1) { + opus_val16 tmp=Q15ONE; + opus_val32 sum=QCONST16(1., SIG_SHIFT); + for (i=0;i<LPC_ORDER;i++) + sum += ABS16(lpc[c*LPC_ORDER+i]); + if (sum < 65535) break; + for (i=0;i<LPC_ORDER;i++) + { + tmp = MULT16_16_Q15(QCONST16(.99f,15), tmp); + lpc[c*LPC_ORDER+i] = MULT16_16_Q15(lpc[c*LPC_ORDER+i], tmp); + } + } +#endif } - /* We want the excitation for 2 pitch periods in order to look for a - decaying signal, but we can't get more than MAX_PERIOD. */ - exc_length = IMIN(2*pitch_index, MAX_PERIOD); /* Initialize the LPC history with the samples just before the start of the region for which we're computing the excitation. */ { - opus_val16 lpc_mem[LPC_ORDER]; - for (i=0;i<LPC_ORDER;i++) - { - lpc_mem[i] = - ROUND16(buf[DECODE_BUFFER_SIZE-exc_length-1-i], SIG_SHIFT); - } - /* Compute the excitation for exc_length samples before the loss. */ + /* Compute the excitation for exc_length samples before the loss. We need the copy + because celt_fir() cannot filter in-place. */ celt_fir(exc+MAX_PERIOD-exc_length, lpc+c*LPC_ORDER, - exc+MAX_PERIOD-exc_length, exc_length, LPC_ORDER, lpc_mem, st->arch); + fir_tmp, exc_length, LPC_ORDER, st->arch); + OPUS_COPY(exc+MAX_PERIOD-exc_length, fir_tmp, exc_length); } /* Check if the waveform is decaying, and if so how fast. @@ -630,9 +729,8 @@ static void celt_decode_lost(CELTDecoder * OPUS_RESTRICT st, int N, int LM) tmp = ROUND16( buf[DECODE_BUFFER_SIZE-MAX_PERIOD-N+extrapolation_offset+j], SIG_SHIFT); - S1 += SHR32(MULT16_16(tmp, tmp), 8); + S1 += SHR32(MULT16_16(tmp, tmp), 10); } - { opus_val16 lpc_mem[LPC_ORDER]; /* Copy the last decoded samples (prior to the overlap region) to @@ -644,6 +742,10 @@ static void celt_decode_lost(CELTDecoder * OPUS_RESTRICT st, int N, int LM) celt_iir(buf+DECODE_BUFFER_SIZE-N, lpc+c*LPC_ORDER, buf+DECODE_BUFFER_SIZE-N, extrapolation_len, LPC_ORDER, lpc_mem, st->arch); +#ifdef FIXED_POINT + for (i=0; i < extrapolation_len; i++) + buf[DECODE_BUFFER_SIZE-N+i] = SATURATE(buf[DECODE_BUFFER_SIZE-N+i], SIG_SAT); +#endif } /* Check if the synthesis energy is higher than expected, which can @@ -654,7 +756,7 @@ static void celt_decode_lost(CELTDecoder * OPUS_RESTRICT st, int N, int LM) for (i=0;i<extrapolation_len;i++) { opus_val16 tmp = ROUND16(buf[DECODE_BUFFER_SIZE-N+i], SIG_SHIFT); - S2 += SHR32(MULT16_16(tmp, tmp), 8); + S2 += SHR32(MULT16_16(tmp, tmp), 10); } /* This checks for an "explosion" in the synthesis. */ #ifdef FIXED_POINT @@ -762,6 +864,7 @@ int celt_decode_with_ec(CELTDecoder * OPUS_RESTRICT st, const unsigned char *dat const opus_int16 *eBands; ALLOC_STACK; + VALIDATE_CELT_DECODER(st); mode = st->mode; nbEBands = mode->nbEBands; overlap = mode->overlap; @@ -956,7 +1059,7 @@ int celt_decode_with_ec(CELTDecoder * OPUS_RESTRICT st, const unsigned char *dat ALLOC(pulses, nbEBands, int); ALLOC(fine_priority, nbEBands, int); - codedBands = compute_allocation(mode, start, end, offsets, cap, + codedBands = clt_compute_allocation(mode, start, end, offsets, cap, alloc_trim, &intensity, &dual_stereo, bits, &balance, pulses, fine_quant, fine_priority, C, LM, dec, 0, 0, 0); @@ -979,7 +1082,8 @@ int celt_decode_with_ec(CELTDecoder * OPUS_RESTRICT st, const unsigned char *dat quant_all_bands(0, mode, start, end, X, C==2 ? X+N : NULL, collapse_masks, NULL, pulses, shortBlocks, spread_decision, dual_stereo, intensity, tf_res, - len*(8<<BITRES)-anti_collapse_rsv, balance, dec, LM, codedBands, &st->rng, st->arch); + len*(8<<BITRES)-anti_collapse_rsv, balance, dec, LM, codedBands, &st->rng, 0, + st->arch, st->disable_inv); if (anti_collapse_rsv > 0) { @@ -1234,6 +1338,26 @@ int opus_custom_decoder_ctl(CELTDecoder * OPUS_RESTRICT st, int request, ...) *value=st->rng; } break; + case OPUS_SET_PHASE_INVERSION_DISABLED_REQUEST: + { + opus_int32 value = va_arg(ap, opus_int32); + if(value<0 || value>1) + { + goto bad_arg; + } + st->disable_inv = value; + } + break; + case OPUS_GET_PHASE_INVERSION_DISABLED_REQUEST: + { + opus_int32 *value = va_arg(ap, opus_int32*); + if (!value) + { + goto bad_arg; + } + *value = st->disable_inv; + } + break; default: goto bad_request; } diff --git a/thirdparty/opus/celt/celt_encoder.c b/thirdparty/opus/celt/celt_encoder.c index 3ee7a4d3f7..44cb0850ab 100644 --- a/thirdparty/opus/celt/celt_encoder.c +++ b/thirdparty/opus/celt/celt_encoder.c @@ -73,8 +73,8 @@ struct OpusCustomEncoder { int constrained_vbr; /* If zero, VBR can do whatever it likes with the rate */ int loss_rate; int lsb_depth; - int variable_duration; int lfe; + int disable_inv; int arch; /* Everything beyond this point gets cleared on a reset */ @@ -98,6 +98,7 @@ struct OpusCustomEncoder { #endif int consec_transient; AnalysisInfo analysis; + SILKInfo silk_info; opus_val32 preemph_memE[2]; opus_val32 preemph_memD[2]; @@ -123,6 +124,7 @@ struct OpusCustomEncoder { /* opus_val16 oldBandE[], Size = channels*mode->nbEBands */ /* opus_val16 oldLogE[], Size = channels*mode->nbEBands */ /* opus_val16 oldLogE2[], Size = channels*mode->nbEBands */ + /* opus_val16 energyError[], Size = channels*mode->nbEBands */ }; int celt_encoder_get_size(int channels) @@ -136,9 +138,10 @@ OPUS_CUSTOM_NOSTATIC int opus_custom_encoder_get_size(const CELTMode *mode, int int size = sizeof(struct CELTEncoder) + (channels*mode->overlap-1)*sizeof(celt_sig) /* celt_sig in_mem[channels*mode->overlap]; */ + channels*COMBFILTER_MAXPERIOD*sizeof(celt_sig) /* celt_sig prefilter_mem[channels*COMBFILTER_MAXPERIOD]; */ - + 3*channels*mode->nbEBands*sizeof(opus_val16); /* opus_val16 oldBandE[channels*mode->nbEBands]; */ + + 4*channels*mode->nbEBands*sizeof(opus_val16); /* opus_val16 oldBandE[channels*mode->nbEBands]; */ /* opus_val16 oldLogE[channels*mode->nbEBands]; */ /* opus_val16 oldLogE2[channels*mode->nbEBands]; */ + /* opus_val16 energyError[channels*mode->nbEBands]; */ return size; } @@ -178,7 +181,6 @@ static int opus_custom_encoder_init_arch(CELTEncoder *st, const CELTMode *mode, st->start = 0; st->end = st->mode->effEBands; st->signalling = 1; - st->arch = arch; st->constrained_vbr = 1; @@ -223,7 +225,8 @@ void opus_custom_encoder_destroy(CELTEncoder *st) static int transient_analysis(const opus_val32 * OPUS_RESTRICT in, int len, int C, - opus_val16 *tf_estimate, int *tf_chan) + opus_val16 *tf_estimate, int *tf_chan, int allow_weak_transients, + int *weak_transient) { int i; VARDECL(opus_val16, tmp); @@ -233,6 +236,12 @@ static int transient_analysis(const opus_val32 * OPUS_RESTRICT in, int len, int int c; opus_val16 tf_max; int len2; + /* Forward masking: 6.7 dB/ms. */ +#ifdef FIXED_POINT + int forward_shift = 4; +#else + opus_val16 forward_decay = QCONST16(.0625f,15); +#endif /* Table of 6*64/x, trained on real data to minimize the average error */ static const unsigned char inv_table[128] = { 255,255,156,110, 86, 70, 59, 51, 45, 40, 37, 33, 31, 28, 26, 25, @@ -247,6 +256,19 @@ static int transient_analysis(const opus_val32 * OPUS_RESTRICT in, int len, int SAVE_STACK; ALLOC(tmp, len, opus_val16); + *weak_transient = 0; + /* For lower bitrates, let's be more conservative and have a forward masking + decay of 3.3 dB/ms. This avoids having to code transients at very low + bitrate (mostly for hybrid), which can result in unstable energy and/or + partial collapse. */ + if (allow_weak_transients) + { +#ifdef FIXED_POINT + forward_shift = 5; +#else + forward_decay = QCONST16(.03125f,15); +#endif + } len2=len/2; for (c=0;c<C;c++) { @@ -269,7 +291,7 @@ static int transient_analysis(const opus_val32 * OPUS_RESTRICT in, int len, int mem0 = mem1 + y - 2*x; mem1 = x - .5f*y; #endif - tmp[i] = EXTRACT16(SHR32(y,2)); + tmp[i] = SROUND16(y, 2); /*printf("%f ", tmp[i]);*/ } /*printf("\n");*/ @@ -280,7 +302,7 @@ static int transient_analysis(const opus_val32 * OPUS_RESTRICT in, int len, int /* Normalize tmp to max range */ { int shift=0; - shift = 14-celt_ilog2(1+celt_maxabs16(tmp, len)); + shift = 14-celt_ilog2(MAX16(1, celt_maxabs16(tmp, len))); if (shift!=0) { for (i=0;i<len;i++) @@ -299,9 +321,9 @@ static int transient_analysis(const opus_val32 * OPUS_RESTRICT in, int len, int mean += x2; #ifdef FIXED_POINT /* FIXME: Use PSHR16() instead */ - tmp[i] = mem0 + PSHR32(x2-mem0,4); + tmp[i] = mem0 + PSHR32(x2-mem0,forward_shift); #else - tmp[i] = mem0 + MULT16_16_P15(QCONST16(.0625f,15),x2-mem0); + tmp[i] = mem0 + MULT16_16_P15(forward_decay,x2-mem0); #endif mem0 = tmp[i]; } @@ -311,6 +333,7 @@ static int transient_analysis(const opus_val32 * OPUS_RESTRICT in, int len, int /* Backward pass to compute the pre-echo threshold */ for (i=len2-1;i>=0;i--) { + /* Backward masking: 13.9 dB/ms. */ #ifdef FIXED_POINT /* FIXME: Use PSHR16() instead */ tmp[i] = mem0 + PSHR32(tmp[i]-mem0,3); @@ -339,6 +362,12 @@ static int transient_analysis(const opus_val32 * OPUS_RESTRICT in, int len, int /* Compute harmonic mean discarding the unreliable boundaries The data is smooth, so we only take 1/4th of the samples */ unmask=0; + /* We should never see NaNs here. If we find any, then something really bad happened and we better abort + before it does any damage later on. If these asserts are disabled (no hardening), then the table + lookup a few lines below (id = ...) is likely to crash dur to an out-of-bounds read. DO NOT FIX + that crash on NaN since it could result in a worse issue later on. */ + celt_assert(!celt_isnan(tmp[0])); + celt_assert(!celt_isnan(norm)); for (i=12;i<len2-5;i+=4) { int id; @@ -359,7 +388,12 @@ static int transient_analysis(const opus_val32 * OPUS_RESTRICT in, int len, int } } is_transient = mask_metric>200; - + /* For low bitrates, define "weak transients" that need to be + handled differently to avoid partial collapse. */ + if (allow_weak_transients && is_transient && mask_metric<600) { + is_transient = 0; + *weak_transient = 1; + } /* Arbitrary metric for VBR boost */ tf_max = MAX16(0,celt_sqrt(27*mask_metric)-42); /* *tf_estimate = 1 + MIN16(1, sqrt(MAX16(0, tf_max-30))/20); */ @@ -549,7 +583,7 @@ static opus_val32 l1_metric(const celt_norm *tmp, int N, int LM, opus_val16 bias static int tf_analysis(const CELTMode *m, int len, int isTransient, int *tf_res, int lambda, celt_norm *X, int N0, int LM, - int *tf_sum, opus_val16 tf_estimate, int tf_chan) + opus_val16 tf_estimate, int tf_chan, int *importance) { int i; VARDECL(int, metric); @@ -574,7 +608,6 @@ static int tf_analysis(const CELTMode *m, int len, int isTransient, ALLOC(path0, len, int); ALLOC(path1, len, int); - *tf_sum = 0; for (i=0;i<len;i++) { int k, N; @@ -629,27 +662,26 @@ static int tf_analysis(const CELTMode *m, int len, int isTransient, metric[i] = 2*best_level; else metric[i] = -2*best_level; - *tf_sum += (isTransient ? LM : 0) - metric[i]/2; /* For bands that can't be split to -1, set the metric to the half-way point to avoid biasing the decision */ if (narrow && (metric[i]==0 || metric[i]==-2*LM)) metric[i]-=1; - /*printf("%d ", metric[i]);*/ + /*printf("%d ", metric[i]/2 + (!isTransient)*LM);*/ } /*printf("\n");*/ /* Search for the optimal tf resolution, including tf_select */ tf_select = 0; for (sel=0;sel<2;sel++) { - cost0 = 0; - cost1 = isTransient ? 0 : lambda; + cost0 = importance[0]*abs(metric[0]-2*tf_select_table[LM][4*isTransient+2*sel+0]); + cost1 = importance[0]*abs(metric[0]-2*tf_select_table[LM][4*isTransient+2*sel+1]) + (isTransient ? 0 : lambda); for (i=1;i<len;i++) { int curr0, curr1; curr0 = IMIN(cost0, cost1 + lambda); curr1 = IMIN(cost0 + lambda, cost1); - cost0 = curr0 + abs(metric[i]-2*tf_select_table[LM][4*isTransient+2*sel+0]); - cost1 = curr1 + abs(metric[i]-2*tf_select_table[LM][4*isTransient+2*sel+1]); + cost0 = curr0 + importance[i]*abs(metric[i]-2*tf_select_table[LM][4*isTransient+2*sel+0]); + cost1 = curr1 + importance[i]*abs(metric[i]-2*tf_select_table[LM][4*isTransient+2*sel+1]); } cost0 = IMIN(cost0, cost1); selcost[sel]=cost0; @@ -658,8 +690,8 @@ static int tf_analysis(const CELTMode *m, int len, int isTransient, * If tests confirm it's useful for non-transients, we could allow it. */ if (selcost[1]<selcost[0] && isTransient) tf_select=1; - cost0 = 0; - cost1 = isTransient ? 0 : lambda; + cost0 = importance[0]*abs(metric[0]-2*tf_select_table[LM][4*isTransient+2*tf_select+0]); + cost1 = importance[0]*abs(metric[0]-2*tf_select_table[LM][4*isTransient+2*tf_select+1]) + (isTransient ? 0 : lambda); /* Viterbi forward pass */ for (i=1;i<len;i++) { @@ -687,8 +719,8 @@ static int tf_analysis(const CELTMode *m, int len, int isTransient, curr1 = from1; path1[i]= 1; } - cost0 = curr0 + abs(metric[i]-2*tf_select_table[LM][4*isTransient+2*tf_select+0]); - cost1 = curr1 + abs(metric[i]-2*tf_select_table[LM][4*isTransient+2*tf_select+1]); + cost0 = curr0 + importance[i]*abs(metric[i]-2*tf_select_table[LM][4*isTransient+2*tf_select+0]); + cost1 = curr1 + importance[i]*abs(metric[i]-2*tf_select_table[LM][4*isTransient+2*tf_select+1]); } tf_res[len-1] = cost0 < cost1 ? 0 : 1; /* Viterbi backward pass to check the decisions */ @@ -754,7 +786,7 @@ static void tf_encode(int start, int end, int isTransient, int *tf_res, int LM, static int alloc_trim_analysis(const CELTMode *m, const celt_norm *X, const opus_val16 *bandLogE, int end, int LM, int C, int N0, AnalysisInfo *analysis, opus_val16 *stereo_saving, opus_val16 tf_estimate, - int intensity, opus_val16 surround_trim, int arch) + int intensity, opus_val16 surround_trim, opus_int32 equiv_rate, int arch) { int i; opus_val32 diff=0; @@ -762,6 +794,14 @@ static int alloc_trim_analysis(const CELTMode *m, const celt_norm *X, int trim_index; opus_val16 trim = QCONST16(5.f, 8); opus_val16 logXC, logXC2; + /* At low bitrate, reducing the trim seems to help. At higher bitrates, it's less + clear what's best, so we're keeping it as it was before, at least for now. */ + if (equiv_rate < 64000) { + trim = QCONST16(4.f, 8); + } else if (equiv_rate < 80000) { + opus_int32 frac = (equiv_rate-64000) >> 10; + trim = QCONST16(4.f, 8) + QCONST16(1.f/16.f, 8)*frac; + } if (C==2) { opus_val16 sum = 0; /* Q10 */ @@ -809,7 +849,7 @@ static int alloc_trim_analysis(const CELTMode *m, const celt_norm *X, } while (++c<C); diff /= C*(end-1); /*printf("%f\n", diff);*/ - trim -= MAX16(-QCONST16(2.f, 8), MIN16(QCONST16(2.f, 8), SHR16(diff+QCONST16(1.f, DB_SHIFT),DB_SHIFT-8)/6 )); + trim -= MAX32(-QCONST16(2.f, 8), MIN32(QCONST16(2.f, 8), SHR32(diff+QCONST16(1.f, DB_SHIFT),DB_SHIFT-8)/6 )); trim -= SHR16(surround_trim, DB_SHIFT-8); trim -= 2*SHR16(tf_estimate, 14-8); #ifndef DISABLE_FLOAT_API @@ -930,7 +970,8 @@ static opus_val16 median_of_3(const opus_val16 *x) static opus_val16 dynalloc_analysis(const opus_val16 *bandLogE, const opus_val16 *bandLogE2, int nbEBands, int start, int end, int C, int *offsets, int lsb_depth, const opus_int16 *logN, int isTransient, int vbr, int constrained_vbr, const opus_int16 *eBands, int LM, - int effectiveBytes, opus_int32 *tot_boost_, int lfe, opus_val16 *surround_dynalloc) + int effectiveBytes, opus_int32 *tot_boost_, int lfe, opus_val16 *surround_dynalloc, + AnalysisInfo *analysis, int *importance, int *spread_weight) { int i, c; opus_int32 tot_boost=0; @@ -956,6 +997,42 @@ static opus_val16 dynalloc_analysis(const opus_val16 *bandLogE, const opus_val16 for (i=0;i<end;i++) maxDepth = MAX16(maxDepth, bandLogE[c*nbEBands+i]-noise_floor[i]); } while (++c<C); + { + /* Compute a really simple masking model to avoid taking into account completely masked + bands when computing the spreading decision. */ + VARDECL(opus_val16, mask); + VARDECL(opus_val16, sig); + ALLOC(mask, nbEBands, opus_val16); + ALLOC(sig, nbEBands, opus_val16); + for (i=0;i<end;i++) + mask[i] = bandLogE[i]-noise_floor[i]; + if (C==2) + { + for (i=0;i<end;i++) + mask[i] = MAX16(mask[i], bandLogE[nbEBands+i]-noise_floor[i]); + } + OPUS_COPY(sig, mask, end); + for (i=1;i<end;i++) + mask[i] = MAX16(mask[i], mask[i-1] - QCONST16(2.f, DB_SHIFT)); + for (i=end-2;i>=0;i--) + mask[i] = MAX16(mask[i], mask[i+1] - QCONST16(3.f, DB_SHIFT)); + for (i=0;i<end;i++) + { + /* Compute SMR: Mask is never more than 72 dB below the peak and never below the noise floor.*/ + opus_val16 smr = sig[i]-MAX16(MAX16(0, maxDepth-QCONST16(12.f, DB_SHIFT)), mask[i]); + /* Clamp SMR to make sure we're not shifting by something negative or too large. */ +#ifdef FIXED_POINT + /* FIXME: Use PSHR16() instead */ + int shift = -PSHR32(MAX16(-QCONST16(5.f, DB_SHIFT), MIN16(0, smr)), DB_SHIFT); +#else + int shift = IMIN(5, IMAX(0, -(int)floor(.5f + smr))); +#endif + spread_weight[i] = 32 >> shift; + } + /*for (i=0;i<end;i++) + printf("%d ", spread_weight[i]); + printf("\n");*/ + } /* Make sure that dynamic allocation can't make us bust the budget */ if (effectiveBytes > 50 && LM>=1 && !lfe) { @@ -1012,6 +1089,14 @@ static opus_val16 dynalloc_analysis(const opus_val16 *bandLogE, const opus_val16 } for (i=start;i<end;i++) follower[i] = MAX16(follower[i], surround_dynalloc[i]); + for (i=start;i<end;i++) + { +#ifdef FIXED_POINT + importance[i] = PSHR32(13*celt_exp2(MIN16(follower[i], QCONST16(4.f, DB_SHIFT))), 16); +#else + importance[i] = (int)floor(.5f+13*celt_exp2(MIN16(follower[i], QCONST16(4.f, DB_SHIFT)))); +#endif + } /* For non-transient CBR/CVBR frames, halve the dynalloc contribution */ if ((!vbr || constrained_vbr)&&!isTransient) { @@ -1020,14 +1105,26 @@ static opus_val16 dynalloc_analysis(const opus_val16 *bandLogE, const opus_val16 } for (i=start;i<end;i++) { - int width; - int boost; - int boost_bits; - if (i<8) follower[i] *= 2; if (i>=12) follower[i] = HALF16(follower[i]); + } +#ifdef DISABLE_FLOAT_API + (void)analysis; +#else + if (analysis->valid) + { + for (i=start;i<IMIN(LEAK_BANDS, end);i++) + follower[i] = follower[i] + QCONST16(1.f/64.f, DB_SHIFT)*analysis->leak_boost[i]; + } +#endif + for (i=start;i<end;i++) + { + int width; + int boost; + int boost_bits; + follower[i] = MIN16(follower[i], QCONST16(4, DB_SHIFT)); width = C*(eBands[i+1]-eBands[i])<<LM; @@ -1042,11 +1139,11 @@ static opus_val16 dynalloc_analysis(const opus_val16 *bandLogE, const opus_val16 boost = (int)SHR32(EXTEND32(follower[i])*width/6,DB_SHIFT); boost_bits = boost*6<<BITRES; } - /* For CBR and non-transient CVBR frames, limit dynalloc to 1/4 of the bits */ + /* For CBR and non-transient CVBR frames, limit dynalloc to 2/3 of the bits */ if ((!vbr || (constrained_vbr&&!isTransient)) - && (tot_boost+boost_bits)>>BITRES>>3 > effectiveBytes/4) + && (tot_boost+boost_bits)>>BITRES>>3 > 2*effectiveBytes/3) { - opus_int32 cap = ((effectiveBytes/4)<<BITRES<<3); + opus_int32 cap = ((2*effectiveBytes/3)<<BITRES<<3); offsets[i] = cap-tot_boost; tot_boost = cap; break; @@ -1055,6 +1152,9 @@ static opus_val16 dynalloc_analysis(const opus_val16 *bandLogE, const opus_val16 tot_boost += boost_bits; } } + } else { + for (i=start;i<end;i++) + importance[i] = 13; } *tot_boost_ = tot_boost; RESTORE_STACK; @@ -1063,7 +1163,7 @@ static opus_val16 dynalloc_analysis(const opus_val16 *bandLogE, const opus_val16 static int run_prefilter(CELTEncoder *st, celt_sig *in, celt_sig *prefilter_mem, int CC, int N, - int prefilter_tapset, int *pitch, opus_val16 *gain, int *qgain, int enabled, int nbAvailableBytes) + int prefilter_tapset, int *pitch, opus_val16 *gain, int *qgain, int enabled, int nbAvailableBytes, AnalysisInfo *analysis) { int c; VARDECL(celt_sig, _pre); @@ -1119,7 +1219,12 @@ static int run_prefilter(CELTEncoder *st, celt_sig *in, celt_sig *prefilter_mem, gain1 = 0; pitch_index = COMBFILTER_MINPERIOD; } - +#ifndef DISABLE_FLOAT_API + if (analysis->valid) + gain1 = (opus_val16)(gain1 * analysis->max_pitch_ratio); +#else + (void)analysis; +#endif /* Gain threshold for enabling the prefilter/postfilter */ pf_threshold = QCONST16(.2f,15); @@ -1193,7 +1298,7 @@ static int compute_vbr(const CELTMode *mode, AnalysisInfo *analysis, opus_int32 int LM, opus_int32 bitrate, int lastCodedBands, int C, int intensity, int constrained_vbr, opus_val16 stereo_saving, int tot_boost, opus_val16 tf_estimate, int pitch_change, opus_val16 maxDepth, - int variable_duration, int lfe, int has_surround_mask, opus_val16 surround_masking, + int lfe, int has_surround_mask, opus_val16 surround_masking, opus_val16 temporal_vbr) { /* The target rate in 8th bits per frame */ @@ -1235,10 +1340,9 @@ static int compute_vbr(const CELTMode *mode, AnalysisInfo *analysis, opus_int32 SHR32(MULT16_16(stereo_saving-QCONST16(0.1f,8),(coded_stereo_dof<<BITRES)),8)); } /* Boost the rate according to dynalloc (minus the dynalloc average for calibration). */ - target += tot_boost-(16<<LM); + target += tot_boost-(19<<LM); /* Apply transient boost, compensating for average boost. */ - tf_calibration = variable_duration==OPUS_FRAMESIZE_VARIABLE ? - QCONST16(0.02f,14) : QCONST16(0.04f,14); + tf_calibration = QCONST16(0.044f,14); target += (opus_int32)SHL32(MULT16_32_Q15(tf_estimate-tf_calibration, target),1); #ifndef DISABLE_FLOAT_API @@ -1249,7 +1353,7 @@ static int compute_vbr(const CELTMode *mode, AnalysisInfo *analysis, opus_int32 float tonal; /* Tonality boost (compensating for the average). */ - tonal = MAX16(0.f,analysis->tonality-.15f)-0.09f; + tonal = MAX16(0.f,analysis->tonality-.15f)-0.12f; tonal_target = target + (opus_int32)((coded_bins<<BITRES)*1.2f*tonal); if (pitch_change) tonal_target += (opus_int32)((coded_bins<<BITRES)*.8f); @@ -1279,21 +1383,11 @@ static int compute_vbr(const CELTMode *mode, AnalysisInfo *analysis, opus_int32 /*printf("%f %d\n", maxDepth, floor_depth);*/ } - if ((!has_surround_mask||lfe) && (constrained_vbr || bitrate<64000)) + /* Make VBR less aggressive for constrained VBR because we can't keep a higher bitrate + for long. Needs tuning. */ + if ((!has_surround_mask||lfe) && constrained_vbr) { - opus_val16 rate_factor = Q15ONE; - if (bitrate < 64000) - { -#ifdef FIXED_POINT - rate_factor = MAX16(0,(bitrate-32000)); -#else - rate_factor = MAX16(0,(1.f/32768)*(bitrate-32000)); -#endif - } - if (constrained_vbr) - rate_factor = MIN16(rate_factor, QCONST16(0.67f, 15)); - target = base_target + (opus_int32)MULT16_32_Q15(rate_factor, target-base_target); - + target = base_target + (opus_int32)MULT16_32_Q15(QCONST16(0.67f, 15), target-base_target); } if (!has_surround_mask && tf_estimate < QCONST16(.2f, 14)) @@ -1327,11 +1421,13 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, VARDECL(int, pulses); VARDECL(int, cap); VARDECL(int, offsets); + VARDECL(int, importance); + VARDECL(int, spread_weight); VARDECL(int, fine_priority); VARDECL(int, tf_res); VARDECL(unsigned char, collapse_masks); celt_sig *prefilter_mem; - opus_val16 *oldBandE, *oldLogE, *oldLogE2; + opus_val16 *oldBandE, *oldLogE, *oldLogE2, *energyError; int shortBlocks=0; int isTransient=0; const int CC = st->channels; @@ -1343,7 +1439,6 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, int end; int effEnd; int codedBands; - int tf_sum; int alloc_trim; int pitch_index=COMBFILTER_MINPERIOD; opus_val16 gain1 = 0; @@ -1355,6 +1450,7 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, opus_int32 total_boost; opus_int32 balance; opus_int32 tell; + opus_int32 tell0_frac; int prefilter_tapset=0; int pf_on; int anti_collapse_rsv; @@ -1376,7 +1472,10 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, opus_val16 surround_masking=0; opus_val16 temporal_vbr=0; opus_val16 surround_trim = 0; - opus_int32 equiv_rate = 510000; + opus_int32 equiv_rate; + int hybrid; + int weak_transient = 0; + int enable_tf_analysis; VARDECL(opus_val16, surround_dynalloc); ALLOC_STACK; @@ -1386,6 +1485,7 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, eBands = mode->eBands; start = st->start; end = st->end; + hybrid = start != 0; tf_estimate = 0; if (nbCompressedBytes<2 || pcm==NULL) { @@ -1409,12 +1509,14 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, oldBandE = (opus_val16*)(st->in_mem+CC*(overlap+COMBFILTER_MAXPERIOD)); oldLogE = oldBandE + CC*nbEBands; oldLogE2 = oldLogE + CC*nbEBands; + energyError = oldLogE2 + CC*nbEBands; if (enc==NULL) { - tell=1; + tell0_frac=tell=1; nbFilledBytes=0; } else { + tell0_frac=ec_tell_frac(enc); tell=ec_tell(enc); nbFilledBytes=(tell+4)>>3; } @@ -1467,10 +1569,11 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, if (st->bitrate!=OPUS_BITRATE_MAX) nbCompressedBytes = IMAX(2, IMIN(nbCompressedBytes, (tmp+4*mode->Fs)/(8*mode->Fs)-!!st->signalling)); - effectiveBytes = nbCompressedBytes; + effectiveBytes = nbCompressedBytes - nbFilledBytes; } + equiv_rate = ((opus_int32)nbCompressedBytes*8*50 >> (3-LM)) - (40*C+20)*((400>>LM) - 50); if (st->bitrate != OPUS_BITRATE_MAX) - equiv_rate = st->bitrate - (40*C+20)*((400>>LM) - 50); + equiv_rate = IMIN(equiv_rate, st->bitrate - (40*C+20)*((400>>LM) - 50)); if (enc==NULL) { @@ -1558,17 +1661,17 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, { int enabled; int qg; - enabled = ((st->lfe&&nbAvailableBytes>3) || nbAvailableBytes>12*C) && start==0 && !silence && !st->disable_pf - && st->complexity >= 5 && !(st->consec_transient && LM!=3 && st->variable_duration==OPUS_FRAMESIZE_VARIABLE); + enabled = ((st->lfe&&nbAvailableBytes>3) || nbAvailableBytes>12*C) && !hybrid && !silence && !st->disable_pf + && st->complexity >= 5; prefilter_tapset = st->tapset_decision; - pf_on = run_prefilter(st, in, prefilter_mem, CC, N, prefilter_tapset, &pitch_index, &gain1, &qg, enabled, nbAvailableBytes); + pf_on = run_prefilter(st, in, prefilter_mem, CC, N, prefilter_tapset, &pitch_index, &gain1, &qg, enabled, nbAvailableBytes, &st->analysis); if ((gain1 > QCONST16(.4f,15) || st->prefilter_gain > QCONST16(.4f,15)) && (!st->analysis.valid || st->analysis.tonality > .3) && (pitch_index > 1.26*st->prefilter_period || pitch_index < .79*st->prefilter_period)) pitch_change = 1; if (pf_on==0) { - if(start==0 && tell+16<=total_bits) + if(!hybrid && tell+16<=total_bits) ec_enc_bit_logp(enc, 0, 1); } else { /*This block is not gated by a total bits check only because @@ -1589,8 +1692,12 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, shortBlocks = 0; if (st->complexity >= 1 && !st->lfe) { + /* Reduces the likelihood of energy instability on fricatives at low bitrate + in hybrid mode. It seems like we still want to have real transients on vowels + though (small SILK quantization offset value). */ + int allow_weak_transients = hybrid && effectiveBytes<15 && st->silk_info.signalType != 2; isTransient = transient_analysis(in, N+overlap, CC, - &tf_estimate, &tf_chan); + &tf_estimate, &tf_chan, allow_weak_transients, &weak_transient); } if (LM>0 && ec_tell(enc)+3<=total_bits) { @@ -1610,16 +1717,19 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, if (secondMdct) { compute_mdcts(mode, 0, in, freq, C, CC, LM, st->upsample, st->arch); - compute_band_energies(mode, freq, bandE, effEnd, C, LM); + compute_band_energies(mode, freq, bandE, effEnd, C, LM, st->arch); amp2Log2(mode, effEnd, end, bandE, bandLogE2, C); for (i=0;i<C*nbEBands;i++) bandLogE2[i] += HALF16(SHL16(LM, DB_SHIFT)); } compute_mdcts(mode, shortBlocks, in, freq, C, CC, LM, st->upsample, st->arch); + /* This should catch any NaN in the CELT input. Since we're not supposed to see any (they're filtered + at the Opus layer), just abort. */ + celt_assert(!celt_isnan(freq[0]) && (C==1 || !celt_isnan(freq[N]))); if (CC==2&&C==1) tf_chan = 0; - compute_band_energies(mode, freq, bandE, effEnd, C, LM); + compute_band_energies(mode, freq, bandE, effEnd, C, LM, st->arch); if (st->lfe) { @@ -1634,7 +1744,7 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, ALLOC(surround_dynalloc, C*nbEBands, opus_val16); OPUS_CLEAR(surround_dynalloc, end); /* This computes how much masking takes place between surround channels */ - if (start==0&&st->energy_mask&&!st->lfe) + if (!hybrid&&st->energy_mask&&!st->lfe) { int mask_end; int midband; @@ -1736,14 +1846,14 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, /* Last chance to catch any transient we might have missed in the time-domain analysis */ - if (LM>0 && ec_tell(enc)+3<=total_bits && !isTransient && st->complexity>=5 && !st->lfe) + if (LM>0 && ec_tell(enc)+3<=total_bits && !isTransient && st->complexity>=5 && !st->lfe && !hybrid) { if (patch_transient_decision(bandLogE, oldBandE, nbEBands, start, end, C)) { isTransient = 1; shortBlocks = M; compute_mdcts(mode, shortBlocks, in, freq, C, CC, LM, st->upsample, st->arch); - compute_band_energies(mode, freq, bandE, effEnd, C, LM); + compute_band_energies(mode, freq, bandE, effEnd, C, LM, st->arch); amp2Log2(mode, effEnd, end, bandE, bandLogE, C); /* Compensate for the scaling of short vs long mdcts */ for (i=0;i<C*nbEBands;i++) @@ -1760,31 +1870,59 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, /* Band normalisation */ normalise_bands(mode, freq, X, bandE, effEnd, C, M); + enable_tf_analysis = effectiveBytes>=15*C && !hybrid && st->complexity>=2 && !st->lfe; + + ALLOC(offsets, nbEBands, int); + ALLOC(importance, nbEBands, int); + ALLOC(spread_weight, nbEBands, int); + + maxDepth = dynalloc_analysis(bandLogE, bandLogE2, nbEBands, start, end, C, offsets, + st->lsb_depth, mode->logN, isTransient, st->vbr, st->constrained_vbr, + eBands, LM, effectiveBytes, &tot_boost, st->lfe, surround_dynalloc, &st->analysis, importance, spread_weight); + ALLOC(tf_res, nbEBands, int); /* Disable variable tf resolution for hybrid and at very low bitrate */ - if (effectiveBytes>=15*C && start==0 && st->complexity>=2 && !st->lfe) + if (enable_tf_analysis) { int lambda; - if (effectiveBytes<40) - lambda = 12; - else if (effectiveBytes<60) - lambda = 6; - else if (effectiveBytes<100) - lambda = 4; - else - lambda = 3; - lambda*=2; - tf_select = tf_analysis(mode, effEnd, isTransient, tf_res, lambda, X, N, LM, &tf_sum, tf_estimate, tf_chan); + lambda = IMAX(80, 20480/effectiveBytes + 2); + tf_select = tf_analysis(mode, effEnd, isTransient, tf_res, lambda, X, N, LM, tf_estimate, tf_chan, importance); for (i=effEnd;i<end;i++) tf_res[i] = tf_res[effEnd-1]; + } else if (hybrid && weak_transient) + { + /* For weak transients, we rely on the fact that improving time resolution using + TF on a long window is imperfect and will not result in an energy collapse at + low bitrate. */ + for (i=0;i<end;i++) + tf_res[i] = 1; + tf_select=0; + } else if (hybrid && effectiveBytes<15 && st->silk_info.signalType != 2) + { + /* For low bitrate hybrid, we force temporal resolution to 5 ms rather than 2.5 ms. */ + for (i=0;i<end;i++) + tf_res[i] = 0; + tf_select=isTransient; } else { - tf_sum = 0; for (i=0;i<end;i++) tf_res[i] = isTransient; tf_select=0; } ALLOC(error, C*nbEBands, opus_val16); + c=0; + do { + for (i=start;i<end;i++) + { + /* When the energy is stable, slightly bias energy quantization towards + the previous error to make the gain more stable (a constant offset is + better than fluctuations). */ + if (ABS32(SUB32(bandLogE[i+c*nbEBands], oldBandE[i+c*nbEBands])) < QCONST16(2.f, DB_SHIFT)) + { + bandLogE[i+c*nbEBands] -= MULT16_16_Q15(energyError[i+c*nbEBands], QCONST16(0.25f, 15)); + } + } + } while (++c < C); quant_coarse_energy(mode, start, end, effEnd, bandLogE, oldBandE, total_bits, error, enc, C, LM, nbAvailableBytes, st->force_intra, @@ -1798,7 +1936,15 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, { st->tapset_decision = 0; st->spread_decision = SPREAD_NORMAL; - } else if (shortBlocks || st->complexity < 3 || nbAvailableBytes < 10*C || start != 0) + } else if (hybrid) + { + if (st->complexity == 0) + st->spread_decision = SPREAD_NONE; + else if (isTransient) + st->spread_decision = SPREAD_NORMAL; + else + st->spread_decision = SPREAD_AGGRESSIVE; + } else if (shortBlocks || st->complexity < 3 || nbAvailableBytes < 10*C) { if (st->complexity == 0) st->spread_decision = SPREAD_NONE; @@ -1822,7 +1968,7 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, { st->spread_decision = spreading_decision(mode, X, &st->tonal_average, st->spread_decision, &st->hf_average, - &st->tapset_decision, pf_on&&!shortBlocks, effEnd, C, M); + &st->tapset_decision, pf_on&&!shortBlocks, effEnd, C, M, spread_weight); } /*printf("%d %d\n", st->tapset_decision, st->spread_decision);*/ /*printf("%f %d %f %d\n\n", st->analysis.tonality, st->spread_decision, st->analysis.tonality_slope, st->tapset_decision);*/ @@ -1830,11 +1976,6 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, ec_enc_icdf(enc, st->spread_decision, spread_icdf, 5); } - ALLOC(offsets, nbEBands, int); - - maxDepth = dynalloc_analysis(bandLogE, bandLogE2, nbEBands, start, end, C, offsets, - st->lsb_depth, mode->logN, isTransient, st->vbr, st->constrained_vbr, - eBands, LM, effectiveBytes, &tot_boost, st->lfe, surround_dynalloc); /* For LFE, everything interesting is in the first band */ if (st->lfe) offsets[0] = IMIN(8, effectiveBytes/3); @@ -1896,12 +2037,15 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, alloc_trim = 5; if (tell+(6<<BITRES) <= total_bits - total_boost) { - if (st->lfe) + if (start > 0 || st->lfe) + { + st->stereo_saving = 0; alloc_trim = 5; - else + } else { alloc_trim = alloc_trim_analysis(mode, X, bandLogE, end, LM, C, N, &st->analysis, &st->stereo_saving, tf_estimate, - st->intensity, surround_trim, st->arch); + st->intensity, surround_trim, equiv_rate, st->arch); + } ec_enc_icdf(enc, alloc_trim, trim_icdf, 7); tell = ec_tell_frac(enc); } @@ -1919,17 +2063,36 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, /* Don't attempt to use more than 510 kb/s, even for frames smaller than 20 ms. The CELT allocator will just not be able to use more than that anyway. */ nbCompressedBytes = IMIN(nbCompressedBytes,1275>>(3-LM)); - base_target = vbr_rate - ((40*C+20)<<BITRES); + if (!hybrid) + { + base_target = vbr_rate - ((40*C+20)<<BITRES); + } else { + base_target = IMAX(0, vbr_rate - ((9*C+4)<<BITRES)); + } if (st->constrained_vbr) base_target += (st->vbr_offset>>lm_diff); - target = compute_vbr(mode, &st->analysis, base_target, LM, equiv_rate, + if (!hybrid) + { + target = compute_vbr(mode, &st->analysis, base_target, LM, equiv_rate, st->lastCodedBands, C, st->intensity, st->constrained_vbr, st->stereo_saving, tot_boost, tf_estimate, pitch_change, maxDepth, - st->variable_duration, st->lfe, st->energy_mask!=NULL, surround_masking, + st->lfe, st->energy_mask!=NULL, surround_masking, temporal_vbr); - + } else { + target = base_target; + /* Tonal frames (offset<100) need more bits than noisy (offset>100) ones. */ + if (st->silk_info.offset < 100) target += 12 << BITRES >> (3-LM); + if (st->silk_info.offset > 100) target -= 18 << BITRES >> (3-LM); + /* Boosting bitrate on transients and vowels with significant temporal + spikes. */ + target += (opus_int32)MULT16_16_Q14(tf_estimate-QCONST16(.25f,14), (50<<BITRES)); + /* If we have a strong transient, let's make sure it has enough bits to code + the first two bands, so that it can use folding rather than noise. */ + if (tf_estimate > QCONST16(.7f,14)) + target = IMAX(target, 50<<BITRES); + } /* The current offset is removed from the target and the space used so far is added*/ target=target+tell; @@ -1937,11 +2100,16 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, result in the encoder running out of bits. The margin of 2 bytes ensures that none of the bust-prevention logic in the decoder will have triggered so far. */ - min_allowed = ((tell+total_boost+(1<<(BITRES+3))-1)>>(BITRES+3)) + 2 - nbFilledBytes; + min_allowed = ((tell+total_boost+(1<<(BITRES+3))-1)>>(BITRES+3)) + 2; + /* Take into account the 37 bits we need to have left in the packet to + signal a redundant frame in hybrid mode. Creating a shorter packet would + create an entropy coder desync. */ + if (hybrid) + min_allowed = IMAX(min_allowed, (tell0_frac+(37<<BITRES)+total_boost+(1<<(BITRES+3))-1)>>(BITRES+3)); nbAvailableBytes = (target+(1<<(BITRES+2)))>>(BITRES+3); nbAvailableBytes = IMAX(min_allowed,nbAvailableBytes); - nbAvailableBytes = IMIN(nbCompressedBytes,nbAvailableBytes+nbFilledBytes) - nbFilledBytes; + nbAvailableBytes = IMIN(nbCompressedBytes,nbAvailableBytes); /* By how much did we "miss" the target on that frame */ delta = target - vbr_rate; @@ -1988,7 +2156,7 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, st->vbr_reservoir = 0; /*printf ("+%d\n", adjust);*/ } - nbCompressedBytes = IMIN(nbCompressedBytes,nbAvailableBytes+nbFilledBytes); + nbCompressedBytes = IMIN(nbCompressedBytes,nbAvailableBytes); /*printf("%d\n", nbCompressedBytes*50*8);*/ /* This moves the raw bits to take into account the new compressed size */ ec_enc_shrink(enc, nbCompressedBytes); @@ -2023,7 +2191,7 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, #endif if (st->lfe) signalBandwidth = 1; - codedBands = compute_allocation(mode, start, end, offsets, cap, + codedBands = clt_compute_allocation(mode, start, end, offsets, cap, alloc_trim, &st->intensity, &dual_stereo, bits, &balance, pulses, fine_quant, fine_priority, C, LM, enc, 1, st->lastCodedBands, signalBandwidth); if (st->lastCodedBands) @@ -2038,7 +2206,7 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, quant_all_bands(1, mode, start, end, X, C==2 ? X+N : NULL, collapse_masks, bandE, pulses, shortBlocks, st->spread_decision, dual_stereo, st->intensity, tf_res, nbCompressedBytes*(8<<BITRES)-anti_collapse_rsv, - balance, enc, LM, codedBands, &st->rng, st->arch); + balance, enc, LM, codedBands, &st->rng, st->complexity, st->arch, st->disable_inv); if (anti_collapse_rsv > 0) { @@ -2049,6 +2217,14 @@ int celt_encode_with_ec(CELTEncoder * OPUS_RESTRICT st, const opus_val16 * pcm, ec_enc_bits(enc, anti_collapse_on, 1); } quant_energy_finalise(mode, start, end, oldBandE, error, fine_quant, fine_priority, nbCompressedBytes*8-ec_tell(enc), enc, C); + OPUS_CLEAR(energyError, nbEBands*CC); + c=0; + do { + for (i=start;i<end;i++) + { + energyError[i+c*nbEBands] = MAX16(-QCONST16(0.5f, 15), MIN16(QCONST16(0.5f, 15), error[i+c*nbEBands])); + } + } while (++c < C); if (silence) { @@ -2321,10 +2497,24 @@ int opus_custom_encoder_ctl(CELTEncoder * OPUS_RESTRICT st, int request, ...) *value=st->lsb_depth; } break; - case OPUS_SET_EXPERT_FRAME_DURATION_REQUEST: + case OPUS_SET_PHASE_INVERSION_DISABLED_REQUEST: { opus_int32 value = va_arg(ap, opus_int32); - st->variable_duration = value; + if(value<0 || value>1) + { + goto bad_arg; + } + st->disable_inv = value; + } + break; + case OPUS_GET_PHASE_INVERSION_DISABLED_REQUEST: + { + opus_int32 *value = va_arg(ap, opus_int32*); + if (!value) + { + goto bad_arg; + } + *value = st->disable_inv; } break; case OPUS_RESET_STATE: @@ -2368,6 +2558,13 @@ int opus_custom_encoder_ctl(CELTEncoder * OPUS_RESTRICT st, int request, ...) OPUS_COPY(&st->analysis, info, 1); } break; + case CELT_SET_SILK_INFO_REQUEST: + { + SILKInfo *info = va_arg(ap, SILKInfo *); + if (info) + OPUS_COPY(&st->silk_info, info, 1); + } + break; case CELT_GET_MODE_REQUEST: { const CELTMode ** value = va_arg(ap, const CELTMode**); diff --git a/thirdparty/opus/celt/celt_lpc.c b/thirdparty/opus/celt/celt_lpc.c index b410a21c5f..8ecb693ee9 100644 --- a/thirdparty/opus/celt/celt_lpc.c +++ b/thirdparty/opus/celt/celt_lpc.c @@ -89,58 +89,40 @@ int p void celt_fir_c( - const opus_val16 *_x, + const opus_val16 *x, const opus_val16 *num, - opus_val16 *_y, + opus_val16 *y, int N, int ord, - opus_val16 *mem, int arch) { int i,j; VARDECL(opus_val16, rnum); - VARDECL(opus_val16, x); SAVE_STACK; - + celt_assert(x != y); ALLOC(rnum, ord, opus_val16); - ALLOC(x, N+ord, opus_val16); for(i=0;i<ord;i++) rnum[i] = num[ord-i-1]; - for(i=0;i<ord;i++) - x[i] = mem[ord-i-1]; - for (i=0;i<N;i++) - x[i+ord]=_x[i]; - for(i=0;i<ord;i++) - mem[i] = _x[N-i-1]; -#ifdef SMALL_FOOTPRINT - (void)arch; - for (i=0;i<N;i++) - { - opus_val32 sum = SHL32(EXTEND32(_x[i]), SIG_SHIFT); - for (j=0;j<ord;j++) - { - sum = MAC16_16(sum,rnum[j],x[i+j]); - } - _y[i] = SATURATE16(PSHR32(sum, SIG_SHIFT)); - } -#else for (i=0;i<N-3;i+=4) { - opus_val32 sum[4]={0,0,0,0}; - xcorr_kernel(rnum, x+i, sum, ord, arch); - _y[i ] = SATURATE16(ADD32(EXTEND32(_x[i ]), PSHR32(sum[0], SIG_SHIFT))); - _y[i+1] = SATURATE16(ADD32(EXTEND32(_x[i+1]), PSHR32(sum[1], SIG_SHIFT))); - _y[i+2] = SATURATE16(ADD32(EXTEND32(_x[i+2]), PSHR32(sum[2], SIG_SHIFT))); - _y[i+3] = SATURATE16(ADD32(EXTEND32(_x[i+3]), PSHR32(sum[3], SIG_SHIFT))); + opus_val32 sum[4]; + sum[0] = SHL32(EXTEND32(x[i ]), SIG_SHIFT); + sum[1] = SHL32(EXTEND32(x[i+1]), SIG_SHIFT); + sum[2] = SHL32(EXTEND32(x[i+2]), SIG_SHIFT); + sum[3] = SHL32(EXTEND32(x[i+3]), SIG_SHIFT); + xcorr_kernel(rnum, x+i-ord, sum, ord, arch); + y[i ] = ROUND16(sum[0], SIG_SHIFT); + y[i+1] = ROUND16(sum[1], SIG_SHIFT); + y[i+2] = ROUND16(sum[2], SIG_SHIFT); + y[i+3] = ROUND16(sum[3], SIG_SHIFT); } for (;i<N;i++) { - opus_val32 sum = 0; + opus_val32 sum = SHL32(EXTEND32(x[i]), SIG_SHIFT); for (j=0;j<ord;j++) - sum = MAC16_16(sum,rnum[j],x[i+j]); - _y[i] = SATURATE16(ADD32(EXTEND32(_x[i]), PSHR32(sum, SIG_SHIFT))); + sum = MAC16_16(sum,rnum[j],x[i+j-ord]); + y[i] = ROUND16(sum, SIG_SHIFT); } -#endif RESTORE_STACK; } @@ -166,7 +148,7 @@ void celt_iir(const opus_val32 *_x, { mem[j]=mem[j-1]; } - mem[0] = ROUND16(sum,SIG_SHIFT); + mem[0] = SROUND16(sum, SIG_SHIFT); _y[i] = sum; } #else @@ -195,20 +177,20 @@ void celt_iir(const opus_val32 *_x, xcorr_kernel(rden, y+i, sum, ord, arch); /* Patch up the result to compensate for the fact that this is an IIR */ - y[i+ord ] = -ROUND16(sum[0],SIG_SHIFT); + y[i+ord ] = -SROUND16(sum[0],SIG_SHIFT); _y[i ] = sum[0]; sum[1] = MAC16_16(sum[1], y[i+ord ], den[0]); - y[i+ord+1] = -ROUND16(sum[1],SIG_SHIFT); + y[i+ord+1] = -SROUND16(sum[1],SIG_SHIFT); _y[i+1] = sum[1]; sum[2] = MAC16_16(sum[2], y[i+ord+1], den[0]); sum[2] = MAC16_16(sum[2], y[i+ord ], den[1]); - y[i+ord+2] = -ROUND16(sum[2],SIG_SHIFT); + y[i+ord+2] = -SROUND16(sum[2],SIG_SHIFT); _y[i+2] = sum[2]; sum[3] = MAC16_16(sum[3], y[i+ord+2], den[0]); sum[3] = MAC16_16(sum[3], y[i+ord+1], den[1]); sum[3] = MAC16_16(sum[3], y[i+ord ], den[2]); - y[i+ord+3] = -ROUND16(sum[3],SIG_SHIFT); + y[i+ord+3] = -SROUND16(sum[3],SIG_SHIFT); _y[i+3] = sum[3]; } for (;i<N;i++) @@ -216,7 +198,7 @@ void celt_iir(const opus_val32 *_x, opus_val32 sum = _x[i]; for (j=0;j<ord;j++) sum -= MULT16_16(rden[j],y[i+j]); - y[i+ord] = ROUND16(sum,SIG_SHIFT); + y[i+ord] = SROUND16(sum,SIG_SHIFT); _y[i] = sum; } for(i=0;i<ord;i++) diff --git a/thirdparty/opus/celt/celt_lpc.h b/thirdparty/opus/celt/celt_lpc.h index 323459eb1a..a4c5fd6ea5 100644 --- a/thirdparty/opus/celt/celt_lpc.h +++ b/thirdparty/opus/celt/celt_lpc.h @@ -45,12 +45,11 @@ void celt_fir_c( opus_val16 *y, int N, int ord, - opus_val16 *mem, int arch); #if !defined(OVERRIDE_CELT_FIR) -#define celt_fir(x, num, y, N, ord, mem, arch) \ - (celt_fir_c(x, num, y, N, ord, mem, arch)) +#define celt_fir(x, num, y, N, ord, arch) \ + (celt_fir_c(x, num, y, N, ord, arch)) #endif void celt_iir(const opus_val32 *x, diff --git a/thirdparty/opus/celt/cwrs.c b/thirdparty/opus/celt/cwrs.c index 9722f0ac86..a552e4f0fb 100644 --- a/thirdparty/opus/celt/cwrs.c +++ b/thirdparty/opus/celt/cwrs.c @@ -482,7 +482,7 @@ static opus_val32 cwrsi(int _n,int _k,opus_uint32 _i,int *_y){ k0=_k; q=row[_n]; if(q>_i){ - celt_assert(p>q); + celt_sig_assert(p>q); _k=_n; do p=CELT_PVQ_U_ROW[--_k][_n]; while(p>_i); diff --git a/thirdparty/opus/celt/entcode.h b/thirdparty/opus/celt/entcode.h index 13d6c84ef0..3763e3f284 100644 --- a/thirdparty/opus/celt/entcode.h +++ b/thirdparty/opus/celt/entcode.h @@ -122,7 +122,7 @@ opus_uint32 ec_tell_frac(ec_ctx *_this); /* Tested exhaustively for all n and for 1<=d<=256 */ static OPUS_INLINE opus_uint32 celt_udiv(opus_uint32 n, opus_uint32 d) { - celt_assert(d>0); + celt_sig_assert(d>0); #ifdef USE_SMALL_DIV_TABLE if (d>256) return n/d; @@ -138,7 +138,7 @@ static OPUS_INLINE opus_uint32 celt_udiv(opus_uint32 n, opus_uint32 d) { } static OPUS_INLINE opus_int32 celt_sudiv(opus_int32 n, opus_int32 d) { - celt_assert(d>0); + celt_sig_assert(d>0); #ifdef USE_SMALL_DIV_TABLE if (n<0) return -(opus_int32)celt_udiv(-n, d); diff --git a/thirdparty/opus/celt/entdec.h b/thirdparty/opus/celt/entdec.h index d8ab318730..025fc1870d 100644 --- a/thirdparty/opus/celt/entdec.h +++ b/thirdparty/opus/celt/entdec.h @@ -85,7 +85,7 @@ int ec_dec_icdf(ec_dec *_this,const unsigned char *_icdf,unsigned _ftb); The bits must have been encoded with ec_enc_uint(). No call to ec_dec_update() is necessary after this call. _ft: The number of integers that can be decoded (one more than the max). - This must be at least one, and no more than 2**32-1. + This must be at least 2, and no more than 2**32-1. Return: The decoded bits.*/ opus_uint32 ec_dec_uint(ec_dec *_this,opus_uint32 _ft); diff --git a/thirdparty/opus/celt/entenc.h b/thirdparty/opus/celt/entenc.h index 796bc4d572..f502eaf662 100644 --- a/thirdparty/opus/celt/entenc.h +++ b/thirdparty/opus/celt/entenc.h @@ -67,7 +67,7 @@ void ec_enc_icdf(ec_enc *_this,int _s,const unsigned char *_icdf,unsigned _ftb); /*Encodes a raw unsigned integer in the stream. _fl: The integer to encode. _ft: The number of integers that can be encoded (one more than the max). - This must be at least one, and no more than 2**32-1.*/ + This must be at least 2, and no more than 2**32-1.*/ void ec_enc_uint(ec_enc *_this,opus_uint32 _fl,opus_uint32 _ft); /*Encodes a sequence of raw bits in the stream. diff --git a/thirdparty/opus/celt/fixed_debug.h b/thirdparty/opus/celt/fixed_debug.h index d28227f5dc..f435295234 100644 --- a/thirdparty/opus/celt/fixed_debug.h +++ b/thirdparty/opus/celt/fixed_debug.h @@ -59,6 +59,14 @@ extern opus_int64 celt_mips; #define SHR(a,b) SHR32(a,b) #define PSHR(a,b) PSHR32(a,b) +/** Add two 32-bit values, ignore any overflows */ +#define ADD32_ovflw(a,b) (celt_mips+=2,(opus_val32)((opus_uint32)(a)+(opus_uint32)(b))) +/** Subtract two 32-bit values, ignore any overflows */ +#define SUB32_ovflw(a,b) (celt_mips+=2,(opus_val32)((opus_uint32)(a)-(opus_uint32)(b))) +/* Avoid MSVC warning C4146: unary minus operator applied to unsigned type */ +/** Negate 32-bit value, ignore any overflows */ +#define NEG32_ovflw(a) (celt_mips+=2,(opus_val32)(0-(opus_uint32)(a))) + static OPUS_INLINE short NEG16(int x) { int res; @@ -227,12 +235,11 @@ static OPUS_INLINE int SHL32_(opus_int64 a, int shift, char *file, int line) #define VSHR32(a, shift) (((shift)>0) ? SHR32(a, shift) : SHL32(a, -(shift))) #define ROUND16(x,a) (celt_mips--,EXTRACT16(PSHR32((x),(a)))) +#define SROUND16(x,a) (celt_mips--,EXTRACT16(SATURATE(PSHR32(x,a), 32767))); + #define HALF16(x) (SHR16(x,1)) #define HALF32(x) (SHR32(x,1)) -//#define SHR(a,shift) ((a) >> (shift)) -//#define SHL(a,shift) ((a) << (shift)) - #define ADD16(a, b) ADD16_(a, b, __FILE__, __LINE__) static OPUS_INLINE short ADD16_(int a, int b, char *file, int line) { diff --git a/thirdparty/opus/celt/fixed_generic.h b/thirdparty/opus/celt/fixed_generic.h index 1cfd6d6989..5f4abda76e 100644 --- a/thirdparty/opus/celt/fixed_generic.h +++ b/thirdparty/opus/celt/fixed_generic.h @@ -104,6 +104,9 @@ /** Shift by a and round-to-neareast 32-bit value. Result is a 16-bit value */ #define ROUND16(x,a) (EXTRACT16(PSHR32((x),(a)))) +/** Shift by a and round-to-neareast 32-bit value. Result is a saturated 16-bit value */ +#define SROUND16(x,a) EXTRACT16(SATURATE(PSHR32(x,a), 32767)); + /** Divide by two */ #define HALF16(x) (SHR16(x,1)) #define HALF32(x) (SHR32(x,1)) @@ -117,6 +120,14 @@ /** Subtract two 32-bit values */ #define SUB32(a,b) ((opus_val32)(a)-(opus_val32)(b)) +/** Add two 32-bit values, ignore any overflows */ +#define ADD32_ovflw(a,b) ((opus_val32)((opus_uint32)(a)+(opus_uint32)(b))) +/** Subtract two 32-bit values, ignore any overflows */ +#define SUB32_ovflw(a,b) ((opus_val32)((opus_uint32)(a)-(opus_uint32)(b))) +/* Avoid MSVC warning C4146: unary minus operator applied to unsigned type */ +/** Negate 32-bit value, ignore any overflows */ +#define NEG32_ovflw(a) ((opus_val32)(0-(opus_uint32)(a))) + /** 16x16 multiplication where the result fits in 16 bits */ #define MULT16_16_16(a,b) ((((opus_val16)(a))*((opus_val16)(b)))) diff --git a/thirdparty/opus/celt/float_cast.h b/thirdparty/opus/celt/float_cast.h index ed5a39b543..889dae965f 100644 --- a/thirdparty/opus/celt/float_cast.h +++ b/thirdparty/opus/celt/float_cast.h @@ -61,7 +61,13 @@ ** the config.h file. */ -#if (HAVE_LRINTF) +/* With GCC, when SSE is available, the fastest conversion is cvtss2si. */ +#if defined(__GNUC__) && defined(__SSE__) + +#include <xmmintrin.h> +static OPUS_INLINE opus_int32 float2int(float x) {return _mm_cvt_ss2si(_mm_set_ss(x));} + +#elif defined(HAVE_LRINTF) /* These defines enable functionality introduced with the 1999 ISO C ** standard. They must be defined before the inclusion of math.h to @@ -90,10 +96,10 @@ #include <math.h> #define float2int(x) lrint(x) -#elif (defined(_MSC_VER) && _MSC_VER >= 1400) && defined (_M_X64) +#elif (defined(_MSC_VER) && _MSC_VER >= 1400) && (defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)) #include <xmmintrin.h> - __inline long int float2int(float value) + static __inline long int float2int(float value) { return _mm_cvtss_si32(_mm_load_ss(&value)); } @@ -104,7 +110,7 @@ ** Therefore implement OPUS_INLINE versions of these functions here. */ - __inline long int + static __inline long int float2int (float flt) { int intgr; diff --git a/thirdparty/opus/celt/kiss_fft.c b/thirdparty/opus/celt/kiss_fft.c index 1f8fd05321..83775165d8 100644 --- a/thirdparty/opus/celt/kiss_fft.c +++ b/thirdparty/opus/celt/kiss_fft.c @@ -82,8 +82,8 @@ static void kf_bfly2( C_SUB( Fout2[0] , Fout[0] , t ); C_ADDTO( Fout[0] , t ); - t.r = S_MUL(Fout2[1].r+Fout2[1].i, tw); - t.i = S_MUL(Fout2[1].i-Fout2[1].r, tw); + t.r = S_MUL(ADD32_ovflw(Fout2[1].r, Fout2[1].i), tw); + t.i = S_MUL(SUB32_ovflw(Fout2[1].i, Fout2[1].r), tw); C_SUB( Fout2[1] , Fout[1] , t ); C_ADDTO( Fout[1] , t ); @@ -92,8 +92,8 @@ static void kf_bfly2( C_SUB( Fout2[2] , Fout[2] , t ); C_ADDTO( Fout[2] , t ); - t.r = S_MUL(Fout2[3].i-Fout2[3].r, tw); - t.i = S_MUL(-Fout2[3].i-Fout2[3].r, tw); + t.r = S_MUL(SUB32_ovflw(Fout2[3].i, Fout2[3].r), tw); + t.i = S_MUL(NEG32_ovflw(ADD32_ovflw(Fout2[3].i, Fout2[3].r)), tw); C_SUB( Fout2[3] , Fout[3] , t ); C_ADDTO( Fout[3] , t ); Fout += 8; @@ -126,10 +126,10 @@ static void kf_bfly4( C_ADDTO( *Fout , scratch1 ); C_SUB( scratch1 , Fout[1] , Fout[3] ); - Fout[1].r = scratch0.r + scratch1.i; - Fout[1].i = scratch0.i - scratch1.r; - Fout[3].r = scratch0.r - scratch1.i; - Fout[3].i = scratch0.i + scratch1.r; + Fout[1].r = ADD32_ovflw(scratch0.r, scratch1.i); + Fout[1].i = SUB32_ovflw(scratch0.i, scratch1.r); + Fout[3].r = SUB32_ovflw(scratch0.r, scratch1.i); + Fout[3].i = ADD32_ovflw(scratch0.i, scratch1.r); Fout+=4; } } else { @@ -160,10 +160,10 @@ static void kf_bfly4( tw3 += fstride*3; C_ADDTO( *Fout , scratch[3] ); - Fout[m].r = scratch[5].r + scratch[4].i; - Fout[m].i = scratch[5].i - scratch[4].r; - Fout[m3].r = scratch[5].r - scratch[4].i; - Fout[m3].i = scratch[5].i + scratch[4].r; + Fout[m].r = ADD32_ovflw(scratch[5].r, scratch[4].i); + Fout[m].i = SUB32_ovflw(scratch[5].i, scratch[4].r); + Fout[m3].r = SUB32_ovflw(scratch[5].r, scratch[4].i); + Fout[m3].i = ADD32_ovflw(scratch[5].i, scratch[4].r); ++Fout; } } @@ -212,18 +212,18 @@ static void kf_bfly3( tw1 += fstride; tw2 += fstride*2; - Fout[m].r = Fout->r - HALF_OF(scratch[3].r); - Fout[m].i = Fout->i - HALF_OF(scratch[3].i); + Fout[m].r = SUB32_ovflw(Fout->r, HALF_OF(scratch[3].r)); + Fout[m].i = SUB32_ovflw(Fout->i, HALF_OF(scratch[3].i)); C_MULBYSCALAR( scratch[0] , epi3.i ); C_ADDTO(*Fout,scratch[3]); - Fout[m2].r = Fout[m].r + scratch[0].i; - Fout[m2].i = Fout[m].i - scratch[0].r; + Fout[m2].r = ADD32_ovflw(Fout[m].r, scratch[0].i); + Fout[m2].i = SUB32_ovflw(Fout[m].i, scratch[0].r); - Fout[m].r -= scratch[0].i; - Fout[m].i += scratch[0].r; + Fout[m].r = SUB32_ovflw(Fout[m].r, scratch[0].i); + Fout[m].i = ADD32_ovflw(Fout[m].i, scratch[0].r); ++Fout; } while(--k); @@ -282,22 +282,22 @@ static void kf_bfly5( C_ADD( scratch[8],scratch[2],scratch[3]); C_SUB( scratch[9],scratch[2],scratch[3]); - Fout0->r += scratch[7].r + scratch[8].r; - Fout0->i += scratch[7].i + scratch[8].i; + Fout0->r = ADD32_ovflw(Fout0->r, ADD32_ovflw(scratch[7].r, scratch[8].r)); + Fout0->i = ADD32_ovflw(Fout0->i, ADD32_ovflw(scratch[7].i, scratch[8].i)); - scratch[5].r = scratch[0].r + S_MUL(scratch[7].r,ya.r) + S_MUL(scratch[8].r,yb.r); - scratch[5].i = scratch[0].i + S_MUL(scratch[7].i,ya.r) + S_MUL(scratch[8].i,yb.r); + scratch[5].r = ADD32_ovflw(scratch[0].r, ADD32_ovflw(S_MUL(scratch[7].r,ya.r), S_MUL(scratch[8].r,yb.r))); + scratch[5].i = ADD32_ovflw(scratch[0].i, ADD32_ovflw(S_MUL(scratch[7].i,ya.r), S_MUL(scratch[8].i,yb.r))); - scratch[6].r = S_MUL(scratch[10].i,ya.i) + S_MUL(scratch[9].i,yb.i); - scratch[6].i = -S_MUL(scratch[10].r,ya.i) - S_MUL(scratch[9].r,yb.i); + scratch[6].r = ADD32_ovflw(S_MUL(scratch[10].i,ya.i), S_MUL(scratch[9].i,yb.i)); + scratch[6].i = NEG32_ovflw(ADD32_ovflw(S_MUL(scratch[10].r,ya.i), S_MUL(scratch[9].r,yb.i))); C_SUB(*Fout1,scratch[5],scratch[6]); C_ADD(*Fout4,scratch[5],scratch[6]); - scratch[11].r = scratch[0].r + S_MUL(scratch[7].r,yb.r) + S_MUL(scratch[8].r,ya.r); - scratch[11].i = scratch[0].i + S_MUL(scratch[7].i,yb.r) + S_MUL(scratch[8].i,ya.r); - scratch[12].r = - S_MUL(scratch[10].i,yb.i) + S_MUL(scratch[9].i,ya.i); - scratch[12].i = S_MUL(scratch[10].r,yb.i) - S_MUL(scratch[9].r,ya.i); + scratch[11].r = ADD32_ovflw(scratch[0].r, ADD32_ovflw(S_MUL(scratch[7].r,yb.r), S_MUL(scratch[8].r,ya.r))); + scratch[11].i = ADD32_ovflw(scratch[0].i, ADD32_ovflw(S_MUL(scratch[7].i,yb.r), S_MUL(scratch[8].i,ya.r))); + scratch[12].r = SUB32_ovflw(S_MUL(scratch[9].i,ya.i), S_MUL(scratch[10].i,yb.i)); + scratch[12].i = SUB32_ovflw(S_MUL(scratch[10].r,yb.i), S_MUL(scratch[9].r,ya.i)); C_ADD(*Fout2,scratch[11],scratch[12]); C_SUB(*Fout3,scratch[11],scratch[12]); diff --git a/thirdparty/opus/celt/mathops.c b/thirdparty/opus/celt/mathops.c index 21a01f52e4..6ee9b9e101 100644 --- a/thirdparty/opus/celt/mathops.c +++ b/thirdparty/opus/celt/mathops.c @@ -38,7 +38,8 @@ #include "mathops.h" /*Compute floor(sqrt(_val)) with exact arithmetic. - This has been tested on all possible 32-bit inputs.*/ + _val must be greater than 0. + This has been tested on all possible 32-bit inputs greater than 0.*/ unsigned isqrt32(opus_uint32 _val){ unsigned b; unsigned g; @@ -182,7 +183,7 @@ opus_val32 celt_rcp(opus_val32 x) int i; opus_val16 n; opus_val16 r; - celt_assert2(x>0, "celt_rcp() only defined for positive values"); + celt_sig_assert(x>0); i = celt_ilog2(x); /* n is Q15 with range [0,1). */ n = VSHR32(x,i-15)-32768; diff --git a/thirdparty/opus/celt/mathops.h b/thirdparty/opus/celt/mathops.h index a0525a9610..5e86ff0dd2 100644 --- a/thirdparty/opus/celt/mathops.h +++ b/thirdparty/opus/celt/mathops.h @@ -38,11 +38,44 @@ #include "entcode.h" #include "os_support.h" +#define PI 3.141592653f + /* Multiplies two 16-bit fractional values. Bit-exactness of this macro is important */ #define FRAC_MUL16(a,b) ((16384+((opus_int32)(opus_int16)(a)*(opus_int16)(b)))>>15) unsigned isqrt32(opus_uint32 _val); +/* CELT doesn't need it for fixed-point, by analysis.c does. */ +#if !defined(FIXED_POINT) || defined(ANALYSIS_C) +#define cA 0.43157974f +#define cB 0.67848403f +#define cC 0.08595542f +#define cE ((float)PI/2) +static OPUS_INLINE float fast_atan2f(float y, float x) { + float x2, y2; + x2 = x*x; + y2 = y*y; + /* For very small values, we don't care about the answer, so + we can just return 0. */ + if (x2 + y2 < 1e-18f) + { + return 0; + } + if(x2<y2){ + float den = (y2 + cB*x2) * (y2 + cC*x2); + return -x*y*(y2 + cA*x2) / den + (y<0 ? -cE : cE); + }else{ + float den = (x2 + cB*y2) * (x2 + cC*y2); + return x*y*(x2 + cA*y2) / den + (y<0 ? -cE : cE) - (x*y<0 ? -cE : cE); + } +} +#undef cA +#undef cB +#undef cC +#undef cE +#endif + + #ifndef OVERRIDE_CELT_MAXABS16 static OPUS_INLINE opus_val32 celt_maxabs16(const opus_val16 *x, int len) { @@ -80,7 +113,6 @@ static OPUS_INLINE opus_val32 celt_maxabs32(const opus_val32 *x, int len) #ifndef FIXED_POINT -#define PI 3.141592653f #define celt_sqrt(x) ((float)sqrt(x)) #define celt_rsqrt(x) (1.f/celt_sqrt(x)) #define celt_rsqrt_norm(x) (celt_rsqrt(x)) @@ -147,7 +179,7 @@ static OPUS_INLINE float celt_exp2(float x) /** Integer log in base2. Undefined for zero and negative numbers */ static OPUS_INLINE opus_int16 celt_ilog2(opus_int32 x) { - celt_assert2(x>0, "celt_ilog2() only defined for strictly positive numbers"); + celt_sig_assert(x>0); return EC_ILOG(x)-1; } #endif diff --git a/thirdparty/opus/celt/mdct.c b/thirdparty/opus/celt/mdct.c index 5315ad11a3..5c6dab5b75 100644 --- a/thirdparty/opus/celt/mdct.c +++ b/thirdparty/opus/celt/mdct.c @@ -270,8 +270,8 @@ void clt_mdct_backward_c(const mdct_lookup *l, kiss_fft_scalar *in, kiss_fft_sca int rev; kiss_fft_scalar yr, yi; rev = *bitrev++; - yr = S_MUL(*xp2, t[i]) + S_MUL(*xp1, t[N4+i]); - yi = S_MUL(*xp1, t[i]) - S_MUL(*xp2, t[N4+i]); + yr = ADD32_ovflw(S_MUL(*xp2, t[i]), S_MUL(*xp1, t[N4+i])); + yi = SUB32_ovflw(S_MUL(*xp1, t[i]), S_MUL(*xp2, t[N4+i])); /* We swap real and imag because we use an FFT instead of an IFFT. */ yp[2*rev+1] = yr; yp[2*rev] = yi; @@ -301,8 +301,8 @@ void clt_mdct_backward_c(const mdct_lookup *l, kiss_fft_scalar *in, kiss_fft_sca t0 = t[i]; t1 = t[N4+i]; /* We'd scale up by 2 here, but instead it's done when mixing the windows */ - yr = S_MUL(re,t0) + S_MUL(im,t1); - yi = S_MUL(re,t1) - S_MUL(im,t0); + yr = ADD32_ovflw(S_MUL(re,t0), S_MUL(im,t1)); + yi = SUB32_ovflw(S_MUL(re,t1), S_MUL(im,t0)); /* We swap real and imag because we're using an FFT instead of an IFFT. */ re = yp1[1]; im = yp1[0]; @@ -312,8 +312,8 @@ void clt_mdct_backward_c(const mdct_lookup *l, kiss_fft_scalar *in, kiss_fft_sca t0 = t[(N4-i-1)]; t1 = t[(N2-i-1)]; /* We'd scale up by 2 here, but instead it's done when mixing the windows */ - yr = S_MUL(re,t0) + S_MUL(im,t1); - yi = S_MUL(re,t1) - S_MUL(im,t0); + yr = ADD32_ovflw(S_MUL(re,t0), S_MUL(im,t1)); + yi = SUB32_ovflw(S_MUL(re,t1), S_MUL(im,t0)); yp1[0] = yr; yp0[1] = yi; yp0 += 2; @@ -333,8 +333,8 @@ void clt_mdct_backward_c(const mdct_lookup *l, kiss_fft_scalar *in, kiss_fft_sca kiss_fft_scalar x1, x2; x1 = *xp1; x2 = *yp1; - *yp1++ = MULT16_32_Q15(*wp2, x2) - MULT16_32_Q15(*wp1, x1); - *xp1-- = MULT16_32_Q15(*wp1, x2) + MULT16_32_Q15(*wp2, x1); + *yp1++ = SUB32_ovflw(MULT16_32_Q15(*wp2, x2), MULT16_32_Q15(*wp1, x1)); + *xp1-- = ADD32_ovflw(MULT16_32_Q15(*wp1, x2), MULT16_32_Q15(*wp2, x1)); wp1++; wp2--; } diff --git a/thirdparty/opus/celt/mips/celt_mipsr1.h b/thirdparty/opus/celt/mips/celt_mipsr1.h index e85661a661..c332fe0471 100644 --- a/thirdparty/opus/celt/mips/celt_mipsr1.h +++ b/thirdparty/opus/celt/mips/celt_mipsr1.h @@ -53,6 +53,7 @@ #include "celt_lpc.h" #include "vq.h" +#define OVERRIDE_COMB_FILTER_CONST #define OVERRIDE_comb_filter void comb_filter(opus_val32 *y, opus_val32 *x, int T0, int T1, int N, opus_val16 g0, opus_val16 g1, int tapset0, int tapset1, diff --git a/thirdparty/opus/celt/mips/vq_mipsr1.h b/thirdparty/opus/celt/mips/vq_mipsr1.h index 54cef86133..f26a33e755 100644 --- a/thirdparty/opus/celt/mips/vq_mipsr1.h +++ b/thirdparty/opus/celt/mips/vq_mipsr1.h @@ -36,11 +36,6 @@ #include "mathops.h" #include "arch.h" -static unsigned extract_collapse_mask(int *iy, int N, int B); -static void normalise_residual(int * OPUS_RESTRICT iy, celt_norm * OPUS_RESTRICT X, int N, opus_val32 Ryy, opus_val16 gain); -static void exp_rotation(celt_norm *X, int len, int dir, int stride, int K, int spread); -static void renormalise_vector_mips(celt_norm *X, int N, opus_val16 gain, int arch); - #define OVERRIDE_vq_exp_rotation1 static void exp_rotation1(celt_norm *X, int len, int stride, opus_val16 c, opus_val16 s) { @@ -69,11 +64,7 @@ static void exp_rotation1(celt_norm *X, int len, int stride, opus_val16 c, opus_ } #define OVERRIDE_renormalise_vector - -#define renormalise_vector(X, N, gain, arch) \ - (renormalise_vector_mips(X, N, gain, arch)) - -void renormalise_vector_mips(celt_norm *X, int N, opus_val16 gain, int arch) +void renormalise_vector(celt_norm *X, int N, opus_val16 gain, int arch) { int i; #ifdef FIXED_POINT diff --git a/thirdparty/opus/celt/modes.c b/thirdparty/opus/celt/modes.c index 911686e905..390c5e8aeb 100644 --- a/thirdparty/opus/celt/modes.c +++ b/thirdparty/opus/celt/modes.c @@ -427,7 +427,7 @@ void opus_custom_mode_destroy(CELTMode *mode) } #endif /* CUSTOM_MODES_ONLY */ opus_free((opus_int16*)mode->eBands); - opus_free((opus_int16*)mode->allocVectors); + opus_free((unsigned char*)mode->allocVectors); opus_free((opus_val16*)mode->window); opus_free((opus_int16*)mode->logN); diff --git a/thirdparty/opus/celt/pitch.c b/thirdparty/opus/celt/pitch.c index bf46e7d562..872582a48a 100644 --- a/thirdparty/opus/celt/pitch.c +++ b/thirdparty/opus/celt/pitch.c @@ -102,11 +102,9 @@ static void find_best_pitch(opus_val32 *xcorr, opus_val16 *y, int len, } } -static void celt_fir5(const opus_val16 *x, +static void celt_fir5(opus_val16 *x, const opus_val16 *num, - opus_val16 *y, - int N, - opus_val16 *mem) + int N) { int i; opus_val16 num0, num1, num2, num3, num4; @@ -116,11 +114,11 @@ static void celt_fir5(const opus_val16 *x, num2=num[2]; num3=num[3]; num4=num[4]; - mem0=mem[0]; - mem1=mem[1]; - mem2=mem[2]; - mem3=mem[3]; - mem4=mem[4]; + mem0=0; + mem1=0; + mem2=0; + mem3=0; + mem4=0; for (i=0;i<N;i++) { opus_val32 sum = SHL32(EXTEND32(x[i]), SIG_SHIFT); @@ -134,13 +132,8 @@ static void celt_fir5(const opus_val16 *x, mem2 = mem1; mem1 = mem0; mem0 = x[i]; - y[i] = ROUND16(sum, SIG_SHIFT); + x[i] = ROUND16(sum, SIG_SHIFT); } - mem[0]=mem0; - mem[1]=mem1; - mem[2]=mem2; - mem[3]=mem3; - mem[4]=mem4; } @@ -150,7 +143,7 @@ void pitch_downsample(celt_sig * OPUS_RESTRICT x[], opus_val16 * OPUS_RESTRICT x int i; opus_val32 ac[5]; opus_val16 tmp=Q15ONE; - opus_val16 lpc[4], mem[5]={0,0,0,0,0}; + opus_val16 lpc[4]; opus_val16 lpc2[5]; opus_val16 c1 = QCONST16(.8f,15); #ifdef FIXED_POINT @@ -211,7 +204,7 @@ void pitch_downsample(celt_sig * OPUS_RESTRICT x[], opus_val16 * OPUS_RESTRICT x lpc2[2] = lpc[2] + MULT16_16_Q15(c1,lpc[1]); lpc2[3] = lpc[3] + MULT16_16_Q15(c1,lpc[2]); lpc2[4] = MULT16_16_Q15(c1,lpc[3]); - celt_fir5(x_lp, lpc2, x_lp, len>>1, mem); + celt_fir5(x_lp, lpc2, len>>1); } /* Pure C implementation. */ @@ -220,13 +213,8 @@ opus_val32 #else void #endif -#if defined(OVERRIDE_PITCH_XCORR) celt_pitch_xcorr_c(const opus_val16 *_x, const opus_val16 *_y, - opus_val32 *xcorr, int len, int max_pitch) -#else -celt_pitch_xcorr(const opus_val16 *_x, const opus_val16 *_y, opus_val32 *xcorr, int len, int max_pitch, int arch) -#endif { #if 0 /* This is a simple version of the pitch correlation that should work @@ -261,15 +249,11 @@ celt_pitch_xcorr(const opus_val16 *_x, const opus_val16 *_y, opus_val32 maxcorr=1; #endif celt_assert(max_pitch>0); - celt_assert((((unsigned char *)_x-(unsigned char *)NULL)&3)==0); + celt_sig_assert((((unsigned char *)_x-(unsigned char *)NULL)&3)==0); for (i=0;i<max_pitch-3;i+=4) { opus_val32 sum[4]={0,0,0,0}; -#if defined(OVERRIDE_PITCH_XCORR) - xcorr_kernel_c(_x, _y+i, sum, len); -#else xcorr_kernel(_x, _y+i, sum, len, arch); -#endif xcorr[i]=sum[0]; xcorr[i+1]=sum[1]; xcorr[i+2]=sum[2]; @@ -285,11 +269,7 @@ celt_pitch_xcorr(const opus_val16 *_x, const opus_val16 *_y, for (;i<max_pitch;i++) { opus_val32 sum; -#if defined(OVERRIDE_PITCH_XCORR) - sum = celt_inner_prod_c(_x, _y+i, len); -#else sum = celt_inner_prod(_x, _y+i, len, arch); -#endif xcorr[i] = sum; #ifdef FIXED_POINT maxcorr = MAX32(maxcorr, sum); @@ -378,7 +358,7 @@ void pitch_search(const opus_val16 * OPUS_RESTRICT x_lp, opus_val16 * OPUS_RESTR for (j=0;j<len>>1;j++) sum += SHR32(MULT16_16(x_lp[j],y[i+j]), shift); #else - sum = celt_inner_prod_c(x_lp, y+i, len>>1); + sum = celt_inner_prod(x_lp, y+i, len>>1, arch); #endif xcorr[i] = MAX32(-1, sum); #ifdef FIXED_POINT @@ -424,7 +404,7 @@ static opus_val16 compute_pitch_gain(opus_val32 xy, opus_val32 xx, opus_val32 yy sx = celt_ilog2(xx)-14; sy = celt_ilog2(yy)-14; shift = sx + sy; - x2y2 = MULT16_16_Q14(VSHR32(xx, sx), VSHR32(yy, sy)); + x2y2 = SHR32(MULT16_16(VSHR32(xx, sx), VSHR32(yy, sy)), 14); if (shift & 1) { if (x2y2 < 32768) { diff --git a/thirdparty/opus/celt/pitch.h b/thirdparty/opus/celt/pitch.h index d3503532a0..e425f56aea 100644 --- a/thirdparty/opus/celt/pitch.h +++ b/thirdparty/opus/celt/pitch.h @@ -46,8 +46,7 @@ #include "mips/pitch_mipsr1.h" #endif -#if ((defined(OPUS_ARM_ASM) && defined(FIXED_POINT)) \ - || defined(OPUS_ARM_MAY_HAVE_NEON_INTR)) +#if (defined(OPUS_ARM_ASM) || defined(OPUS_ARM_MAY_HAVE_NEON_INTR)) # include "arm/pitch_arm.h" #endif @@ -184,17 +183,10 @@ opus_val32 void #endif celt_pitch_xcorr_c(const opus_val16 *_x, const opus_val16 *_y, - opus_val32 *xcorr, int len, int max_pitch); - -#if !defined(OVERRIDE_PITCH_XCORR) -#ifdef FIXED_POINT -opus_val32 -#else -void -#endif -celt_pitch_xcorr(const opus_val16 *_x, const opus_val16 *_y, opus_val32 *xcorr, int len, int max_pitch, int arch); +#ifndef OVERRIDE_PITCH_XCORR +# define celt_pitch_xcorr celt_pitch_xcorr_c #endif #endif diff --git a/thirdparty/opus/celt/quant_bands.c b/thirdparty/opus/celt/quant_bands.c index 95076e0af2..39a221eda5 100644 --- a/thirdparty/opus/celt/quant_bands.c +++ b/thirdparty/opus/celt/quant_bands.c @@ -418,6 +418,7 @@ void quant_energy_finalise(const CELTMode *m, int start, int end, opus_val16 *ol offset = (q2-.5f)*(1<<(14-fine_quant[i]-1))*(1.f/16384); #endif oldEBands[i+c*m->nbEBands] += offset; + error[i+c*m->nbEBands] -= offset; bits_left--; } while (++c < C); } @@ -456,7 +457,7 @@ void unquant_coarse_energy(const CELTMode *m, int start, int end, opus_val16 *ol /* It would be better to express this invariant as a test on C at function entry, but that isn't enough to make the static analyzer happy. */ - celt_assert(c<2); + celt_sig_assert(c<2); tell = ec_tell(dec); if(budget-tell>=15) { @@ -547,9 +548,15 @@ void amp2Log2(const CELTMode *m, int effEnd, int end, c=0; do { for (i=0;i<effEnd;i++) + { bandLogE[i+c*m->nbEBands] = - celt_log2(SHL32(bandE[i+c*m->nbEBands],2)) + celt_log2(bandE[i+c*m->nbEBands]) - SHL16((opus_val16)eMeans[i],6); +#ifdef FIXED_POINT + /* Compensate for bandE[] being Q12 but celt_log2() taking a Q14 input. */ + bandLogE[i+c*m->nbEBands] += QCONST16(2.f, DB_SHIFT); +#endif + } for (i=effEnd;i<end;i++) bandLogE[c*m->nbEBands+i] = -QCONST16(14.f,DB_SHIFT); } while (++c < C); diff --git a/thirdparty/opus/celt/rate.c b/thirdparty/opus/celt/rate.c index 7dfa5be8a6..465e1ba26c 100644 --- a/thirdparty/opus/celt/rate.c +++ b/thirdparty/opus/celt/rate.c @@ -348,12 +348,17 @@ static OPUS_INLINE int interp_bits2pulses(const CELTMode *m, int start, int end, /*This if() block is the only part of the allocation function that is not a mandatory part of the bitstream: any bands we choose to skip here must be explicitly signaled.*/ - /*Choose a threshold with some hysteresis to keep bands from - fluctuating in and out.*/ + int depth_threshold; + /*We choose a threshold with some hysteresis to keep bands from + fluctuating in and out, but we try not to fold below a certain point. */ + if (codedBands > 17) + depth_threshold = j<prev ? 7 : 9; + else + depth_threshold = 0; #ifdef FUZZING if ((rand()&0x1) == 0) #else - if (codedBands<=start+2 || (band_bits > ((j<prev?7:9)*band_width<<LM<<BITRES)>>4 && j<=signalBandwidth)) + if (codedBands<=start+2 || (band_bits > (depth_threshold*band_width<<LM<<BITRES)>>4 && j<=signalBandwidth)) #endif { ec_enc_bit_logp(ec, 1, 1); @@ -524,7 +529,7 @@ static OPUS_INLINE int interp_bits2pulses(const CELTMode *m, int start, int end, return codedBands; } -int compute_allocation(const CELTMode *m, int start, int end, const int *offsets, const int *cap, int alloc_trim, int *intensity, int *dual_stereo, +int clt_compute_allocation(const CELTMode *m, int start, int end, const int *offsets, const int *cap, int alloc_trim, int *intensity, int *dual_stereo, opus_int32 total, opus_int32 *balance, int *pulses, int *ebits, int *fine_priority, int C, int LM, ec_ctx *ec, int encode, int prev, int signalBandwidth) { int lo, hi, len, j; diff --git a/thirdparty/opus/celt/rate.h b/thirdparty/opus/celt/rate.h index 515f7687ce..fad5e412da 100644 --- a/thirdparty/opus/celt/rate.h +++ b/thirdparty/opus/celt/rate.h @@ -95,7 +95,7 @@ static OPUS_INLINE int pulses2bits(const CELTMode *m, int band, int LM, int puls @param pulses Number of pulses per band (returned) @return Total number of bits allocated */ -int compute_allocation(const CELTMode *m, int start, int end, const int *offsets, const int *cap, int alloc_trim, int *intensity, int *dual_stero, +int clt_compute_allocation(const CELTMode *m, int start, int end, const int *offsets, const int *cap, int alloc_trim, int *intensity, int *dual_stereo, opus_int32 total, opus_int32 *balance, int *pulses, int *ebits, int *fine_priority, int C, int LM, ec_ctx *ec, int encode, int prev, int signalBandwidth); #endif diff --git a/thirdparty/opus/celt/static_modes_fixed_arm_ne10.h b/thirdparty/opus/celt/static_modes_fixed_arm_ne10.h index b8ef0cee98..7623092192 100644 --- a/thirdparty/opus/celt/static_modes_fixed_arm_ne10.h +++ b/thirdparty/opus/celt/static_modes_fixed_arm_ne10.h @@ -1,7 +1,7 @@ /* The contents of this file was automatically generated by * dump_mode_arm_ne10.c with arguments: 48000 960 * It contains static definitions for some pre-defined modes. */ -#include <NE10_init.h> +#include <NE10_types.h> #ifndef NE10_FFT_PARAMS48000_960 #define NE10_FFT_PARAMS48000_960 diff --git a/thirdparty/opus/celt/static_modes_float_arm_ne10.h b/thirdparty/opus/celt/static_modes_float_arm_ne10.h index 934a82a420..66e1abb101 100644 --- a/thirdparty/opus/celt/static_modes_float_arm_ne10.h +++ b/thirdparty/opus/celt/static_modes_float_arm_ne10.h @@ -1,7 +1,7 @@ /* The contents of this file was automatically generated by * dump_mode_arm_ne10.c with arguments: 48000 960 * It contains static definitions for some pre-defined modes. */ -#include <NE10_init.h> +#include <NE10_types.h> #ifndef NE10_FFT_PARAMS48000_960 #define NE10_FFT_PARAMS48000_960 diff --git a/thirdparty/opus/celt/tests/test_unit_cwrs32.c b/thirdparty/opus/celt/tests/test_unit_cwrs32.c deleted file mode 100644 index 36dd8af5f5..0000000000 --- a/thirdparty/opus/celt/tests/test_unit_cwrs32.c +++ /dev/null @@ -1,161 +0,0 @@ -/* Copyright (c) 2008-2011 Xiph.Org Foundation, Mozilla Corporation, - Gregory Maxwell - Written by Jean-Marc Valin, Gregory Maxwell, and Timothy B. Terriberry */ -/* - Redistribution and use in source and binary forms, with or without - modification, are permitted provided that the following conditions - are met: - - - Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. - - - Redistributions in binary form must reproduce the above copyright - notice, this list of conditions and the following disclaimer in the - documentation and/or other materials provided with the distribution. - - THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER - OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, - EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, - PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR - PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF - LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING - NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS - SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -*/ - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#include <stdio.h> -#include <string.h> - -#ifndef CUSTOM_MODES -#define CUSTOM_MODES -#else -#define TEST_CUSTOM_MODES -#endif - -#define CELT_C -#include "stack_alloc.h" -#include "entenc.c" -#include "entdec.c" -#include "entcode.c" -#include "cwrs.c" -#include "mathops.c" -#include "rate.h" - -#define NMAX (240) -#define KMAX (128) - -#ifdef TEST_CUSTOM_MODES - -#define NDIMS (44) -static const int pn[NDIMS]={ - 2, 3, 4, 5, 6, 7, 8, 9, 10, - 11, 12, 13, 14, 15, 16, 18, 20, 22, - 24, 26, 28, 30, 32, 36, 40, 44, 48, - 52, 56, 60, 64, 72, 80, 88, 96, 104, - 112, 120, 128, 144, 160, 176, 192, 208 -}; -static const int pkmax[NDIMS]={ - 128, 128, 128, 128, 88, 52, 36, 26, 22, - 18, 16, 15, 13, 12, 12, 11, 10, 9, - 9, 8, 8, 7, 7, 7, 7, 6, 6, - 6, 6, 6, 5, 5, 5, 5, 5, 5, - 4, 4, 4, 4, 4, 4, 4, 4 -}; - -#else /* TEST_CUSTOM_MODES */ - -#define NDIMS (22) -static const int pn[NDIMS]={ - 2, 3, 4, 6, 8, 9, 11, 12, 16, - 18, 22, 24, 32, 36, 44, 48, 64, 72, - 88, 96, 144, 176 -}; -static const int pkmax[NDIMS]={ - 128, 128, 128, 88, 36, 26, 18, 16, 12, - 11, 9, 9, 7, 7, 6, 6, 5, 5, - 5, 5, 4, 4 -}; - -#endif - -int main(void){ - int t; - int n; - ALLOC_STACK; - for(t=0;t<NDIMS;t++){ - int pseudo; - n=pn[t]; - for(pseudo=1;pseudo<41;pseudo++) - { - int k; -#if defined(SMALL_FOOTPRINT) - opus_uint32 uu[KMAX+2U]; -#endif - opus_uint32 inc; - opus_uint32 nc; - opus_uint32 i; - k=get_pulses(pseudo); - if (k>pkmax[t])break; - printf("Testing CWRS with N=%i, K=%i...\n",n,k); -#if defined(SMALL_FOOTPRINT) - nc=ncwrs_urow(n,k,uu); -#else - nc=CELT_PVQ_V(n,k); -#endif - inc=nc/20000; - if(inc<1)inc=1; - for(i=0;i<nc;i+=inc){ -#if defined(SMALL_FOOTPRINT) - opus_uint32 u[KMAX+2U]; -#endif - int y[NMAX]; - int sy; - opus_uint32 v; - opus_uint32 ii; - int j; -#if defined(SMALL_FOOTPRINT) - memcpy(u,uu,(k+2U)*sizeof(*u)); - cwrsi(n,k,i,y,u); -#else - cwrsi(n,k,i,y); -#endif - sy=0; - for(j=0;j<n;j++)sy+=abs(y[j]); - if(sy!=k){ - fprintf(stderr,"N=%d Pulse count mismatch in cwrsi (%d!=%d).\n", - n,sy,k); - return 99; - } - /*printf("%6u of %u:",i,nc); - for(j=0;j<n;j++)printf(" %+3i",y[j]); - printf(" ->");*/ -#if defined(SMALL_FOOTPRINT) - ii=icwrs(n,k,&v,y,u); -#else - ii=icwrs(n,y); - v=CELT_PVQ_V(n,k); -#endif - if(ii!=i){ - fprintf(stderr,"Combination-index mismatch (%lu!=%lu).\n", - (long)ii,(long)i); - return 1; - } - if(v!=nc){ - fprintf(stderr,"Combination count mismatch (%lu!=%lu).\n", - (long)v,(long)nc); - return 2; - } - /*printf(" %6u\n",i);*/ - } - /*printf("\n");*/ - } - } - return 0; -} diff --git a/thirdparty/opus/celt/tests/test_unit_dft.c b/thirdparty/opus/celt/tests/test_unit_dft.c deleted file mode 100644 index 6166eb0e4f..0000000000 --- a/thirdparty/opus/celt/tests/test_unit_dft.c +++ /dev/null @@ -1,189 +0,0 @@ -/* Copyright (c) 2008 Xiph.Org Foundation - Written by Jean-Marc Valin */ -/* - Redistribution and use in source and binary forms, with or without - modification, are permitted provided that the following conditions - are met: - - - Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. - - - Redistributions in binary form must reproduce the above copyright - notice, this list of conditions and the following disclaimer in the - documentation and/or other materials provided with the distribution. - - THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER - OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, - EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, - PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR - PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF - LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING - NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS - SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -*/ - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#define SKIP_CONFIG_H - -#ifndef CUSTOM_MODES -#define CUSTOM_MODES -#endif - -#include <stdio.h> - -#define CELT_C -#define TEST_UNIT_DFT_C -#include "stack_alloc.h" -#include "kiss_fft.h" -#include "kiss_fft.c" -#include "mathops.c" -#include "entcode.c" - -#if defined(OPUS_X86_MAY_HAVE_SSE2) || defined(OPUS_X86_MAY_HAVE_SSE4_1) -# include "x86/x86cpu.c" -#elif defined(OPUS_ARM_ASM) || defined(OPUS_ARM_MAY_HAVE_NEON_INTR) -# include "arm/armcpu.c" -# include "celt_lpc.c" -# include "pitch.c" -# if defined(OPUS_ARM_MAY_HAVE_NEON_INTR) -# include "arm/celt_neon_intr.c" -# if defined(HAVE_ARM_NE10) -# include "mdct.c" -# include "arm/celt_ne10_fft.c" -# include "arm/celt_ne10_mdct.c" -# endif -# endif -# include "arm/arm_celt_map.c" -#endif - -#ifndef M_PI -#define M_PI 3.141592653 -#endif - -int ret = 0; - -void check(kiss_fft_cpx * in,kiss_fft_cpx * out,int nfft,int isinverse) -{ - int bin,k; - double errpow=0,sigpow=0, snr; - - for (bin=0;bin<nfft;++bin) { - double ansr = 0; - double ansi = 0; - double difr; - double difi; - - for (k=0;k<nfft;++k) { - double phase = -2*M_PI*bin*k/nfft; - double re = cos(phase); - double im = sin(phase); - if (isinverse) - im = -im; - - if (!isinverse) - { - re /= nfft; - im /= nfft; - } - - ansr += in[k].r * re - in[k].i * im; - ansi += in[k].r * im + in[k].i * re; - } - /*printf ("%d %d ", (int)ansr, (int)ansi);*/ - difr = ansr - out[bin].r; - difi = ansi - out[bin].i; - errpow += difr*difr + difi*difi; - sigpow += ansr*ansr+ansi*ansi; - } - snr = 10*log10(sigpow/errpow); - printf("nfft=%d inverse=%d,snr = %f\n",nfft,isinverse,snr ); - if (snr<60) { - printf( "** poor snr: %f ** \n", snr); - ret = 1; - } -} - -void test1d(int nfft,int isinverse,int arch) -{ - size_t buflen = sizeof(kiss_fft_cpx)*nfft; - - kiss_fft_cpx * in = (kiss_fft_cpx*)malloc(buflen); - kiss_fft_cpx * out= (kiss_fft_cpx*)malloc(buflen); - kiss_fft_state *cfg = opus_fft_alloc(nfft,0,0,arch); - int k; - - for (k=0;k<nfft;++k) { - in[k].r = (rand() % 32767) - 16384; - in[k].i = (rand() % 32767) - 16384; - } - - for (k=0;k<nfft;++k) { - in[k].r *= 32768; - in[k].i *= 32768; - } - - if (isinverse) - { - for (k=0;k<nfft;++k) { - in[k].r /= nfft; - in[k].i /= nfft; - } - } - - /*for (k=0;k<nfft;++k) printf("%d %d ", in[k].r, in[k].i);printf("\n");*/ - - if (isinverse) - opus_ifft(cfg,in,out, arch); - else - opus_fft(cfg,in,out, arch); - - /*for (k=0;k<nfft;++k) printf("%d %d ", out[k].r, out[k].i);printf("\n");*/ - - check(in,out,nfft,isinverse); - - free(in); - free(out); - opus_fft_free(cfg, arch); -} - -int main(int argc,char ** argv) -{ - ALLOC_STACK; - int arch = opus_select_arch(); - - if (argc>1) { - int k; - for (k=1;k<argc;++k) { - test1d(atoi(argv[k]),0,arch); - test1d(atoi(argv[k]),1,arch); - } - }else{ - test1d(32,0,arch); - test1d(32,1,arch); - test1d(128,0,arch); - test1d(128,1,arch); - test1d(256,0,arch); - test1d(256,1,arch); -#ifndef RADIX_TWO_ONLY - test1d(36,0,arch); - test1d(36,1,arch); - test1d(50,0,arch); - test1d(50,1,arch); - test1d(60,0,arch); - test1d(60,1,arch); - test1d(120,0,arch); - test1d(120,1,arch); - test1d(240,0,arch); - test1d(240,1,arch); - test1d(480,0,arch); - test1d(480,1,arch); -#endif - } - return ret; -} diff --git a/thirdparty/opus/celt/tests/test_unit_entropy.c b/thirdparty/opus/celt/tests/test_unit_entropy.c deleted file mode 100644 index ff9265864c..0000000000 --- a/thirdparty/opus/celt/tests/test_unit_entropy.c +++ /dev/null @@ -1,382 +0,0 @@ -/* Copyright (c) 2007-2011 Xiph.Org Foundation, Mozilla Corporation, - Gregory Maxwell - Written by Jean-Marc Valin, Gregory Maxwell, and Timothy B. Terriberry */ -/* - Redistribution and use in source and binary forms, with or without - modification, are permitted provided that the following conditions - are met: - - - Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. - - - Redistributions in binary form must reproduce the above copyright - notice, this list of conditions and the following disclaimer in the - documentation and/or other materials provided with the distribution. - - THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER - OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, - EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, - PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR - PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF - LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING - NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS - SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -*/ - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#include <stdlib.h> -#include <stdio.h> -#include <math.h> -#include <time.h> -#include "entcode.h" -#include "entenc.h" -#include "entdec.h" -#include <string.h> - -#include "entenc.c" -#include "entdec.c" -#include "entcode.c" - -#ifndef M_LOG2E -# define M_LOG2E 1.4426950408889634074 -#endif -#define DATA_SIZE 10000000 -#define DATA_SIZE2 10000 - -int main(int _argc,char **_argv){ - ec_enc enc; - ec_dec dec; - long nbits; - long nbits2; - double entropy; - int ft; - int ftb; - int sz; - int i; - int ret; - unsigned int sym; - unsigned int seed; - unsigned char *ptr; - const char *env_seed; - ret=0; - entropy=0; - if (_argc > 2) { - fprintf(stderr, "Usage: %s [<seed>]\n", _argv[0]); - return 1; - } - env_seed = getenv("SEED"); - if (_argc > 1) - seed = atoi(_argv[1]); - else if (env_seed) - seed = atoi(env_seed); - else - seed = time(NULL); - /*Testing encoding of raw bit values.*/ - ptr = (unsigned char *)malloc(DATA_SIZE); - ec_enc_init(&enc,ptr, DATA_SIZE); - for(ft=2;ft<1024;ft++){ - for(i=0;i<ft;i++){ - entropy+=log(ft)*M_LOG2E; - ec_enc_uint(&enc,i,ft); - } - } - /*Testing encoding of raw bit values.*/ - for(ftb=1;ftb<16;ftb++){ - for(i=0;i<(1<<ftb);i++){ - entropy+=ftb; - nbits=ec_tell(&enc); - ec_enc_bits(&enc,i,ftb); - nbits2=ec_tell(&enc); - if(nbits2-nbits!=ftb){ - fprintf(stderr,"Used %li bits to encode %i bits directly.\n", - nbits2-nbits,ftb); - ret=-1; - } - } - } - nbits=ec_tell_frac(&enc); - ec_enc_done(&enc); - fprintf(stderr, - "Encoded %0.2lf bits of entropy to %0.2lf bits (%0.3lf%% wasted).\n", - entropy,ldexp(nbits,-3),100*(nbits-ldexp(entropy,3))/nbits); - fprintf(stderr,"Packed to %li bytes.\n",(long)ec_range_bytes(&enc)); - ec_dec_init(&dec,ptr,DATA_SIZE); - for(ft=2;ft<1024;ft++){ - for(i=0;i<ft;i++){ - sym=ec_dec_uint(&dec,ft); - if(sym!=(unsigned)i){ - fprintf(stderr,"Decoded %i instead of %i with ft of %i.\n",sym,i,ft); - ret=-1; - } - } - } - for(ftb=1;ftb<16;ftb++){ - for(i=0;i<(1<<ftb);i++){ - sym=ec_dec_bits(&dec,ftb); - if(sym!=(unsigned)i){ - fprintf(stderr,"Decoded %i instead of %i with ftb of %i.\n",sym,i,ftb); - ret=-1; - } - } - } - nbits2=ec_tell_frac(&dec); - if(nbits!=nbits2){ - fprintf(stderr, - "Reported number of bits used was %0.2lf, should be %0.2lf.\n", - ldexp(nbits2,-3),ldexp(nbits,-3)); - ret=-1; - } - /*Testing an encoder bust prefers range coder data over raw bits. - This isn't a general guarantee, will only work for data that is buffered in - the encoder state and not yet stored in the user buffer, and should never - get used in practice. - It's mostly here for code coverage completeness.*/ - /*Start with a 16-bit buffer.*/ - ec_enc_init(&enc,ptr,2); - /*Write 7 raw bits.*/ - ec_enc_bits(&enc,0x55,7); - /*Write 12.3 bits of range coder data.*/ - ec_enc_uint(&enc,1,2); - ec_enc_uint(&enc,1,3); - ec_enc_uint(&enc,1,4); - ec_enc_uint(&enc,1,5); - ec_enc_uint(&enc,2,6); - ec_enc_uint(&enc,6,7); - ec_enc_done(&enc); - ec_dec_init(&dec,ptr,2); - if(!enc.error - /*The raw bits should have been overwritten by the range coder data.*/ - ||ec_dec_bits(&dec,7)!=0x05 - /*And all the range coder data should have been encoded correctly.*/ - ||ec_dec_uint(&dec,2)!=1 - ||ec_dec_uint(&dec,3)!=1 - ||ec_dec_uint(&dec,4)!=1 - ||ec_dec_uint(&dec,5)!=1 - ||ec_dec_uint(&dec,6)!=2 - ||ec_dec_uint(&dec,7)!=6){ - fprintf(stderr,"Encoder bust overwrote range coder data with raw bits.\n"); - ret=-1; - } - srand(seed); - fprintf(stderr,"Testing random streams... Random seed: %u (%.4X)\n", seed, rand() % 65536); - for(i=0;i<409600;i++){ - unsigned *data; - unsigned *tell; - unsigned tell_bits; - int j; - int zeros; - ft=rand()/((RAND_MAX>>(rand()%11U))+1U)+10; - sz=rand()/((RAND_MAX>>(rand()%9U))+1U); - data=(unsigned *)malloc(sz*sizeof(*data)); - tell=(unsigned *)malloc((sz+1)*sizeof(*tell)); - ec_enc_init(&enc,ptr,DATA_SIZE2); - zeros = rand()%13==0; - tell[0]=ec_tell_frac(&enc); - for(j=0;j<sz;j++){ - if (zeros) - data[j]=0; - else - data[j]=rand()%ft; - ec_enc_uint(&enc,data[j],ft); - tell[j+1]=ec_tell_frac(&enc); - } - if (rand()%2==0) - while(ec_tell(&enc)%8 != 0) - ec_enc_uint(&enc, rand()%2, 2); - tell_bits = ec_tell(&enc); - ec_enc_done(&enc); - if(tell_bits!=(unsigned)ec_tell(&enc)){ - fprintf(stderr,"ec_tell() changed after ec_enc_done(): %i instead of %i (Random seed: %u)\n", - ec_tell(&enc),tell_bits,seed); - ret=-1; - } - if ((tell_bits+7)/8 < ec_range_bytes(&enc)) - { - fprintf (stderr, "ec_tell() lied, there's %i bytes instead of %d (Random seed: %u)\n", - ec_range_bytes(&enc), (tell_bits+7)/8,seed); - ret=-1; - } - ec_dec_init(&dec,ptr,DATA_SIZE2); - if(ec_tell_frac(&dec)!=tell[0]){ - fprintf(stderr, - "Tell mismatch between encoder and decoder at symbol %i: %i instead of %i (Random seed: %u).\n", - 0,ec_tell_frac(&dec),tell[0],seed); - } - for(j=0;j<sz;j++){ - sym=ec_dec_uint(&dec,ft); - if(sym!=data[j]){ - fprintf(stderr, - "Decoded %i instead of %i with ft of %i at position %i of %i (Random seed: %u).\n", - sym,data[j],ft,j,sz,seed); - ret=-1; - } - if(ec_tell_frac(&dec)!=tell[j+1]){ - fprintf(stderr, - "Tell mismatch between encoder and decoder at symbol %i: %i instead of %i (Random seed: %u).\n", - j+1,ec_tell_frac(&dec),tell[j+1],seed); - } - } - free(tell); - free(data); - } - /*Test compatibility between multiple different encode/decode routines.*/ - for(i=0;i<409600;i++){ - unsigned *logp1; - unsigned *data; - unsigned *tell; - unsigned *enc_method; - int j; - sz=rand()/((RAND_MAX>>(rand()%9U))+1U); - logp1=(unsigned *)malloc(sz*sizeof(*logp1)); - data=(unsigned *)malloc(sz*sizeof(*data)); - tell=(unsigned *)malloc((sz+1)*sizeof(*tell)); - enc_method=(unsigned *)malloc(sz*sizeof(*enc_method)); - ec_enc_init(&enc,ptr,DATA_SIZE2); - tell[0]=ec_tell_frac(&enc); - for(j=0;j<sz;j++){ - data[j]=rand()/((RAND_MAX>>1)+1); - logp1[j]=(rand()%15)+1; - enc_method[j]=rand()/((RAND_MAX>>2)+1); - switch(enc_method[j]){ - case 0:{ - ec_encode(&enc,data[j]?(1<<logp1[j])-1:0, - (1<<logp1[j])-(data[j]?0:1),1<<logp1[j]); - }break; - case 1:{ - ec_encode_bin(&enc,data[j]?(1<<logp1[j])-1:0, - (1<<logp1[j])-(data[j]?0:1),logp1[j]); - }break; - case 2:{ - ec_enc_bit_logp(&enc,data[j],logp1[j]); - }break; - case 3:{ - unsigned char icdf[2]; - icdf[0]=1; - icdf[1]=0; - ec_enc_icdf(&enc,data[j],icdf,logp1[j]); - }break; - } - tell[j+1]=ec_tell_frac(&enc); - } - ec_enc_done(&enc); - if((ec_tell(&enc)+7U)/8U<ec_range_bytes(&enc)){ - fprintf(stderr,"tell() lied, there's %i bytes instead of %d (Random seed: %u)\n", - ec_range_bytes(&enc),(ec_tell(&enc)+7)/8,seed); - ret=-1; - } - ec_dec_init(&dec,ptr,DATA_SIZE2); - if(ec_tell_frac(&dec)!=tell[0]){ - fprintf(stderr, - "Tell mismatch between encoder and decoder at symbol %i: %i instead of %i (Random seed: %u).\n", - 0,ec_tell_frac(&dec),tell[0],seed); - } - for(j=0;j<sz;j++){ - int fs; - int dec_method; - dec_method=rand()/((RAND_MAX>>2)+1); - switch(dec_method){ - case 0:{ - fs=ec_decode(&dec,1<<logp1[j]); - sym=fs>=(1<<logp1[j])-1; - ec_dec_update(&dec,sym?(1<<logp1[j])-1:0, - (1<<logp1[j])-(sym?0:1),1<<logp1[j]); - }break; - case 1:{ - fs=ec_decode_bin(&dec,logp1[j]); - sym=fs>=(1<<logp1[j])-1; - ec_dec_update(&dec,sym?(1<<logp1[j])-1:0, - (1<<logp1[j])-(sym?0:1),1<<logp1[j]); - }break; - case 2:{ - sym=ec_dec_bit_logp(&dec,logp1[j]); - }break; - case 3:{ - unsigned char icdf[2]; - icdf[0]=1; - icdf[1]=0; - sym=ec_dec_icdf(&dec,icdf,logp1[j]); - }break; - } - if(sym!=data[j]){ - fprintf(stderr, - "Decoded %i instead of %i with logp1 of %i at position %i of %i (Random seed: %u).\n", - sym,data[j],logp1[j],j,sz,seed); - fprintf(stderr,"Encoding method: %i, decoding method: %i\n", - enc_method[j],dec_method); - ret=-1; - } - if(ec_tell_frac(&dec)!=tell[j+1]){ - fprintf(stderr, - "Tell mismatch between encoder and decoder at symbol %i: %i instead of %i (Random seed: %u).\n", - j+1,ec_tell_frac(&dec),tell[j+1],seed); - } - } - free(enc_method); - free(tell); - free(data); - free(logp1); - } - ec_enc_init(&enc,ptr,DATA_SIZE2); - ec_enc_bit_logp(&enc,0,1); - ec_enc_bit_logp(&enc,0,1); - ec_enc_bit_logp(&enc,0,1); - ec_enc_bit_logp(&enc,0,1); - ec_enc_bit_logp(&enc,0,2); - ec_enc_patch_initial_bits(&enc,3,2); - if(enc.error){ - fprintf(stderr,"patch_initial_bits failed"); - ret=-1; - } - ec_enc_patch_initial_bits(&enc,0,5); - if(!enc.error){ - fprintf(stderr,"patch_initial_bits didn't fail when it should have"); - ret=-1; - } - ec_enc_done(&enc); - if(ec_range_bytes(&enc)!=1||ptr[0]!=192){ - fprintf(stderr,"Got %d when expecting 192 for patch_initial_bits",ptr[0]); - ret=-1; - } - ec_enc_init(&enc,ptr,DATA_SIZE2); - ec_enc_bit_logp(&enc,0,1); - ec_enc_bit_logp(&enc,0,1); - ec_enc_bit_logp(&enc,1,6); - ec_enc_bit_logp(&enc,0,2); - ec_enc_patch_initial_bits(&enc,0,2); - if(enc.error){ - fprintf(stderr,"patch_initial_bits failed"); - ret=-1; - } - ec_enc_done(&enc); - if(ec_range_bytes(&enc)!=2||ptr[0]!=63){ - fprintf(stderr,"Got %d when expecting 63 for patch_initial_bits",ptr[0]); - ret=-1; - } - ec_enc_init(&enc,ptr,2); - ec_enc_bit_logp(&enc,0,2); - for(i=0;i<48;i++){ - ec_enc_bits(&enc,0,1); - } - ec_enc_done(&enc); - if(!enc.error){ - fprintf(stderr,"Raw bits overfill didn't fail when it should have"); - ret=-1; - } - ec_enc_init(&enc,ptr,2); - for(i=0;i<17;i++){ - ec_enc_bits(&enc,0,1); - } - ec_enc_done(&enc); - if(!enc.error){ - fprintf(stderr,"17 raw bits encoded in two bytes"); - ret=-1; - } - free(ptr); - return ret; -} diff --git a/thirdparty/opus/celt/tests/test_unit_laplace.c b/thirdparty/opus/celt/tests/test_unit_laplace.c deleted file mode 100644 index 22951e29ee..0000000000 --- a/thirdparty/opus/celt/tests/test_unit_laplace.c +++ /dev/null @@ -1,93 +0,0 @@ -/* Copyright (c) 2008-2011 Xiph.Org Foundation, Mozilla Corporation - Written by Jean-Marc Valin and Timothy B. Terriberry */ -/* - Redistribution and use in source and binary forms, with or without - modification, are permitted provided that the following conditions - are met: - - - Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. - - - Redistributions in binary form must reproduce the above copyright - notice, this list of conditions and the following disclaimer in the - documentation and/or other materials provided with the distribution. - - THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER - OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, - EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, - PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR - PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF - LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING - NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS - SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -*/ - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#include <stdio.h> -#include <stdlib.h> -#include "laplace.h" -#define CELT_C -#include "stack_alloc.h" - -#include "entenc.c" -#include "entdec.c" -#include "entcode.c" -#include "laplace.c" - -#define DATA_SIZE 40000 - -int ec_laplace_get_start_freq(int decay) -{ - opus_uint32 ft = 32768 - LAPLACE_MINP*(2*LAPLACE_NMIN+1); - int fs = (ft*(16384-decay))/(16384+decay); - return fs+LAPLACE_MINP; -} - -int main(void) -{ - int i; - int ret = 0; - ec_enc enc; - ec_dec dec; - unsigned char *ptr; - int val[10000], decay[10000]; - ALLOC_STACK; - ptr = (unsigned char *)malloc(DATA_SIZE); - ec_enc_init(&enc,ptr,DATA_SIZE); - - val[0] = 3; decay[0] = 6000; - val[1] = 0; decay[1] = 5800; - val[2] = -1; decay[2] = 5600; - for (i=3;i<10000;i++) - { - val[i] = rand()%15-7; - decay[i] = rand()%11000+5000; - } - for (i=0;i<10000;i++) - ec_laplace_encode(&enc, &val[i], - ec_laplace_get_start_freq(decay[i]), decay[i]); - - ec_enc_done(&enc); - - ec_dec_init(&dec,ec_get_buffer(&enc),ec_range_bytes(&enc)); - - for (i=0;i<10000;i++) - { - int d = ec_laplace_decode(&dec, - ec_laplace_get_start_freq(decay[i]), decay[i]); - if (d != val[i]) - { - fprintf (stderr, "Got %d instead of %d\n", d, val[i]); - ret = 1; - } - } - - free(ptr); - return ret; -} diff --git a/thirdparty/opus/celt/tests/test_unit_mathops.c b/thirdparty/opus/celt/tests/test_unit_mathops.c deleted file mode 100644 index fd3319da91..0000000000 --- a/thirdparty/opus/celt/tests/test_unit_mathops.c +++ /dev/null @@ -1,304 +0,0 @@ -/* Copyright (c) 2008-2011 Xiph.Org Foundation, Mozilla Corporation, - Gregory Maxwell - Written by Jean-Marc Valin, Gregory Maxwell, and Timothy B. Terriberry */ -/* - Redistribution and use in source and binary forms, with or without - modification, are permitted provided that the following conditions - are met: - - - Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. - - - Redistributions in binary form must reproduce the above copyright - notice, this list of conditions and the following disclaimer in the - documentation and/or other materials provided with the distribution. - - THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER - OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, - EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, - PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR - PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF - LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING - NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS - SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -*/ - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#ifndef CUSTOM_MODES -#define CUSTOM_MODES -#endif - -#define CELT_C - -#include <stdio.h> -#include <math.h> -#include "mathops.c" -#include "entenc.c" -#include "entdec.c" -#include "entcode.c" -#include "bands.c" -#include "quant_bands.c" -#include "laplace.c" -#include "vq.c" -#include "cwrs.c" -#include "pitch.c" -#include "celt_lpc.c" -#include "celt.c" - -#if defined(OPUS_X86_MAY_HAVE_SSE) || defined(OPUS_X86_MAY_HAVE_SSE2) || defined(OPUS_X86_MAY_HAVE_SSE4_1) -# if defined(OPUS_X86_MAY_HAVE_SSE) -# include "x86/pitch_sse.c" -# endif -# if defined(OPUS_X86_MAY_HAVE_SSE2) -# include "x86/pitch_sse2.c" -# endif -# if defined(OPUS_X86_MAY_HAVE_SSE4_1) -# include "x86/pitch_sse4_1.c" -# include "x86/celt_lpc_sse.c" -# endif -# include "x86/x86_celt_map.c" -#elif defined(OPUS_ARM_ASM) || defined(OPUS_ARM_MAY_HAVE_NEON_INTR) -# include "arm/armcpu.c" -# if defined(OPUS_ARM_MAY_HAVE_NEON_INTR) -# include "arm/celt_neon_intr.c" -# if defined(HAVE_ARM_NE10) -# include "kiss_fft.c" -# include "mdct.c" -# include "arm/celt_ne10_fft.c" -# include "arm/celt_ne10_mdct.c" -# endif -# endif -# include "arm/arm_celt_map.c" -#endif - -#ifdef FIXED_POINT -#define WORD "%d" -#else -#define WORD "%f" -#endif - -int ret = 0; - -void testdiv(void) -{ - opus_int32 i; - for (i=1;i<=327670;i++) - { - double prod; - opus_val32 val; - val = celt_rcp(i); -#ifdef FIXED_POINT - prod = (1./32768./65526.)*val*i; -#else - prod = val*i; -#endif - if (fabs(prod-1) > .00025) - { - fprintf (stderr, "div failed: 1/%d="WORD" (product = %f)\n", i, val, prod); - ret = 1; - } - } -} - -void testsqrt(void) -{ - opus_int32 i; - for (i=1;i<=1000000000;i++) - { - double ratio; - opus_val16 val; - val = celt_sqrt(i); - ratio = val/sqrt(i); - if (fabs(ratio - 1) > .0005 && fabs(val-sqrt(i)) > 2) - { - fprintf (stderr, "sqrt failed: sqrt(%d)="WORD" (ratio = %f)\n", i, val, ratio); - ret = 1; - } - i+= i>>10; - } -} - -void testbitexactcos(void) -{ - int i; - opus_int32 min_d,max_d,last,chk; - chk=max_d=0; - last=min_d=32767; - for(i=64;i<=16320;i++) - { - opus_int32 d; - opus_int32 q=bitexact_cos(i); - chk ^= q*i; - d = last - q; - if (d>max_d)max_d=d; - if (d<min_d)min_d=d; - last = q; - } - if ((chk!=89408644)||(max_d!=5)||(min_d!=0)||(bitexact_cos(64)!=32767)|| - (bitexact_cos(16320)!=200)||(bitexact_cos(8192)!=23171)) - { - fprintf (stderr, "bitexact_cos failed\n"); - ret = 1; - } -} - -void testbitexactlog2tan(void) -{ - int i,fail; - opus_int32 min_d,max_d,last,chk; - fail=chk=max_d=0; - last=min_d=15059; - for(i=64;i<8193;i++) - { - opus_int32 d; - opus_int32 mid=bitexact_cos(i); - opus_int32 side=bitexact_cos(16384-i); - opus_int32 q=bitexact_log2tan(mid,side); - chk ^= q*i; - d = last - q; - if (q!=-1*bitexact_log2tan(side,mid)) - fail = 1; - if (d>max_d)max_d=d; - if (d<min_d)min_d=d; - last = q; - } - if ((chk!=15821257)||(max_d!=61)||(min_d!=-2)||fail|| - (bitexact_log2tan(32767,200)!=15059)||(bitexact_log2tan(30274,12540)!=2611)|| - (bitexact_log2tan(23171,23171)!=0)) - { - fprintf (stderr, "bitexact_log2tan failed\n"); - ret = 1; - } -} - -#ifndef FIXED_POINT -void testlog2(void) -{ - float x; - for (x=0.001;x<1677700.0;x+=(x/8.0)) - { - float error = fabs((1.442695040888963387*log(x))-celt_log2(x)); - if (error>0.0009) - { - fprintf (stderr, "celt_log2 failed: fabs((1.442695040888963387*log(x))-celt_log2(x))>0.001 (x = %f, error = %f)\n", x,error); - ret = 1; - } - } -} - -void testexp2(void) -{ - float x; - for (x=-11.0;x<24.0;x+=0.0007) - { - float error = fabs(x-(1.442695040888963387*log(celt_exp2(x)))); - if (error>0.0002) - { - fprintf (stderr, "celt_exp2 failed: fabs(x-(1.442695040888963387*log(celt_exp2(x))))>0.0005 (x = %f, error = %f)\n", x,error); - ret = 1; - } - } -} - -void testexp2log2(void) -{ - float x; - for (x=-11.0;x<24.0;x+=0.0007) - { - float error = fabs(x-(celt_log2(celt_exp2(x)))); - if (error>0.001) - { - fprintf (stderr, "celt_log2/celt_exp2 failed: fabs(x-(celt_log2(celt_exp2(x))))>0.001 (x = %f, error = %f)\n", x,error); - ret = 1; - } - } -} -#else -void testlog2(void) -{ - opus_val32 x; - for (x=8;x<1073741824;x+=(x>>3)) - { - float error = fabs((1.442695040888963387*log(x/16384.0))-celt_log2(x)/1024.0); - if (error>0.003) - { - fprintf (stderr, "celt_log2 failed: x = %ld, error = %f\n", (long)x,error); - ret = 1; - } - } -} - -void testexp2(void) -{ - opus_val16 x; - for (x=-32768;x<15360;x++) - { - float error1 = fabs(x/1024.0-(1.442695040888963387*log(celt_exp2(x)/65536.0))); - float error2 = fabs(exp(0.6931471805599453094*x/1024.0)-celt_exp2(x)/65536.0); - if (error1>0.0002&&error2>0.00004) - { - fprintf (stderr, "celt_exp2 failed: x = "WORD", error1 = %f, error2 = %f\n", x,error1,error2); - ret = 1; - } - } -} - -void testexp2log2(void) -{ - opus_val32 x; - for (x=8;x<65536;x+=(x>>3)) - { - float error = fabs(x-0.25*celt_exp2(celt_log2(x)))/16384; - if (error>0.004) - { - fprintf (stderr, "celt_log2/celt_exp2 failed: fabs(x-(celt_exp2(celt_log2(x))))>0.001 (x = %ld, error = %f)\n", (long)x,error); - ret = 1; - } - } -} - -void testilog2(void) -{ - opus_val32 x; - for (x=1;x<=268435455;x+=127) - { - opus_val32 lg; - opus_val32 y; - - lg = celt_ilog2(x); - if (lg<0 || lg>=31) - { - printf("celt_ilog2 failed: 0<=celt_ilog2(x)<31 (x = %d, celt_ilog2(x) = %d)\n",x,lg); - ret = 1; - } - y = 1<<lg; - - if (x<y || (x>>1)>=y) - { - printf("celt_ilog2 failed: 2**celt_ilog2(x)<=x<2**(celt_ilog2(x)+1) (x = %d, 2**celt_ilog2(x) = %d)\n",x,y); - ret = 1; - } - } -} -#endif - -int main(void) -{ - testbitexactcos(); - testbitexactlog2tan(); - testdiv(); - testsqrt(); - testlog2(); - testexp2(); - testexp2log2(); -#ifdef FIXED_POINT - testilog2(); -#endif - return ret; -} diff --git a/thirdparty/opus/celt/tests/test_unit_mdct.c b/thirdparty/opus/celt/tests/test_unit_mdct.c deleted file mode 100644 index 8dbb9caa2e..0000000000 --- a/thirdparty/opus/celt/tests/test_unit_mdct.c +++ /dev/null @@ -1,230 +0,0 @@ -/* Copyright (c) 2008-2011 Xiph.Org Foundation - Written by Jean-Marc Valin */ -/* - Redistribution and use in source and binary forms, with or without - modification, are permitted provided that the following conditions - are met: - - - Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. - - - Redistributions in binary form must reproduce the above copyright - notice, this list of conditions and the following disclaimer in the - documentation and/or other materials provided with the distribution. - - THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER - OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, - EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, - PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR - PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF - LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING - NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS - SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -*/ - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#define SKIP_CONFIG_H - -#ifndef CUSTOM_MODES -#define CUSTOM_MODES -#endif - -#include <stdio.h> - -#define CELT_C -#include "mdct.h" -#include "stack_alloc.h" - -#include "kiss_fft.c" -#include "mdct.c" -#include "mathops.c" -#include "entcode.c" - -#if defined(OPUS_X86_MAY_HAVE_SSE2) || defined(OPUS_X86_MAY_HAVE_SSE4_1) -# include "x86/x86cpu.c" -#elif defined(OPUS_ARM_ASM) || defined(OPUS_ARM_MAY_HAVE_NEON_INTR) -# include "arm/armcpu.c" -# include "pitch.c" -# include "celt_lpc.c" -# if defined(OPUS_ARM_MAY_HAVE_NEON_INTR) -# include "arm/celt_neon_intr.c" -# if defined(HAVE_ARM_NE10) -# include "arm/celt_ne10_fft.c" -# include "arm/celt_ne10_mdct.c" -# endif -# endif -# include "arm/arm_celt_map.c" -#endif - -#ifndef M_PI -#define M_PI 3.141592653 -#endif - -int ret = 0; -void check(kiss_fft_scalar * in,kiss_fft_scalar * out,int nfft,int isinverse) -{ - int bin,k; - double errpow=0,sigpow=0; - double snr; - for (bin=0;bin<nfft/2;++bin) { - double ansr = 0; - double difr; - - for (k=0;k<nfft;++k) { - double phase = 2*M_PI*(k+.5+.25*nfft)*(bin+.5)/nfft; - double re = cos(phase); - - re /= nfft/4; - - ansr += in[k] * re; - } - /*printf ("%f %f\n", ansr, out[bin]);*/ - difr = ansr - out[bin]; - errpow += difr*difr; - sigpow += ansr*ansr; - } - snr = 10*log10(sigpow/errpow); - printf("nfft=%d inverse=%d,snr = %f\n",nfft,isinverse,snr ); - if (snr<60) { - printf( "** poor snr: %f **\n", snr); - ret = 1; - } -} - -void check_inv(kiss_fft_scalar * in,kiss_fft_scalar * out,int nfft,int isinverse) -{ - int bin,k; - double errpow=0,sigpow=0; - double snr; - for (bin=0;bin<nfft;++bin) { - double ansr = 0; - double difr; - - for (k=0;k<nfft/2;++k) { - double phase = 2*M_PI*(bin+.5+.25*nfft)*(k+.5)/nfft; - double re = cos(phase); - - /*re *= 2;*/ - - ansr += in[k] * re; - } - /*printf ("%f %f\n", ansr, out[bin]);*/ - difr = ansr - out[bin]; - errpow += difr*difr; - sigpow += ansr*ansr; - } - snr = 10*log10(sigpow/errpow); - printf("nfft=%d inverse=%d,snr = %f\n",nfft,isinverse,snr ); - if (snr<60) { - printf( "** poor snr: %f **\n", snr); - ret = 1; - } -} - - -void test1d(int nfft,int isinverse,int arch) -{ - mdct_lookup cfg; - size_t buflen = sizeof(kiss_fft_scalar)*nfft; - - kiss_fft_scalar * in = (kiss_fft_scalar*)malloc(buflen); - kiss_fft_scalar * in_copy = (kiss_fft_scalar*)malloc(buflen); - kiss_fft_scalar * out= (kiss_fft_scalar*)malloc(buflen); - opus_val16 * window= (opus_val16*)malloc(sizeof(opus_val16)*nfft/2); - int k; - - clt_mdct_init(&cfg, nfft, 0, arch); - for (k=0;k<nfft;++k) { - in[k] = (rand() % 32768) - 16384; - } - - for (k=0;k<nfft/2;++k) { - window[k] = Q15ONE; - } - for (k=0;k<nfft;++k) { - in[k] *= 32768; - } - - if (isinverse) - { - for (k=0;k<nfft;++k) { - in[k] /= nfft; - } - } - - for (k=0;k<nfft;++k) - in_copy[k] = in[k]; - /*for (k=0;k<nfft;++k) printf("%d %d ", in[k].r, in[k].i);printf("\n");*/ - - if (isinverse) - { - for (k=0;k<nfft;++k) - out[k] = 0; - clt_mdct_backward(&cfg,in,out, window, nfft/2, 0, 1, arch); - /* apply TDAC because clt_mdct_backward() no longer does that */ - for (k=0;k<nfft/4;++k) - out[nfft-k-1] = out[nfft/2+k]; - check_inv(in,out,nfft,isinverse); - } else { - clt_mdct_forward(&cfg,in,out,window, nfft/2, 0, 1, arch); - check(in_copy,out,nfft,isinverse); - } - /*for (k=0;k<nfft;++k) printf("%d %d ", out[k].r, out[k].i);printf("\n");*/ - - - free(in); - free(in_copy); - free(out); - free(window); - clt_mdct_clear(&cfg, arch); -} - -int main(int argc,char ** argv) -{ - ALLOC_STACK; - int arch = opus_select_arch(); - - if (argc>1) { - int k; - for (k=1;k<argc;++k) { - test1d(atoi(argv[k]),0,arch); - test1d(atoi(argv[k]),1,arch); - } - }else{ - test1d(32,0,arch); - test1d(32,1,arch); - test1d(256,0,arch); - test1d(256,1,arch); - test1d(512,0,arch); - test1d(512,1,arch); - test1d(1024,0,arch); - test1d(1024,1,arch); - test1d(2048,0,arch); - test1d(2048,1,arch); -#ifndef RADIX_TWO_ONLY - test1d(36,0,arch); - test1d(36,1,arch); - test1d(40,0,arch); - test1d(40,1,arch); - test1d(60,0,arch); - test1d(60,1,arch); - test1d(120,0,arch); - test1d(120,1,arch); - test1d(240,0,arch); - test1d(240,1,arch); - test1d(480,0,arch); - test1d(480,1,arch); - test1d(960,0,arch); - test1d(960,1,arch); - test1d(1920,0,arch); - test1d(1920,1,arch); -#endif - } - return ret; -} diff --git a/thirdparty/opus/celt/tests/test_unit_rotation.c b/thirdparty/opus/celt/tests/test_unit_rotation.c deleted file mode 100644 index 1080c2085d..0000000000 --- a/thirdparty/opus/celt/tests/test_unit_rotation.c +++ /dev/null @@ -1,120 +0,0 @@ -/* Copyright (c) 2008-2011 Xiph.Org Foundation - Written by Jean-Marc Valin */ -/* - Redistribution and use in source and binary forms, with or without - modification, are permitted provided that the following conditions - are met: - - - Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. - - - Redistributions in binary form must reproduce the above copyright - notice, this list of conditions and the following disclaimer in the - documentation and/or other materials provided with the distribution. - - THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER - OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, - EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, - PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR - PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF - LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING - NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS - SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -*/ - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#ifndef CUSTOM_MODES -#define CUSTOM_MODES -#endif - -#define CELT_C - -#include <stdio.h> -#include <stdlib.h> -#include "vq.c" -#include "cwrs.c" -#include "entcode.c" -#include "entenc.c" -#include "entdec.c" -#include "mathops.c" -#include "bands.h" -#include "pitch.c" -#include "celt_lpc.c" -#include "celt.c" -#include <math.h> - -#if defined(OPUS_X86_MAY_HAVE_SSE) || defined(OPUS_X86_MAY_HAVE_SSE2) || defined(OPUS_X86_MAY_HAVE_SSE4_1) -# if defined(OPUS_X86_MAY_HAVE_SSE) -# include "x86/pitch_sse.c" -# endif -# if defined(OPUS_X86_MAY_HAVE_SSE2) -# include "x86/pitch_sse2.c" -# endif -# if defined(OPUS_X86_MAY_HAVE_SSE4_1) -# include "x86/pitch_sse4_1.c" -# include "x86/celt_lpc_sse.c" -# endif -# include "x86/x86_celt_map.c" -#elif defined(OPUS_ARM_ASM) || defined(OPUS_ARM_MAY_HAVE_NEON_INTR) -# include "arm/armcpu.c" -# if defined(OPUS_ARM_MAY_HAVE_NEON_INTR) -# include "arm/celt_neon_intr.c" -# if defined(HAVE_ARM_NE10) -# include "kiss_fft.c" -# include "mdct.c" -# include "arm/celt_ne10_fft.c" -# include "arm/celt_ne10_mdct.c" -# endif -# endif -# include "arm/arm_celt_map.c" -#endif - -#define MAX_SIZE 100 - -int ret=0; -void test_rotation(int N, int K) -{ - int i; - double err = 0, ener = 0, snr, snr0; - opus_val16 x0[MAX_SIZE]; - opus_val16 x1[MAX_SIZE]; - for (i=0;i<N;i++) - x1[i] = x0[i] = rand()%32767-16384; - exp_rotation(x1, N, 1, 1, K, SPREAD_NORMAL); - for (i=0;i<N;i++) - { - err += (x0[i]-(double)x1[i])*(x0[i]-(double)x1[i]); - ener += x0[i]*(double)x0[i]; - } - snr0 = 20*log10(ener/err); - err = ener = 0; - exp_rotation(x1, N, -1, 1, K, SPREAD_NORMAL); - for (i=0;i<N;i++) - { - err += (x0[i]-(double)x1[i])*(x0[i]-(double)x1[i]); - ener += x0[i]*(double)x0[i]; - } - snr = 20*log10(ener/err); - printf ("SNR for size %d (%d pulses) is %f (was %f without inverse)\n", N, K, snr, snr0); - if (snr < 60 || snr0 > 20) - { - fprintf(stderr, "FAIL!\n"); - ret = 1; - } -} - -int main(void) -{ - ALLOC_STACK; - test_rotation(15, 3); - test_rotation(23, 5); - test_rotation(50, 3); - test_rotation(80, 1); - return ret; -} diff --git a/thirdparty/opus/celt/vq.c b/thirdparty/opus/celt/vq.c index d29f38fd8e..8011e22548 100644 --- a/thirdparty/opus/celt/vq.c +++ b/thirdparty/opus/celt/vq.c @@ -39,6 +39,10 @@ #include "rate.h" #include "pitch.h" +#if defined(MIPSr1_ASM) +#include "mips/vq_mipsr1.h" +#endif + #ifndef OVERRIDE_vq_exp_rotation1 static void exp_rotation1(celt_norm *X, int len, int stride, opus_val16 c, opus_val16 s) { @@ -67,7 +71,7 @@ static void exp_rotation1(celt_norm *X, int len, int stride, opus_val16 c, opus_ } #endif /* OVERRIDE_vq_exp_rotation1 */ -static void exp_rotation(celt_norm *X, int len, int dir, int stride, int K, int spread) +void exp_rotation(celt_norm *X, int len, int dir, int stride, int K, int spread) { static const int SPREAD_FACTOR[3]={15,10,5}; int i; @@ -158,42 +162,27 @@ static unsigned extract_collapse_mask(int *iy, int N, int B) return collapse_mask; } -unsigned alg_quant(celt_norm *X, int N, int K, int spread, int B, ec_enc *enc -#ifdef RESYNTH - , opus_val16 gain -#endif - ) +opus_val16 op_pvq_search_c(celt_norm *X, int *iy, int K, int N, int arch) { VARDECL(celt_norm, y); - VARDECL(int, iy); - VARDECL(opus_val16, signx); + VARDECL(int, signx); int i, j; - opus_val16 s; int pulsesLeft; opus_val32 sum; opus_val32 xy; opus_val16 yy; - unsigned collapse_mask; SAVE_STACK; - celt_assert2(K>0, "alg_quant() needs at least one pulse"); - celt_assert2(N>1, "alg_quant() needs at least two dimensions"); - + (void)arch; ALLOC(y, N, celt_norm); - ALLOC(iy, N, int); - ALLOC(signx, N, opus_val16); - - exp_rotation(X, N, 1, B, K, spread); + ALLOC(signx, N, int); /* Get rid of the sign */ sum = 0; j=0; do { - if (X[j]>0) - signx[j]=1; - else { - signx[j]=-1; - X[j]=-X[j]; - } + signx[j] = X[j]<0; + /* OPT: Make sure the compiler doesn't use a branch on ABS16(). */ + X[j] = ABS16(X[j]); iy[j] = 0; y[j] = 0; } while (++j<N); @@ -225,7 +214,12 @@ unsigned alg_quant(celt_norm *X, int N, int K, int spread, int B, ec_enc *enc while (++j<N); sum = QCONST16(1.f,14); } - rcp = EXTRACT16(MULT16_32_Q16(K-1, celt_rcp(sum))); +#ifdef FIXED_POINT + rcp = EXTRACT16(MULT16_32_Q16(K, celt_rcp(sum))); +#else + /* Using K+e with e < 1 guarantees we cannot get more than K pulses. */ + rcp = EXTRACT16(MULT16_32_Q16(K+0.8f, celt_rcp(sum))); +#endif j=0; do { #ifdef FIXED_POINT /* It's really important to round *towards zero* here */ @@ -240,12 +234,12 @@ unsigned alg_quant(celt_norm *X, int N, int K, int spread, int B, ec_enc *enc pulsesLeft -= iy[j]; } while (++j<N); } - celt_assert2(pulsesLeft>=1, "Allocated too many pulses in the quick pass"); + celt_sig_assert(pulsesLeft>=0); /* This should never happen, but just in case it does (e.g. on silence) we fill the first bin with pulses. */ #ifdef FIXED_POINT_DEBUG - celt_assert2(pulsesLeft<=N+3, "Not enough pulses in the quick pass"); + celt_sig_assert(pulsesLeft<=N+3); #endif if (pulsesLeft > N+3) { @@ -256,12 +250,12 @@ unsigned alg_quant(celt_norm *X, int N, int K, int spread, int B, ec_enc *enc pulsesLeft=0; } - s = 1; for (i=0;i<pulsesLeft;i++) { + opus_val16 Rxy, Ryy; int best_id; - opus_val32 best_num = -VERY_LARGE16; - opus_val16 best_den = 0; + opus_val32 best_num; + opus_val16 best_den; #ifdef FIXED_POINT int rshift; #endif @@ -272,9 +266,22 @@ unsigned alg_quant(celt_norm *X, int N, int K, int spread, int B, ec_enc *enc /* The squared magnitude term gets added anyway, so we might as well add it outside the loop */ yy = ADD16(yy, 1); - j=0; + + /* Calculations for position 0 are out of the loop, in part to reduce + mispredicted branches (since the if condition is usually false) + in the loop. */ + /* Temporary sums of the new pulse(s) */ + Rxy = EXTRACT16(SHR32(ADD32(xy, EXTEND32(X[0])),rshift)); + /* We're multiplying y[j] by two so we don't have to do it here */ + Ryy = ADD16(yy, y[0]); + + /* Approximate score: we maximise Rxy/sqrt(Ryy) (we're guaranteed that + Rxy is positive because the sign is pre-computed) */ + Rxy = MULT16_16_Q15(Rxy,Rxy); + best_den = Ryy; + best_num = Rxy; + j=1; do { - opus_val16 Rxy, Ryy; /* Temporary sums of the new pulse(s) */ Rxy = EXTRACT16(SHR32(ADD32(xy, EXTEND32(X[j])),rshift)); /* We're multiplying y[j] by two so we don't have to do it here */ @@ -285,8 +292,11 @@ unsigned alg_quant(celt_norm *X, int N, int K, int spread, int B, ec_enc *enc Rxy = MULT16_16_Q15(Rxy,Rxy); /* The idea is to check for num/den >= best_num/best_den, but that way we can do it without any division */ - /* OPT: Make sure to use conditional moves here */ - if (MULT16_16(best_den, Rxy) > MULT16_16(Ryy, best_num)) + /* OPT: It's not clear whether a cmov is faster than a branch here + since the condition is more often false than true and using + a cmov introduces data dependencies across iterations. The optimal + choice may be architecture-dependent. */ + if (opus_unlikely(MULT16_16(best_den, Rxy) > MULT16_16(Ryy, best_num))) { best_den = Ryy; best_num = Rxy; @@ -301,23 +311,47 @@ unsigned alg_quant(celt_norm *X, int N, int K, int spread, int B, ec_enc *enc /* Only now that we've made the final choice, update y/iy */ /* Multiplying y[j] by 2 so we don't have to do it everywhere else */ - y[best_id] += 2*s; + y[best_id] += 2; iy[best_id]++; } /* Put the original sign back */ j=0; do { - X[j] = MULT16_16(signx[j],X[j]); - if (signx[j] < 0) - iy[j] = -iy[j]; + /*iy[j] = signx[j] ? -iy[j] : iy[j];*/ + /* OPT: The is more likely to be compiled without a branch than the code above + but has the same performance otherwise. */ + iy[j] = (iy[j]^-signx[j]) + signx[j]; } while (++j<N); + RESTORE_STACK; + return yy; +} + +unsigned alg_quant(celt_norm *X, int N, int K, int spread, int B, ec_enc *enc, + opus_val16 gain, int resynth, int arch) +{ + VARDECL(int, iy); + opus_val16 yy; + unsigned collapse_mask; + SAVE_STACK; + + celt_assert2(K>0, "alg_quant() needs at least one pulse"); + celt_assert2(N>1, "alg_quant() needs at least two dimensions"); + + /* Covers vectorization by up to 4. */ + ALLOC(iy, N+3, int); + + exp_rotation(X, N, 1, B, K, spread); + + yy = op_pvq_search(X, iy, K, N, arch); + encode_pulses(iy, N, K, enc); -#ifdef RESYNTH - normalise_residual(iy, X, N, yy, gain); - exp_rotation(X, N, -1, B, K, spread); -#endif + if (resynth) + { + normalise_residual(iy, X, N, yy, gain); + exp_rotation(X, N, -1, B, K, spread); + } collapse_mask = extract_collapse_mask(iy, N, B); RESTORE_STACK; @@ -401,7 +435,7 @@ int stereo_itheta(const celt_norm *X, const celt_norm *Y, int stereo, int N, int /* 0.63662 = 2/pi */ itheta = MULT16_16_Q15(QCONST16(0.63662f,15),celt_atan2p(side, mid)); #else - itheta = (int)floor(.5f+16384*0.63662f*atan2(side,mid)); + itheta = (int)floor(.5f+16384*0.63662f*fast_atan2f(side,mid)); #endif return itheta; diff --git a/thirdparty/opus/celt/vq.h b/thirdparty/opus/celt/vq.h index 5cfcbe50ea..45ec55918e 100644 --- a/thirdparty/opus/celt/vq.h +++ b/thirdparty/opus/celt/vq.h @@ -37,10 +37,18 @@ #include "entdec.h" #include "modes.h" -#if defined(MIPSr1_ASM) -#include "mips/vq_mipsr1.h" +#if (defined(OPUS_X86_MAY_HAVE_SSE2) && !defined(FIXED_POINT)) +#include "x86/vq_sse.h" #endif +void exp_rotation(celt_norm *X, int len, int dir, int stride, int K, int spread); + +opus_val16 op_pvq_search_c(celt_norm *X, int *iy, int K, int N, int arch); + +#if !defined(OVERRIDE_OP_PVQ_SEARCH) +#define op_pvq_search(x, iy, K, N, arch) \ + (op_pvq_search_c(x, iy, K, N, arch)) +#endif /** Algebraic pulse-vector quantiser. The signal x is replaced by the sum of * the pitch and a combination of pulses such that its norm is still equal @@ -51,12 +59,8 @@ * @param enc Entropy encoder state * @ret A mask indicating which blocks in the band received pulses */ -unsigned alg_quant(celt_norm *X, int N, int K, int spread, int B, - ec_enc *enc -#ifdef RESYNTH - , opus_val16 gain -#endif - ); +unsigned alg_quant(celt_norm *X, int N, int K, int spread, int B, ec_enc *enc, + opus_val16 gain, int resynth, int arch); /** Algebraic pulse decoder * @param X Decoded normalised spectrum (returned) diff --git a/thirdparty/opus/celt/x86/celt_lpc_sse.h b/thirdparty/opus/celt/x86/celt_lpc_sse.h index c5ec796ed5..7d1ecf7533 100644 --- a/thirdparty/opus/celt/x86/celt_lpc_sse.h +++ b/thirdparty/opus/celt/x86/celt_lpc_sse.h @@ -41,12 +41,11 @@ void celt_fir_sse4_1( opus_val16 *y, int N, int ord, - opus_val16 *mem, int arch); #if defined(OPUS_X86_PRESUME_SSE4_1) -#define celt_fir(x, num, y, N, ord, mem, arch) \ - ((void)arch, celt_fir_sse4_1(x, num, y, N, ord, mem, arch)) +#define celt_fir(x, num, y, N, ord, arch) \ + ((void)arch, celt_fir_sse4_1(x, num, y, N, ord, arch)) #else @@ -56,11 +55,10 @@ extern void (*const CELT_FIR_IMPL[OPUS_ARCHMASK + 1])( opus_val16 *y, int N, int ord, - opus_val16 *mem, int arch); -# define celt_fir(x, num, y, N, ord, mem, arch) \ - ((*CELT_FIR_IMPL[(arch) & OPUS_ARCHMASK])(x, num, y, N, ord, mem, arch)) +# define celt_fir(x, num, y, N, ord, arch) \ + ((*CELT_FIR_IMPL[(arch) & OPUS_ARCHMASK])(x, num, y, N, ord, arch)) #endif #endif diff --git a/thirdparty/opus/celt/x86/celt_lpc_sse.c b/thirdparty/opus/celt/x86/celt_lpc_sse4_1.c index 67e5592acf..5478568849 100644 --- a/thirdparty/opus/celt/x86/celt_lpc_sse.c +++ b/thirdparty/opus/celt/x86/celt_lpc_sse4_1.c @@ -40,65 +40,23 @@ #if defined(FIXED_POINT) -void celt_fir_sse4_1(const opus_val16 *_x, +void celt_fir_sse4_1(const opus_val16 *x, const opus_val16 *num, - opus_val16 *_y, + opus_val16 *y, int N, int ord, - opus_val16 *mem, int arch) { int i,j; VARDECL(opus_val16, rnum); - VARDECL(opus_val16, x); __m128i vecNoA; opus_int32 noA ; SAVE_STACK; ALLOC(rnum, ord, opus_val16); - ALLOC(x, N+ord, opus_val16); for(i=0;i<ord;i++) rnum[i] = num[ord-i-1]; - for(i=0;i<ord;i++) - x[i] = mem[ord-i-1]; - - for (i=0;i<N-7;i+=8) - { - x[i+ord ]=_x[i ]; - x[i+ord+1]=_x[i+1]; - x[i+ord+2]=_x[i+2]; - x[i+ord+3]=_x[i+3]; - x[i+ord+4]=_x[i+4]; - x[i+ord+5]=_x[i+5]; - x[i+ord+6]=_x[i+6]; - x[i+ord+7]=_x[i+7]; - } - - for (;i<N-3;i+=4) - { - x[i+ord ]=_x[i ]; - x[i+ord+1]=_x[i+1]; - x[i+ord+2]=_x[i+2]; - x[i+ord+3]=_x[i+3]; - } - - for (;i<N;i++) - x[i+ord]=_x[i]; - - for(i=0;i<ord;i++) - mem[i] = _x[N-i-1]; -#ifdef SMALL_FOOTPRINT - for (i=0;i<N;i++) - { - opus_val32 sum = SHL32(EXTEND32(_x[i]), SIG_SHIFT); - for (j=0;j<ord;j++) - { - sum = MAC16_16(sum,rnum[j],x[i+j]); - } - _y[i] = SATURATE16(PSHR32(sum, SIG_SHIFT)); - } -#else noA = EXTEND32(1) << SIG_SHIFT >> 1; vecNoA = _mm_set_epi32(noA, noA, noA, noA); @@ -107,25 +65,24 @@ void celt_fir_sse4_1(const opus_val16 *_x, opus_val32 sums[4] = {0}; __m128i vecSum, vecX; - xcorr_kernel(rnum, x+i, sums, ord, arch); + xcorr_kernel(rnum, x+i-ord, sums, ord, arch); vecSum = _mm_loadu_si128((__m128i *)sums); vecSum = _mm_add_epi32(vecSum, vecNoA); vecSum = _mm_srai_epi32(vecSum, SIG_SHIFT); - vecX = OP_CVTEPI16_EPI32_M64(_x + i); + vecX = OP_CVTEPI16_EPI32_M64(x + i); vecSum = _mm_add_epi32(vecSum, vecX); vecSum = _mm_packs_epi32(vecSum, vecSum); - _mm_storel_epi64((__m128i *)(_y + i), vecSum); + _mm_storel_epi64((__m128i *)(y + i), vecSum); } for (;i<N;i++) { opus_val32 sum = 0; for (j=0;j<ord;j++) - sum = MAC16_16(sum, rnum[j], x[i + j]); - _y[i] = SATURATE16(ADD32(EXTEND32(_x[i]), PSHR32(sum, SIG_SHIFT))); + sum = MAC16_16(sum, rnum[j], x[i+j-ord]); + y[i] = SATURATE16(ADD32(EXTEND32(x[i]), PSHR32(sum, SIG_SHIFT))); } -#endif RESTORE_STACK; } diff --git a/thirdparty/opus/celt/x86/vq_sse.h b/thirdparty/opus/celt/x86/vq_sse.h new file mode 100644 index 0000000000..b4efe8f249 --- /dev/null +++ b/thirdparty/opus/celt/x86/vq_sse.h @@ -0,0 +1,50 @@ +/* Copyright (c) 2016 Jean-Marc Valin */ +/* + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + - Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + - Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER + OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, + EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, + PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR + PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifndef VQ_SSE_H +#define VQ_SSE_H + +#if defined(OPUS_X86_MAY_HAVE_SSE2) && !defined(FIXED_POINT) +#define OVERRIDE_OP_PVQ_SEARCH + +opus_val16 op_pvq_search_sse2(celt_norm *_X, int *iy, int K, int N, int arch); + +#if defined(OPUS_X86_PRESUME_SSE2) +#define op_pvq_search(x, iy, K, N, arch) \ + (op_pvq_search_sse2(x, iy, K, N, arch)) + +#else + +extern opus_val16 (*const OP_PVQ_SEARCH_IMPL[OPUS_ARCHMASK + 1])( + celt_norm *_X, int *iy, int K, int N, int arch); + +# define op_pvq_search(X, iy, K, N, arch) \ + ((*OP_PVQ_SEARCH_IMPL[(arch) & OPUS_ARCHMASK])(X, iy, K, N, arch)) + +#endif +#endif + +#endif diff --git a/thirdparty/opus/celt/x86/vq_sse2.c b/thirdparty/opus/celt/x86/vq_sse2.c new file mode 100644 index 0000000000..775042860d --- /dev/null +++ b/thirdparty/opus/celt/x86/vq_sse2.c @@ -0,0 +1,217 @@ +/* Copyright (c) 2007-2008 CSIRO + Copyright (c) 2007-2009 Xiph.Org Foundation + Copyright (c) 2007-2016 Jean-Marc Valin */ +/* + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + - Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + - Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER + OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, + EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, + PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR + PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifdef HAVE_CONFIG_H +#include "config.h" +#endif + +#include <xmmintrin.h> +#include <emmintrin.h> +#include "celt_lpc.h" +#include "stack_alloc.h" +#include "mathops.h" +#include "vq.h" +#include "x86cpu.h" + + +#ifndef FIXED_POINT + +opus_val16 op_pvq_search_sse2(celt_norm *_X, int *iy, int K, int N, int arch) +{ + int i, j; + int pulsesLeft; + float xy, yy; + VARDECL(celt_norm, y); + VARDECL(celt_norm, X); + VARDECL(float, signy); + __m128 signmask; + __m128 sums; + __m128i fours; + SAVE_STACK; + + (void)arch; + /* All bits set to zero, except for the sign bit. */ + signmask = _mm_set_ps1(-0.f); + fours = _mm_set_epi32(4, 4, 4, 4); + ALLOC(y, N+3, celt_norm); + ALLOC(X, N+3, celt_norm); + ALLOC(signy, N+3, float); + + OPUS_COPY(X, _X, N); + X[N] = X[N+1] = X[N+2] = 0; + sums = _mm_setzero_ps(); + for (j=0;j<N;j+=4) + { + __m128 x4, s4; + x4 = _mm_loadu_ps(&X[j]); + s4 = _mm_cmplt_ps(x4, _mm_setzero_ps()); + /* Get rid of the sign */ + x4 = _mm_andnot_ps(signmask, x4); + sums = _mm_add_ps(sums, x4); + /* Clear y and iy in case we don't do the projection. */ + _mm_storeu_ps(&y[j], _mm_setzero_ps()); + _mm_storeu_si128((__m128i*)&iy[j], _mm_setzero_si128()); + _mm_storeu_ps(&X[j], x4); + _mm_storeu_ps(&signy[j], s4); + } + sums = _mm_add_ps(sums, _mm_shuffle_ps(sums, sums, _MM_SHUFFLE(1, 0, 3, 2))); + sums = _mm_add_ps(sums, _mm_shuffle_ps(sums, sums, _MM_SHUFFLE(2, 3, 0, 1))); + + xy = yy = 0; + + pulsesLeft = K; + + /* Do a pre-search by projecting on the pyramid */ + if (K > (N>>1)) + { + __m128i pulses_sum; + __m128 yy4, xy4; + __m128 rcp4; + opus_val32 sum = _mm_cvtss_f32(sums); + /* If X is too small, just replace it with a pulse at 0 */ + /* Prevents infinities and NaNs from causing too many pulses + to be allocated. 64 is an approximation of infinity here. */ + if (!(sum > EPSILON && sum < 64)) + { + X[0] = QCONST16(1.f,14); + j=1; do + X[j]=0; + while (++j<N); + sums = _mm_set_ps1(1.f); + } + /* Using K+e with e < 1 guarantees we cannot get more than K pulses. */ + rcp4 = _mm_mul_ps(_mm_set_ps1((float)(K+.8)), _mm_rcp_ps(sums)); + xy4 = yy4 = _mm_setzero_ps(); + pulses_sum = _mm_setzero_si128(); + for (j=0;j<N;j+=4) + { + __m128 rx4, x4, y4; + __m128i iy4; + x4 = _mm_loadu_ps(&X[j]); + rx4 = _mm_mul_ps(x4, rcp4); + iy4 = _mm_cvttps_epi32(rx4); + pulses_sum = _mm_add_epi32(pulses_sum, iy4); + _mm_storeu_si128((__m128i*)&iy[j], iy4); + y4 = _mm_cvtepi32_ps(iy4); + xy4 = _mm_add_ps(xy4, _mm_mul_ps(x4, y4)); + yy4 = _mm_add_ps(yy4, _mm_mul_ps(y4, y4)); + /* double the y[] vector so we don't have to do it in the search loop. */ + _mm_storeu_ps(&y[j], _mm_add_ps(y4, y4)); + } + pulses_sum = _mm_add_epi32(pulses_sum, _mm_shuffle_epi32(pulses_sum, _MM_SHUFFLE(1, 0, 3, 2))); + pulses_sum = _mm_add_epi32(pulses_sum, _mm_shuffle_epi32(pulses_sum, _MM_SHUFFLE(2, 3, 0, 1))); + pulsesLeft -= _mm_cvtsi128_si32(pulses_sum); + xy4 = _mm_add_ps(xy4, _mm_shuffle_ps(xy4, xy4, _MM_SHUFFLE(1, 0, 3, 2))); + xy4 = _mm_add_ps(xy4, _mm_shuffle_ps(xy4, xy4, _MM_SHUFFLE(2, 3, 0, 1))); + xy = _mm_cvtss_f32(xy4); + yy4 = _mm_add_ps(yy4, _mm_shuffle_ps(yy4, yy4, _MM_SHUFFLE(1, 0, 3, 2))); + yy4 = _mm_add_ps(yy4, _mm_shuffle_ps(yy4, yy4, _MM_SHUFFLE(2, 3, 0, 1))); + yy = _mm_cvtss_f32(yy4); + } + X[N] = X[N+1] = X[N+2] = -100; + y[N] = y[N+1] = y[N+2] = 100; + celt_sig_assert(pulsesLeft>=0); + + /* This should never happen, but just in case it does (e.g. on silence) + we fill the first bin with pulses. */ + if (pulsesLeft > N+3) + { + opus_val16 tmp = (opus_val16)pulsesLeft; + yy = MAC16_16(yy, tmp, tmp); + yy = MAC16_16(yy, tmp, y[0]); + iy[0] += pulsesLeft; + pulsesLeft=0; + } + + for (i=0;i<pulsesLeft;i++) + { + int best_id; + __m128 xy4, yy4; + __m128 max, max2; + __m128i count; + __m128i pos; + /* The squared magnitude term gets added anyway, so we might as well + add it outside the loop */ + yy = ADD16(yy, 1); + xy4 = _mm_load1_ps(&xy); + yy4 = _mm_load1_ps(&yy); + max = _mm_setzero_ps(); + pos = _mm_setzero_si128(); + count = _mm_set_epi32(3, 2, 1, 0); + for (j=0;j<N;j+=4) + { + __m128 x4, y4, r4; + x4 = _mm_loadu_ps(&X[j]); + y4 = _mm_loadu_ps(&y[j]); + x4 = _mm_add_ps(x4, xy4); + y4 = _mm_add_ps(y4, yy4); + y4 = _mm_rsqrt_ps(y4); + r4 = _mm_mul_ps(x4, y4); + /* Update the index of the max. */ + pos = _mm_max_epi16(pos, _mm_and_si128(count, _mm_castps_si128(_mm_cmpgt_ps(r4, max)))); + /* Update the max. */ + max = _mm_max_ps(max, r4); + /* Update the indices (+4) */ + count = _mm_add_epi32(count, fours); + } + /* Horizontal max */ + max2 = _mm_max_ps(max, _mm_shuffle_ps(max, max, _MM_SHUFFLE(1, 0, 3, 2))); + max2 = _mm_max_ps(max2, _mm_shuffle_ps(max2, max2, _MM_SHUFFLE(2, 3, 0, 1))); + /* Now that max2 contains the max at all positions, look at which value(s) of the + partial max is equal to the global max. */ + pos = _mm_and_si128(pos, _mm_castps_si128(_mm_cmpeq_ps(max, max2))); + pos = _mm_max_epi16(pos, _mm_unpackhi_epi64(pos, pos)); + pos = _mm_max_epi16(pos, _mm_shufflelo_epi16(pos, _MM_SHUFFLE(1, 0, 3, 2))); + best_id = _mm_cvtsi128_si32(pos); + + /* Updating the sums of the new pulse(s) */ + xy = ADD32(xy, EXTEND32(X[best_id])); + /* We're multiplying y[j] by two so we don't have to do it here */ + yy = ADD16(yy, y[best_id]); + + /* Only now that we've made the final choice, update y/iy */ + /* Multiplying y[j] by 2 so we don't have to do it everywhere else */ + y[best_id] += 2; + iy[best_id]++; + } + + /* Put the original sign back */ + for (j=0;j<N;j+=4) + { + __m128i y4; + __m128i s4; + y4 = _mm_loadu_si128((__m128i*)&iy[j]); + s4 = _mm_castps_si128(_mm_loadu_ps(&signy[j])); + y4 = _mm_xor_si128(_mm_add_epi32(y4, s4), s4); + _mm_storeu_si128((__m128i*)&iy[j], y4); + } + RESTORE_STACK; + return yy; +} + +#endif diff --git a/thirdparty/opus/celt/x86/x86_celt_map.c b/thirdparty/opus/celt/x86/x86_celt_map.c index 47ba41b9ee..d39d88edec 100644 --- a/thirdparty/opus/celt/x86/x86_celt_map.c +++ b/thirdparty/opus/celt/x86/x86_celt_map.c @@ -33,6 +33,7 @@ #include "celt_lpc.h" #include "pitch.h" #include "pitch_sse.h" +#include "vq.h" #if defined(OPUS_HAVE_RTCD) @@ -46,7 +47,6 @@ void (*const CELT_FIR_IMPL[OPUS_ARCHMASK + 1])( opus_val16 *y, int N, int ord, - opus_val16 *mem, int arch ) = { celt_fir_c, /* non-sse */ @@ -151,5 +151,17 @@ void (*const COMB_FILTER_CONST_IMPL[OPUS_ARCHMASK + 1])( #endif +#if defined(OPUS_X86_MAY_HAVE_SSE2) && !defined(OPUS_X86_PRESUME_SSE2) +opus_val16 (*const OP_PVQ_SEARCH_IMPL[OPUS_ARCHMASK + 1])( + celt_norm *_X, int *iy, int K, int N, int arch +) = { + op_pvq_search_c, /* non-sse */ + op_pvq_search_c, + MAY_HAVE_SSE2(op_pvq_search), + MAY_HAVE_SSE2(op_pvq_search), + MAY_HAVE_SSE2(op_pvq_search) +}; +#endif + #endif #endif diff --git a/thirdparty/opus/celt/x86/x86cpu.h b/thirdparty/opus/celt/x86/x86cpu.h index 04fd48aac4..1e2bf17b9b 100644 --- a/thirdparty/opus/celt/x86/x86cpu.h +++ b/thirdparty/opus/celt/x86/x86cpu.h @@ -82,7 +82,9 @@ int opus_select_arch(void); (_mm_cvtepi8_epi32(*(__m128i *)(x))) #endif -# if !defined(__OPTIMIZE__) +/* similar reasoning about the instruction sequence as in the 32-bit macro above, + */ +# if defined(__clang__) || !defined(__OPTIMIZE__) # define OP_CVTEPI16_EPI32_M64(x) \ (_mm_cvtepi16_epi32(_mm_loadl_epi64((__m128i *)(x)))) # else diff --git a/thirdparty/opus/config.h b/thirdparty/opus/config.h index 7b9c92c6a8..3ed0874d4b 100644 --- a/thirdparty/opus/config.h +++ b/thirdparty/opus/config.h @@ -1,5 +1,44 @@ -/* Opus configuration header */ -/* Based on the output of libopus configure script */ +/* config.h. Generated from config.h.in by configure. */ +/* config.h.in. Generated from configure.ac by autoheader. */ + +/* Get CPU Info by asm method */ +#define CPU_INFO_BY_ASM 1 + +/* Get CPU Info by c method */ +/* #undef CPU_INFO_BY_C */ + +/* Custom modes */ +/* #undef CUSTOM_MODES */ + +/* Do not build the float API */ +/* #undef DISABLE_FLOAT_API */ + +/* Disable bitstream fixes from RFC 8251 */ +/* #undef DISABLE_UPDATE_DRAFT */ + +/* Assertions */ +/* #undef ENABLE_ASSERTIONS */ + +/* Hardening */ +#define ENABLE_HARDENING 1 + +/* Debug fixed-point implementation */ +/* #undef FIXED_DEBUG */ + +/* Compile as fixed-point (for machines without a fast enough FPU) */ +/* #undef FIXED_POINT */ + +/* Float approximations */ +/* #undef FLOAT_APPROX */ + +/* Fuzzing */ +/* #undef FUZZING */ + +/* Define to 1 if you have the <alloca.h> header file. */ +/* #undef HAVE_ALLOCA_H */ + +/* NE10 library is installed on host. Make sure it is on target! */ +/* #undef HAVE_ARM_NE10 */ /* Define to 1 if you have the <dlfcn.h> header file. */ #define HAVE_DLFCN_H 1 @@ -7,16 +46,12 @@ /* Define to 1 if you have the <inttypes.h> header file. */ #define HAVE_INTTYPES_H 1 -#if (!defined( _MSC_VER ) || ( _MSC_VER >= 1800 )) - /* Define to 1 if you have the `lrint' function. */ #define HAVE_LRINT 1 /* Define to 1 if you have the `lrintf' function. */ #define HAVE_LRINTF 1 -#endif - /* Define to 1 if you have the <memory.h> header file. */ #define HAVE_MEMORY_H 1 @@ -41,8 +76,10 @@ /* Define to 1 if you have the <unistd.h> header file. */ #define HAVE_UNISTD_H 1 -/* Define to the sub-directory in which libtool stores uninstalled libraries. - */ +/* Define to 1 if you have the `__malloc_hook' function. */ +#define HAVE___MALLOC_HOOK 1 + +/* Define to the sub-directory where libtool stores uninstalled libraries. */ #define LT_OBJDIR ".libs/" #ifdef OPUS_ARM_OPT @@ -92,9 +129,80 @@ #endif // OPUS_ARM64_OPT +/* Define if binary requires Aarch64 Neon Intrinsics */ +/* #undef OPUS_ARM_PRESUME_AARCH64_NEON_INTR */ + +/* Define if binary requires EDSP instruction support */ +/* #undef OPUS_ARM_PRESUME_EDSP */ + +/* Define if binary requires ARMv6 media instruction support */ +/* #undef OPUS_ARM_PRESUME_MEDIA */ + +/* Define if binary requires NEON instruction support */ +/* #undef OPUS_ARM_PRESUME_NEON */ + +/* Define if binary requires NEON intrinsics support */ +/* #undef OPUS_ARM_PRESUME_NEON_INTR */ + /* This is a build of OPUS */ #define OPUS_BUILD /**/ +/* Run bit-exactness checks between optimized and c implementations */ +/* #undef OPUS_CHECK_ASM */ + +#ifndef OPUS_ARM_OPT +/* Use run-time CPU capabilities detection */ +#define OPUS_HAVE_RTCD 1 +#endif + +/* Compiler supports X86 AVX Intrinsics */ +/* #define OPUS_X86_MAY_HAVE_AVX */ + +/* Compiler supports X86 SSE Intrinsics */ +/* #define OPUS_X86_MAY_HAVE_SSE */ + +/* Compiler supports X86 SSE2 Intrinsics */ +/* #define OPUS_X86_MAY_HAVE_SSE2 */ + +/* Compiler supports X86 SSE4.1 Intrinsics */ +/* #define OPUS_X86_MAY_HAVE_SSE4_1 */ + +/* Define if binary requires AVX intrinsics support */ +/* #undef OPUS_X86_PRESUME_AVX */ + +/* Define if binary requires SSE intrinsics support */ +#define OPUS_X86_PRESUME_SSE 1 + +/* Define if binary requires SSE2 intrinsics support */ +#define OPUS_X86_PRESUME_SSE2 1 + +/* Define if binary requires SSE4.1 intrinsics support */ +#define OPUS_X86_PRESUME_SSE4_1 1 + +/* Define to the address where bug reports for this package should be sent. */ +#define PACKAGE_BUGREPORT "opus@xiph.org" + +/* Define to the full name of this package. */ +#define PACKAGE_NAME "opus" + +/* Define to the full name and version of this package. */ +#define PACKAGE_STRING "opus 1.3.1" + +/* Define to the one symbol short name of this package. */ +#define PACKAGE_TARNAME "opus" + +/* Define to the home page for this package. */ +#define PACKAGE_URL "" + +/* Define to the version of this package. */ +#define PACKAGE_VERSION "1.3.1" + +/* Define to 1 if you have the ANSI C header files. */ +#define STDC_HEADERS 1 + +/* Make use of alloca */ +/* #undef USE_ALLOCA */ + #ifndef WIN32 /* Use C99 variable-size arrays */ #define VAR_ARRAYS 1 @@ -103,11 +211,13 @@ #define USE_ALLOCA 1 #endif +/* Define to empty if `const' does not conform to ANSI C. */ +/* #undef const */ + #ifndef OPUS_FIXED_POINT #define FLOAT_APPROX 1 #endif - /* Define to `__inline__' or `__inline' if that's what the C compiler calls it, or to nothing if 'inline' is not supported under any name. */ #ifndef __cplusplus @@ -117,11 +227,7 @@ /* Define to the equivalent of the C99 'restrict' keyword, or to nothing if this is not supported. Do not define if restrict is supported directly. */ -#if (!defined( _MSC_VER ) || ( _MSC_VER >= 1800 )) #define restrict __restrict -#else -#undef restrict -#endif /* Work around a bug in Sun C++: it does not support _Restrict or __restrict__, even though the corresponding Sun C compiler ends up with "#define restrict _Restrict" or "#define restrict __restrict__" in the diff --git a/thirdparty/opus/info.c b/thirdparty/opus/info.c index c36f9a9ee1..3a1a5bf75b 100644 --- a/thirdparty/opus/info.c +++ b/thirdparty/opus/info.c @@ -107,26 +107,32 @@ static int op_tags_ensure_capacity(OpusTags *_tags,size_t _ncomments){ char **user_comments; int *comment_lengths; int cur_ncomments; - char *binary_suffix_data; - int binary_suffix_len; size_t size; if(OP_UNLIKELY(_ncomments>=(size_t)INT_MAX))return OP_EFAULT; size=sizeof(*_tags->comment_lengths)*(_ncomments+1); if(size/sizeof(*_tags->comment_lengths)!=_ncomments+1)return OP_EFAULT; cur_ncomments=_tags->comments; - comment_lengths=_tags->comment_lengths; - binary_suffix_len=comment_lengths==NULL?0:comment_lengths[cur_ncomments]; + /*We only support growing. + Trimming requires cleaning up the allocated strings in the old space, and + is best handled separately if it's ever needed.*/ + OP_ASSERT(_ncomments>=(size_t)cur_ncomments); comment_lengths=(int *)_ogg_realloc(_tags->comment_lengths,size); if(OP_UNLIKELY(comment_lengths==NULL))return OP_EFAULT; - comment_lengths[_ncomments]=binary_suffix_len; + if(_tags->comment_lengths==NULL){ + OP_ASSERT(cur_ncomments==0); + comment_lengths[cur_ncomments]=0; + } + comment_lengths[_ncomments]=comment_lengths[cur_ncomments]; _tags->comment_lengths=comment_lengths; size=sizeof(*_tags->user_comments)*(_ncomments+1); if(size/sizeof(*_tags->user_comments)!=_ncomments+1)return OP_EFAULT; - user_comments=_tags->user_comments; - binary_suffix_data=user_comments==NULL?NULL:user_comments[cur_ncomments]; user_comments=(char **)_ogg_realloc(_tags->user_comments,size); if(OP_UNLIKELY(user_comments==NULL))return OP_EFAULT; - user_comments[_ncomments]=binary_suffix_data; + if(_tags->user_comments==NULL){ + OP_ASSERT(cur_ncomments==0); + user_comments[cur_ncomments]=NULL; + } + user_comments[_ncomments]=user_comments[cur_ncomments]; _tags->user_comments=user_comments; return 0; } @@ -275,28 +281,30 @@ int opus_tags_copy(OpusTags *_dst,const OpusTags *_src){ ret=opus_tags_copy_impl(&dst,_src); if(OP_UNLIKELY(ret<0))opus_tags_clear(&dst); else *_dst=*&dst; - return 0; + return ret; } int opus_tags_add(OpusTags *_tags,const char *_tag,const char *_value){ - char *comment; - int tag_len; - int value_len; - int ncomments; - int ret; + char *comment; + size_t tag_len; + size_t value_len; + int ncomments; + int ret; ncomments=_tags->comments; ret=op_tags_ensure_capacity(_tags,ncomments+1); if(OP_UNLIKELY(ret<0))return ret; tag_len=strlen(_tag); value_len=strlen(_value); /*+2 for '=' and '\0'.*/ + if(tag_len+value_len<tag_len)return OP_EFAULT; + if(tag_len+value_len>(size_t)INT_MAX-2)return OP_EFAULT; comment=(char *)_ogg_malloc(sizeof(*comment)*(tag_len+value_len+2)); if(OP_UNLIKELY(comment==NULL))return OP_EFAULT; memcpy(comment,_tag,sizeof(*comment)*tag_len); comment[tag_len]='='; memcpy(comment+tag_len+1,_value,sizeof(*comment)*(value_len+1)); _tags->user_comments[ncomments]=comment; - _tags->comment_lengths[ncomments]=tag_len+value_len+1; + _tags->comment_lengths[ncomments]=(int)(tag_len+value_len+1); _tags->comments=ncomments+1; return 0; } @@ -337,7 +345,10 @@ int opus_tags_set_binary_suffix(OpusTags *_tags, } int opus_tagcompare(const char *_tag_name,const char *_comment){ - return opus_tagncompare(_tag_name,strlen(_tag_name),_comment); + size_t tag_len; + tag_len=strlen(_tag_name); + if(OP_UNLIKELY(tag_len>(size_t)INT_MAX))return -1; + return opus_tagncompare(_tag_name,(int)tag_len,_comment); } int opus_tagncompare(const char *_tag_name,int _tag_len,const char *_comment){ @@ -348,17 +359,18 @@ int opus_tagncompare(const char *_tag_name,int _tag_len,const char *_comment){ } const char *opus_tags_query(const OpusTags *_tags,const char *_tag,int _count){ - char **user_comments; - int tag_len; - int found; - int ncomments; - int ci; + char **user_comments; + size_t tag_len; + int found; + int ncomments; + int ci; tag_len=strlen(_tag); + if(OP_UNLIKELY(tag_len>(size_t)INT_MAX))return NULL; ncomments=_tags->comments; user_comments=_tags->user_comments; found=0; for(ci=0;ci<ncomments;ci++){ - if(!opus_tagncompare(_tag,tag_len,user_comments[ci])){ + if(!opus_tagncompare(_tag,(int)tag_len,user_comments[ci])){ /*We return a pointer to the data, not a copy.*/ if(_count==found++)return user_comments[ci]+tag_len+1; } @@ -368,17 +380,18 @@ const char *opus_tags_query(const OpusTags *_tags,const char *_tag,int _count){ } int opus_tags_query_count(const OpusTags *_tags,const char *_tag){ - char **user_comments; - int tag_len; - int found; - int ncomments; - int ci; + char **user_comments; + size_t tag_len; + int found; + int ncomments; + int ci; tag_len=strlen(_tag); + if(OP_UNLIKELY(tag_len>(size_t)INT_MAX))return 0; ncomments=_tags->comments; user_comments=_tags->user_comments; found=0; for(ci=0;ci<ncomments;ci++){ - if(!opus_tagncompare(_tag,tag_len,user_comments[ci]))found++; + if(!opus_tagncompare(_tag,(int)tag_len,user_comments[ci]))found++; } return found; } @@ -403,7 +416,8 @@ static int opus_tags_get_gain(const OpusTags *_tags,int *_gain_q8, ncomments=_tags->comments; /*Look for the first valid tag with the name _tag_name and use that.*/ for(ci=0;ci<ncomments;ci++){ - if(opus_tagncompare(_tag_name,_tag_len,comments[ci])==0){ + OP_ASSERT(_tag_len<=(size_t)INT_MAX); + if(opus_tagncompare(_tag_name,(int)_tag_len,comments[ci])==0){ char *p; opus_int32 gain_q8; int negative; @@ -439,8 +453,7 @@ int opus_tags_get_track_gain(const OpusTags *_tags,int *_gain_q8){ } static int op_is_jpeg(const unsigned char *_buf,size_t _buf_sz){ - return _buf_sz>=11&&memcmp(_buf,"\xFF\xD8\xFF\xE0",4)==0 - &&(_buf[4]<<8|_buf[5])>=16&&memcmp(_buf+6,"JFIF",5)==0; + return _buf_sz>=3&&memcmp(_buf,"\xFF\xD8\xFF",3)==0; } /*Tries to extract the width, height, bits per pixel, and palette size of a diff --git a/thirdparty/opus/internal.h b/thirdparty/opus/internal.h index ee48ea34c9..9ac17e028f 100644 --- a/thirdparty/opus/internal.h +++ b/thirdparty/opus/internal.h @@ -136,6 +136,9 @@ struct OggOpusLink{ that end-trimming calculations work properly. This is only valid for seekable sources.*/ opus_int64 end_offset; + /*The total duration of all prior links. + This is always zero for non-seekable sources.*/ + ogg_int64_t pcm_file_offset; /*The granule position of the last sample. This is only valid for seekable sources.*/ ogg_int64_t pcm_end; @@ -150,23 +153,25 @@ struct OggOpusLink{ }; struct OggOpusFile{ - /*The callbacks used to access the data source.*/ + /*The callbacks used to access the stream.*/ OpusFileCallbacks callbacks; - /*A FILE *, memory bufer, etc.*/ - void *source; - /*Whether or not we can seek with this data source.*/ + /*A FILE *, memory buffer, etc.*/ + void *stream; + /*Whether or not we can seek with this stream.*/ int seekable; /*The number of links in this chained Ogg Opus file.*/ int nlinks; /*The cached information from each link in a chained Ogg Opus file. - If source isn't seekable (e.g., it's a pipe), only the current link + If stream isn't seekable (e.g., it's a pipe), only the current link appears.*/ OggOpusLink *links; /*The number of serial numbers from a single link.*/ int nserialnos; /*The capacity of the list of serial numbers from a single link.*/ int cserialnos; - /*Storage for the list of serial numbers from a single link.*/ + /*Storage for the list of serial numbers from a single link. + This is a scratch buffer used when scanning the BOS pages at the start of + each link.*/ ogg_uint32_t *serialnos; /*This is the current offset of the data processed by the ogg_sync_state. After a seek, this should be set to the target offset so that we can track @@ -174,9 +179,9 @@ struct OggOpusFile{ After a call to op_get_next_page(), this will point to the first byte after that page.*/ opus_int64 offset; - /*The total size of this data source, or -1 if it's unseekable.*/ + /*The total size of this stream, or -1 if it's unseekable.*/ opus_int64 end; - /*Used to locate pages in the data source.*/ + /*Used to locate pages in the stream.*/ ogg_sync_state oy; /*One of OP_NOTOPEN, OP_PARTOPEN, OP_OPENED, OP_STREAMSET, OP_INITSET.*/ int ready_state; @@ -227,7 +232,7 @@ struct OggOpusFile{ /*The number of valid samples in the decoded buffer.*/ int od_buffer_size; /*The type of gain offset to apply. - One of OP_HEADER_GAIN, OP_TRACK_GAIN, or OP_ABSOLUTE_GAIN.*/ + One of OP_HEADER_GAIN, OP_ALBUM_GAIN, OP_TRACK_GAIN, or OP_ABSOLUTE_GAIN.*/ int gain_type; /*The offset to apply to the gain.*/ opus_int32 gain_offset_q8; diff --git a/thirdparty/opus/mapping_matrix.c b/thirdparty/opus/mapping_matrix.c new file mode 100644 index 0000000000..31298af057 --- /dev/null +++ b/thirdparty/opus/mapping_matrix.c @@ -0,0 +1,378 @@ +/* Copyright (c) 2017 Google Inc. + Written by Andrew Allen */ +/* + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + - Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + - Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER + OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, + EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, + PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR + PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifdef HAVE_CONFIG_H +#include "config.h" +#endif + +#include "arch.h" +#include "float_cast.h" +#include "opus_private.h" +#include "opus_defines.h" +#include "mapping_matrix.h" + +#define MATRIX_INDEX(nb_rows, row, col) (nb_rows * col + row) + +opus_int32 mapping_matrix_get_size(int rows, int cols) +{ + opus_int32 size; + + /* Mapping Matrix must only support up to 255 channels in or out. + * Additionally, the total cell count must be <= 65004 octets in order + * for the matrix to be stored in an OGG header. + */ + if (rows > 255 || cols > 255) + return 0; + size = rows * (opus_int32)cols * sizeof(opus_int16); + if (size > 65004) + return 0; + + return align(sizeof(MappingMatrix)) + align(size); +} + +opus_int16 *mapping_matrix_get_data(const MappingMatrix *matrix) +{ + /* void* cast avoids clang -Wcast-align warning */ + return (opus_int16*)(void*)((char*)matrix + align(sizeof(MappingMatrix))); +} + +void mapping_matrix_init(MappingMatrix * const matrix, + int rows, int cols, int gain, const opus_int16 *data, opus_int32 data_size) +{ + int i; + opus_int16 *ptr; + +#if !defined(ENABLE_ASSERTIONS) + (void)data_size; +#endif + celt_assert(align(data_size) == align(rows * cols * sizeof(opus_int16))); + + matrix->rows = rows; + matrix->cols = cols; + matrix->gain = gain; + ptr = mapping_matrix_get_data(matrix); + for (i = 0; i < rows * cols; i++) + { + ptr[i] = data[i]; + } +} + +#ifndef DISABLE_FLOAT_API +void mapping_matrix_multiply_channel_in_float( + const MappingMatrix *matrix, + const float *input, + int input_rows, + opus_val16 *output, + int output_row, + int output_rows, + int frame_size) +{ + /* Matrix data is ordered col-wise. */ + opus_int16* matrix_data; + int i, col; + + celt_assert(input_rows <= matrix->cols && output_rows <= matrix->rows); + + matrix_data = mapping_matrix_get_data(matrix); + + for (i = 0; i < frame_size; i++) + { + float tmp = 0; + for (col = 0; col < input_rows; col++) + { + tmp += + matrix_data[MATRIX_INDEX(matrix->rows, output_row, col)] * + input[MATRIX_INDEX(input_rows, col, i)]; + } +#if defined(FIXED_POINT) + output[output_rows * i] = FLOAT2INT16((1/32768.f)*tmp); +#else + output[output_rows * i] = (1/32768.f)*tmp; +#endif + } +} + +void mapping_matrix_multiply_channel_out_float( + const MappingMatrix *matrix, + const opus_val16 *input, + int input_row, + int input_rows, + float *output, + int output_rows, + int frame_size +) +{ + /* Matrix data is ordered col-wise. */ + opus_int16* matrix_data; + int i, row; + float input_sample; + + celt_assert(input_rows <= matrix->cols && output_rows <= matrix->rows); + + matrix_data = mapping_matrix_get_data(matrix); + + for (i = 0; i < frame_size; i++) + { +#if defined(FIXED_POINT) + input_sample = (1/32768.f)*input[input_rows * i]; +#else + input_sample = input[input_rows * i]; +#endif + for (row = 0; row < output_rows; row++) + { + float tmp = + (1/32768.f)*matrix_data[MATRIX_INDEX(matrix->rows, row, input_row)] * + input_sample; + output[MATRIX_INDEX(output_rows, row, i)] += tmp; + } + } +} +#endif /* DISABLE_FLOAT_API */ + +void mapping_matrix_multiply_channel_in_short( + const MappingMatrix *matrix, + const opus_int16 *input, + int input_rows, + opus_val16 *output, + int output_row, + int output_rows, + int frame_size) +{ + /* Matrix data is ordered col-wise. */ + opus_int16* matrix_data; + int i, col; + + celt_assert(input_rows <= matrix->cols && output_rows <= matrix->rows); + + matrix_data = mapping_matrix_get_data(matrix); + + for (i = 0; i < frame_size; i++) + { + opus_val32 tmp = 0; + for (col = 0; col < input_rows; col++) + { +#if defined(FIXED_POINT) + tmp += + ((opus_int32)matrix_data[MATRIX_INDEX(matrix->rows, output_row, col)] * + (opus_int32)input[MATRIX_INDEX(input_rows, col, i)]) >> 8; +#else + tmp += + matrix_data[MATRIX_INDEX(matrix->rows, output_row, col)] * + input[MATRIX_INDEX(input_rows, col, i)]; +#endif + } +#if defined(FIXED_POINT) + output[output_rows * i] = (opus_int16)((tmp + 64) >> 7); +#else + output[output_rows * i] = (1/(32768.f*32768.f))*tmp; +#endif + } +} + +void mapping_matrix_multiply_channel_out_short( + const MappingMatrix *matrix, + const opus_val16 *input, + int input_row, + int input_rows, + opus_int16 *output, + int output_rows, + int frame_size) +{ + /* Matrix data is ordered col-wise. */ + opus_int16* matrix_data; + int i, row; + opus_int32 input_sample; + + celt_assert(input_rows <= matrix->cols && output_rows <= matrix->rows); + + matrix_data = mapping_matrix_get_data(matrix); + + for (i = 0; i < frame_size; i++) + { +#if defined(FIXED_POINT) + input_sample = (opus_int32)input[input_rows * i]; +#else + input_sample = (opus_int32)FLOAT2INT16(input[input_rows * i]); +#endif + for (row = 0; row < output_rows; row++) + { + opus_int32 tmp = + (opus_int32)matrix_data[MATRIX_INDEX(matrix->rows, row, input_row)] * + input_sample; + output[MATRIX_INDEX(output_rows, row, i)] += (tmp + 16384) >> 15; + } + } +} + +const MappingMatrix mapping_matrix_foa_mixing = { 6, 6, 0 }; +const opus_int16 mapping_matrix_foa_mixing_data[36] = { + 16384, 0, -16384, 23170, 0, 0, 16384, 23170, + 16384, 0, 0, 0, 16384, 0, -16384, -23170, + 0, 0, 16384, -23170, 16384, 0, 0, 0, + 0, 0, 0, 0, 32767, 0, 0, 0, + 0, 0, 0, 32767 +}; + +const MappingMatrix mapping_matrix_soa_mixing = { 11, 11, 0 }; +const opus_int16 mapping_matrix_soa_mixing_data[121] = { + 10923, 7723, 13377, -13377, 11585, 9459, 7723, -16384, + -6689, 0, 0, 10923, 7723, 13377, 13377, -11585, + 9459, 7723, 16384, -6689, 0, 0, 10923, -15447, + 13377, 0, 0, -18919, 7723, 0, 13377, 0, + 0, 10923, 7723, -13377, -13377, 11585, -9459, 7723, + 16384, -6689, 0, 0, 10923, -7723, 0, 13377, + -16384, 0, -15447, 0, 9459, 0, 0, 10923, + -7723, 0, -13377, 16384, 0, -15447, 0, 9459, + 0, 0, 10923, 15447, 0, 0, 0, 0, + -15447, 0, -18919, 0, 0, 10923, 7723, -13377, + 13377, -11585, -9459, 7723, -16384, -6689, 0, 0, + 10923, -15447, -13377, 0, 0, 18919, 7723, 0, + 13377, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 32767, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 32767 +}; + +const MappingMatrix mapping_matrix_toa_mixing = { 18, 18, 0 }; +const opus_int16 mapping_matrix_toa_mixing_data[324] = { + 8208, 0, -881, 14369, 0, 0, -8192, -4163, + 13218, 0, 0, 0, 11095, -8836, -6218, 14833, + 0, 0, 8208, -10161, 881, 10161, -13218, -2944, + -8192, 2944, 0, -10488, -6218, 6248, -11095, -6248, + 0, -10488, 0, 0, 8208, 10161, 881, -10161, + -13218, 2944, -8192, -2944, 0, 10488, -6218, -6248, + -11095, 6248, 0, 10488, 0, 0, 8176, 5566, + -11552, 5566, 9681, -11205, 8192, -11205, 0, 4920, + -15158, 9756, -3334, 9756, 0, -4920, 0, 0, + 8176, 7871, 11552, 0, 0, 15846, 8192, 0, + -9681, -6958, 0, 13797, 3334, 0, -15158, 0, + 0, 0, 8176, 0, 11552, 7871, 0, 0, + 8192, 15846, 9681, 0, 0, 0, 3334, 13797, + 15158, 6958, 0, 0, 8176, 5566, -11552, -5566, + -9681, -11205, 8192, 11205, 0, 4920, 15158, 9756, + -3334, -9756, 0, 4920, 0, 0, 8208, 14369, + -881, 0, 0, -4163, -8192, 0, -13218, -14833, + 0, -8836, 11095, 0, 6218, 0, 0, 0, + 8208, 10161, 881, 10161, 13218, 2944, -8192, 2944, + 0, 10488, 6218, -6248, -11095, -6248, 0, -10488, + 0, 0, 8208, -14369, -881, 0, 0, 4163, + -8192, 0, -13218, 14833, 0, 8836, 11095, 0, + 6218, 0, 0, 0, 8208, 0, -881, -14369, + 0, 0, -8192, 4163, 13218, 0, 0, 0, + 11095, 8836, -6218, -14833, 0, 0, 8176, -5566, + -11552, 5566, -9681, 11205, 8192, -11205, 0, -4920, + 15158, -9756, -3334, 9756, 0, -4920, 0, 0, + 8176, 0, 11552, -7871, 0, 0, 8192, -15846, + 9681, 0, 0, 0, 3334, -13797, 15158, -6958, + 0, 0, 8176, -7871, 11552, 0, 0, -15846, + 8192, 0, -9681, 6958, 0, -13797, 3334, 0, + -15158, 0, 0, 0, 8176, -5566, -11552, -5566, + 9681, 11205, 8192, 11205, 0, -4920, -15158, -9756, + -3334, -9756, 0, 4920, 0, 0, 8208, -10161, + 881, -10161, 13218, -2944, -8192, -2944, 0, -10488, + 6218, 6248, -11095, 6248, 0, 10488, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 32767, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 32767 +}; + +const MappingMatrix mapping_matrix_foa_demixing = { 6, 6, 0 }; +const opus_int16 mapping_matrix_foa_demixing_data[36] = { + 16384, 16384, 16384, 16384, 0, 0, 0, 23170, + 0, -23170, 0, 0, -16384, 16384, -16384, 16384, + 0, 0, 23170, 0, -23170, 0, 0, 0, + 0, 0, 0, 0, 32767, 0, 0, 0, + 0, 0, 0, 32767 +}; + +const MappingMatrix mapping_matrix_soa_demixing = { 11, 11, 3050 }; +const opus_int16 mapping_matrix_soa_demixing_data[121] = { + 2771, 2771, 2771, 2771, 2771, 2771, 2771, 2771, + 2771, 0, 0, 10033, 10033, -20066, 10033, 14189, + 14189, -28378, 10033, -20066, 0, 0, 3393, 3393, + 3393, -3393, 0, 0, 0, -3393, -3393, 0, + 0, -17378, 17378, 0, -17378, -24576, 24576, 0, + 17378, 0, 0, 0, -14189, 14189, 0, -14189, + -28378, 28378, 0, 14189, 0, 0, 0, 2399, + 2399, -4799, -2399, 0, 0, 0, -2399, 4799, + 0, 0, 1959, 1959, 1959, 1959, -3918, -3918, + -3918, 1959, 1959, 0, 0, -4156, 4156, 0, + 4156, 0, 0, 0, -4156, 0, 0, 0, + 8192, 8192, -16384, 8192, 16384, 16384, -32768, 8192, + -16384, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 8312, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 8312 +}; + +const MappingMatrix mapping_matrix_toa_demixing = { 18, 18, 0 }; +const opus_int16 mapping_matrix_toa_demixing_data[324] = { + 8192, 8192, 8192, 8192, 8192, 8192, 8192, 8192, + 8192, 8192, 8192, 8192, 8192, 8192, 8192, 8192, + 0, 0, 0, -9779, 9779, 6263, 8857, 0, + 6263, 13829, 9779, -13829, 0, -6263, 0, -8857, + -6263, -9779, 0, 0, -3413, 3413, 3413, -11359, + 11359, 11359, -11359, -3413, 3413, -3413, -3413, -11359, + 11359, 11359, -11359, 3413, 0, 0, 13829, 9779, + -9779, 6263, 0, 8857, -6263, 0, 9779, 0, + -13829, 6263, -8857, 0, -6263, -9779, 0, 0, + 0, -15617, -15617, 6406, 0, 0, -6406, 0, + 15617, 0, 0, -6406, 0, 0, 6406, 15617, + 0, 0, 0, -5003, 5003, -10664, 15081, 0, + -10664, -7075, 5003, 7075, 0, 10664, 0, -15081, + 10664, -5003, 0, 0, -8176, -8176, -8176, 8208, + 8208, 8208, 8208, -8176, -8176, -8176, -8176, 8208, + 8208, 8208, 8208, -8176, 0, 0, -7075, 5003, + -5003, -10664, 0, 15081, 10664, 0, 5003, 0, + 7075, -10664, -15081, 0, 10664, -5003, 0, 0, + 15617, 0, 0, 0, -6406, 6406, 0, -15617, + 0, -15617, 15617, 0, 6406, -6406, 0, 0, + 0, 0, 0, -11393, 11393, 2993, -4233, 0, + 2993, -16112, 11393, 16112, 0, -2993, 0, 4233, + -2993, -11393, 0, 0, 0, -9974, -9974, -13617, + 0, 0, 13617, 0, 9974, 0, 0, 13617, + 0, 0, -13617, 9974, 0, 0, 0, 5579, + -5579, 10185, 14403, 0, 10185, -7890, -5579, 7890, + 0, -10185, 0, -14403, -10185, 5579, 0, 0, + 11826, -11826, -11826, -901, 901, 901, -901, 11826, + -11826, 11826, 11826, -901, 901, 901, -901, -11826, + 0, 0, -7890, -5579, 5579, 10185, 0, 14403, + -10185, 0, -5579, 0, 7890, 10185, -14403, 0, + -10185, 5579, 0, 0, -9974, 0, 0, 0, + -13617, 13617, 0, 9974, 0, 9974, -9974, 0, + 13617, -13617, 0, 0, 0, 0, 16112, -11393, + 11393, -2993, 0, 4233, 2993, 0, -11393, 0, + -16112, -2993, -4233, 0, 2993, 11393, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 32767, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 32767 +}; + diff --git a/thirdparty/opus/mapping_matrix.h b/thirdparty/opus/mapping_matrix.h new file mode 100644 index 0000000000..98bc82df3e --- /dev/null +++ b/thirdparty/opus/mapping_matrix.h @@ -0,0 +1,133 @@ +/* Copyright (c) 2017 Google Inc. + Written by Andrew Allen */ +/* + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + - Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + - Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER + OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, + EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, + PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR + PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +/** + * @file mapping_matrix.h + * @brief Opus reference implementation mapping matrix API + */ + +#ifndef MAPPING_MATRIX_H +#define MAPPING_MATRIX_H + +#include "opus_types.h" +#include "opus_projection.h" + +#ifdef __cplusplus +extern "C" { +#endif + +typedef struct MappingMatrix +{ + int rows; /* number of channels outputted from matrix. */ + int cols; /* number of channels inputted to matrix. */ + int gain; /* in dB. S7.8-format. */ + /* Matrix cell data goes here using col-wise ordering. */ +} MappingMatrix; + +opus_int32 mapping_matrix_get_size(int rows, int cols); + +opus_int16 *mapping_matrix_get_data(const MappingMatrix *matrix); + +void mapping_matrix_init( + MappingMatrix * const matrix, + int rows, + int cols, + int gain, + const opus_int16 *data, + opus_int32 data_size +); + +#ifndef DISABLE_FLOAT_API +void mapping_matrix_multiply_channel_in_float( + const MappingMatrix *matrix, + const float *input, + int input_rows, + opus_val16 *output, + int output_row, + int output_rows, + int frame_size +); + +void mapping_matrix_multiply_channel_out_float( + const MappingMatrix *matrix, + const opus_val16 *input, + int input_row, + int input_rows, + float *output, + int output_rows, + int frame_size +); +#endif /* DISABLE_FLOAT_API */ + +void mapping_matrix_multiply_channel_in_short( + const MappingMatrix *matrix, + const opus_int16 *input, + int input_rows, + opus_val16 *output, + int output_row, + int output_rows, + int frame_size +); + +void mapping_matrix_multiply_channel_out_short( + const MappingMatrix *matrix, + const opus_val16 *input, + int input_row, + int input_rows, + opus_int16 *output, + int output_rows, + int frame_size +); + +/* Pre-computed mixing and demixing matrices for 1st to 3rd-order ambisonics. + * foa: first-order ambisonics + * soa: second-order ambisonics + * toa: third-order ambisonics + */ +extern const MappingMatrix mapping_matrix_foa_mixing; +extern const opus_int16 mapping_matrix_foa_mixing_data[36]; + +extern const MappingMatrix mapping_matrix_soa_mixing; +extern const opus_int16 mapping_matrix_soa_mixing_data[121]; + +extern const MappingMatrix mapping_matrix_toa_mixing; +extern const opus_int16 mapping_matrix_toa_mixing_data[324]; + +extern const MappingMatrix mapping_matrix_foa_demixing; +extern const opus_int16 mapping_matrix_foa_demixing_data[36]; + +extern const MappingMatrix mapping_matrix_soa_demixing; +extern const opus_int16 mapping_matrix_soa_demixing_data[121]; + +extern const MappingMatrix mapping_matrix_toa_demixing; +extern const opus_int16 mapping_matrix_toa_demixing_data[324]; + +#ifdef __cplusplus +} +#endif + +#endif /* MAPPING_MATRIX_H */ diff --git a/thirdparty/opus/mlp.c b/thirdparty/opus/mlp.c index ff9e50df47..964c6a98f6 100644 --- a/thirdparty/opus/mlp.c +++ b/thirdparty/opus/mlp.c @@ -1,5 +1,5 @@ /* Copyright (c) 2008-2011 Octasic Inc. - Written by Jean-Marc Valin */ + 2012-2017 Jean-Marc Valin */ /* Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions @@ -29,42 +29,13 @@ #include "config.h" #endif +#include <math.h> #include "opus_types.h" #include "opus_defines.h" - -#include <math.h> -#include "mlp.h" #include "arch.h" #include "tansig_table.h" -#define MAX_NEURONS 100 +#include "mlp.h" -#if 0 -static OPUS_INLINE opus_val16 tansig_approx(opus_val32 _x) /* Q19 */ -{ - int i; - opus_val16 xx; /* Q11 */ - /*double x, y;*/ - opus_val16 dy, yy; /* Q14 */ - /*x = 1.9073e-06*_x;*/ - if (_x>=QCONST32(8,19)) - return QCONST32(1.,14); - if (_x<=-QCONST32(8,19)) - return -QCONST32(1.,14); - xx = EXTRACT16(SHR32(_x, 8)); - /*i = lrint(25*x);*/ - i = SHR32(ADD32(1024,MULT16_16(25, xx)),11); - /*x -= .04*i;*/ - xx -= EXTRACT16(SHR32(MULT16_16(20972,i),8)); - /*x = xx*(1./2048);*/ - /*y = tansig_table[250+i];*/ - yy = tansig_table[250+i]; - /*y = yy*(1./16384);*/ - dy = 16384-MULT16_16_Q14(yy,yy); - yy = yy + MULT16_16_Q14(MULT16_16_Q11(xx,dy),(16384 - MULT16_16_Q11(yy,xx))); - return yy; -} -#else -/*extern const float tansig_table[501];*/ static OPUS_INLINE float tansig_approx(float x) { int i; @@ -92,54 +63,82 @@ static OPUS_INLINE float tansig_approx(float x) y = y + x*dy*(1 - y*x); return sign*y; } -#endif -#if 0 -void mlp_process(const MLP *m, const opus_val16 *in, opus_val16 *out) +static OPUS_INLINE float sigmoid_approx(float x) { - int j; - opus_val16 hidden[MAX_NEURONS]; - const opus_val16 *W = m->weights; - /* Copy to tmp_in */ - for (j=0;j<m->topo[1];j++) - { - int k; - opus_val32 sum = SHL32(EXTEND32(*W++),8); - for (k=0;k<m->topo[0];k++) - sum = MAC16_16(sum, in[k],*W++); - hidden[j] = tansig_approx(sum); - } - for (j=0;j<m->topo[2];j++) - { - int k; - opus_val32 sum = SHL32(EXTEND32(*W++),14); - for (k=0;k<m->topo[1];k++) - sum = MAC16_16(sum, hidden[k], *W++); - out[j] = tansig_approx(EXTRACT16(PSHR32(sum,17))); - } + return .5f + .5f*tansig_approx(.5f*x); +} + +static void gemm_accum(float *out, const opus_int8 *weights, int rows, int cols, int col_stride, const float *x) +{ + int i, j; + for (i=0;i<rows;i++) + { + for (j=0;j<cols;j++) + out[i] += weights[j*col_stride + i]*x[j]; + } } -#else -void mlp_process(const MLP *m, const float *in, float *out) + +void compute_dense(const DenseLayer *layer, float *output, const float *input) { - int j; - float hidden[MAX_NEURONS]; - const float *W = m->weights; - /* Copy to tmp_in */ - for (j=0;j<m->topo[1];j++) - { - int k; - float sum = *W++; - for (k=0;k<m->topo[0];k++) - sum = sum + in[k]**W++; - hidden[j] = tansig_approx(sum); - } - for (j=0;j<m->topo[2];j++) - { - int k; - float sum = *W++; - for (k=0;k<m->topo[1];k++) - sum = sum + hidden[k]**W++; - out[j] = tansig_approx(sum); - } + int i; + int N, M; + int stride; + M = layer->nb_inputs; + N = layer->nb_neurons; + stride = N; + for (i=0;i<N;i++) + output[i] = layer->bias[i]; + gemm_accum(output, layer->input_weights, N, M, stride, input); + for (i=0;i<N;i++) + output[i] *= WEIGHTS_SCALE; + if (layer->sigmoid) { + for (i=0;i<N;i++) + output[i] = sigmoid_approx(output[i]); + } else { + for (i=0;i<N;i++) + output[i] = tansig_approx(output[i]); + } } -#endif + +void compute_gru(const GRULayer *gru, float *state, const float *input) +{ + int i; + int N, M; + int stride; + float tmp[MAX_NEURONS]; + float z[MAX_NEURONS]; + float r[MAX_NEURONS]; + float h[MAX_NEURONS]; + M = gru->nb_inputs; + N = gru->nb_neurons; + stride = 3*N; + /* Compute update gate. */ + for (i=0;i<N;i++) + z[i] = gru->bias[i]; + gemm_accum(z, gru->input_weights, N, M, stride, input); + gemm_accum(z, gru->recurrent_weights, N, N, stride, state); + for (i=0;i<N;i++) + z[i] = sigmoid_approx(WEIGHTS_SCALE*z[i]); + + /* Compute reset gate. */ + for (i=0;i<N;i++) + r[i] = gru->bias[N + i]; + gemm_accum(r, &gru->input_weights[N], N, M, stride, input); + gemm_accum(r, &gru->recurrent_weights[N], N, N, stride, state); + for (i=0;i<N;i++) + r[i] = sigmoid_approx(WEIGHTS_SCALE*r[i]); + + /* Compute output. */ + for (i=0;i<N;i++) + h[i] = gru->bias[2*N + i]; + for (i=0;i<N;i++) + tmp[i] = state[i] * r[i]; + gemm_accum(h, &gru->input_weights[2*N], N, M, stride, input); + gemm_accum(h, &gru->recurrent_weights[2*N], N, N, stride, tmp); + for (i=0;i<N;i++) + h[i] = z[i]*state[i] + (1-z[i])*tansig_approx(WEIGHTS_SCALE*h[i]); + for (i=0;i<N;i++) + state[i] = h[i]; +} + diff --git a/thirdparty/opus/mlp.h b/thirdparty/opus/mlp.h index 618e246e2c..d7670550fd 100644 --- a/thirdparty/opus/mlp.h +++ b/thirdparty/opus/mlp.h @@ -1,5 +1,4 @@ -/* Copyright (c) 2008-2011 Octasic Inc. - Written by Jean-Marc Valin */ +/* Copyright (c) 2017 Jean-Marc Valin */ /* Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions @@ -28,16 +27,34 @@ #ifndef _MLP_H_ #define _MLP_H_ -#include "arch.h" +#include "opus_types.h" + +#define WEIGHTS_SCALE (1.f/128) + +#define MAX_NEURONS 32 typedef struct { - int layers; - const int *topo; - const float *weights; -} MLP; + const opus_int8 *bias; + const opus_int8 *input_weights; + int nb_inputs; + int nb_neurons; + int sigmoid; +} DenseLayer; + +typedef struct { + const opus_int8 *bias; + const opus_int8 *input_weights; + const opus_int8 *recurrent_weights; + int nb_inputs; + int nb_neurons; +} GRULayer; + +extern const DenseLayer layer0; +extern const GRULayer layer1; +extern const DenseLayer layer2; -extern const MLP net; +void compute_dense(const DenseLayer *layer, float *output, const float *input); -void mlp_process(const MLP *m, const float *in, float *out); +void compute_gru(const GRULayer *gru, float *state, const float *input); #endif /* _MLP_H_ */ diff --git a/thirdparty/opus/mlp_data.c b/thirdparty/opus/mlp_data.c index c2fda4e2e5..ae4178df76 100644 --- a/thirdparty/opus/mlp_data.c +++ b/thirdparty/opus/mlp_data.c @@ -1,5 +1,4 @@ -/* The contents of this file was automatically generated by mlp_train.c - It contains multi-layer perceptron (MLP) weights. */ +/*This file is automatically generated from a Keras model*/ #ifdef HAVE_CONFIG_H #include "config.h" @@ -7,103 +6,667 @@ #include "mlp.h" -/* RMS error was 0.138320, seed was 1361535663 */ +static const opus_int8 layer0_weights[800] = { + -30, -9, 2, -12, 5, -1, 8, 9, + 9, 8, -13, 18, -17, -34, -5, 17, + -11, 0, -4, 10, 2, 10, 15, -8, + 2, -1, 0, 5, 13, -3, -16, 1, + -5, 3, 7, -28, -13, 6, 36, -3, + 19, -60, -17, -28, 7, -11, -30, -7, + 2, -42, -21, -3, 6, -22, 33, -9, + 7, -30, 21, -14, 24, -11, -20, -18, + -5, -12, 12, -49, -50, -49, 16, 9, + -37, -1, 9, 34, -13, -31, -31, 12, + 16, 44, -42, 2, -9, 8, -18, -6, + 9, 36, 19, 11, 13, 12, -21, 3, + -28, -12, 3, 33, 25, -14, 11, 1, + -94, -39, 18, -12, -11, -15, -7, 49, + 52, 10, -43, 9, 57, 8, 21, -6, + 14, -15, 44, -8, 7, -30, -13, -2, + -9, 25, -2, -127, 18, -11, -52, 26, + -27, 27, 10, -10, 7, 43, 6, -24, + 41, 10, -18, -27, 10, 17, 9, 10, + -17, -10, 20, -6, 22, 55, 35, -80, + 36, 25, -24, -36, 15, 9, -19, 88, + 19, 64, -51, -35, 17, 0, -7, 41, + -16, 27, 4, 15, -1, 18, -16, 47, + -39, -54, -8, 13, -25, -20, 102, -18, + -5, 44, 11, -28, 71, 2, -51, -5, + 5, 2, -83, -9, -29, 8, 21, -53, + 58, -37, -7, 13, 38, 9, 34, -1, + -41, 21, 4, -24, -36, -33, -21, 32, + 75, -2, 1, -68, -1, 47, -29, 32, + 20, 12, -65, -87, 5, 16, -12, 24, + 40, 15, 7, 19, -26, -17, 17, 6, + -2, -37, -30, -9, 32, -127, -39, 0, + -31, -27, 4, -22, 23, -6, -77, 35, + -61, 32, -37, -24, 13, -11, -1, -40, + -3, 17, -7, 13, 11, 59, -19, 10, + 6, -18, 0, 13, 3, -6, -23, 19, + 11, -17, 13, -1, -80, 40, -53, 69, + -29, -54, 0, -4, 33, -25, -2, 38, + 35, 36, -15, 46, 2, -13, -16, -8, + -8, 12, -24, -9, -55, -5, -9, 32, + 11, 7, 12, -18, -10, -86, -38, 54, + 37, -25, 18, -43, 7, -27, -27, -54, + 13, 9, 22, 70, 6, 35, -7, 23, + -15, -44, -6, 7, -66, -85, 32, 40, + -19, -9, -7, 12, -15, 7, 2, 6, + -35, 11, 28, 0, 26, 14, 1, 1, + 4, 12, 18, 35, 22, -18, -3, 14, + -1, 7, 14, -8, -14, -3, 4, -3, + -19, -7, -1, -25, -27, 25, -26, -2, + 33, -22, -27, -25, 4, -9, 7, 21, + 26, -30, 10, -9, -20, 11, 27, 10, + 5, -18, 14, -4, 2, -17, -5, -7, + -9, -13, 15, 29, 1, -10, -16, -10, + 35, 36, -7, -22, -44, 17, 30, 22, + 21, -1, 22, -11, 32, -8, -7, 5, + -10, 5, 30, -20, 29, -20, -34, 12, + -4, -6, 6, -13, 10, -5, -68, -1, + 24, 9, 19, -24, -64, 31, 19, 27, + -26, 75, -45, 41, 39, -42, 8, 6, + 23, -30, 16, -25, 30, 34, 8, -38, + -3, 18, 16, -31, 22, -4, -9, 1, + 20, 9, 38, -32, 0, -45, 0, -6, + -13, 11, -25, -32, -22, 31, -24, -11, + -11, -4, -4, 20, -34, 22, 20, 9, + -25, 27, -5, 28, -29, 29, 6, 21, + -6, -18, 54, 4, -46, 23, 21, -14, + -31, 36, -41, -24, 4, 22, 10, 11, + 7, 36, -32, -13, -52, -17, 24, 28, + -37, -36, -1, 24, 9, -38, 35, 48, + 18, 2, -1, 45, 10, 39, 24, -38, + 13, 8, -16, 8, 25, 11, 7, -29, + -11, 7, 20, -30, -38, -45, 14, -18, + -28, -9, 65, 61, 22, -53, -38, -16, + 36, 46, 20, -39, 32, -61, -6, -6, + -36, -33, -18, -28, 56, 101, 45, 11, + -28, -23, -29, -61, 20, -47, 2, 48, + 27, -17, 1, 40, 1, 3, -51, 15, + 35, 28, 22, 35, 53, -61, -29, 12, + -6, -21, 10, 3, -20, 2, -25, 1, + -6, 31, 11, -3, 1, -10, -52, 6, + 126, -105, 122, 127, -128, 127, 127, -128, + 127, 108, 12, 127, 48, -128, -36, -128, + 127, 127, -128, -128, 127, 89, -128, 127, + -128, -128, -128, 127, 127, -128, -128, -93, + -82, 20, 125, 65, -82, 127, 38, -74, + 81, 88, -88, 79, 51, -47, -111, -26, + 14, 83, -88, -112, 24, 35, -101, 98, + -99, -48, -45, 46, 83, -60, -79, 45, + -20, -41, 9, 4, 52, 54, 93, -10, + 4, 13, 3, 123, 6, 94, -111, -69, + -14, -31, 10, 12, 53, -79, -11, -21, + -2, -44, -72, 92, 65, -57, 56, -38, + 127, -56, -128, 127, 127, -128, 86, 117, + -75, -128, 127, -19, -99, -112, 127, -128, + 127, -48, 114, 118, -128, -128, 117, -17, + -6, 121, -128, 127, -128, 82, 54, -106, + 127, 127, -33, 100, -39, -23, 18, -78, + -34, -29, -1, -30, 127, -26, 127, -128, + 126, -128, 27, -23, -79, -120, -127, 127, + 72, 66, 29, 7, -66, -56, -117, -128 +}; + +static const opus_int8 layer0_bias[32] = { + 51, -16, 1, 13, -5, -6, -16, -7, + 11, -6, 106, 26, 28, -14, 21, -29, + 7, 18, -18, -17, 21, -17, -9, 20, + -25, -3, -34, 48, 11, -13, -31, -20 +}; + +static const opus_int8 layer1_weights[2304] = { + 22, -1, -7, 7, 29, -27, -31, -17, + -13, 33, 44, -8, 11, 33, 24, 78, + 15, 19, 30, -2, -24, 5, 49, 5, + 36, 29, -14, -11, -48, -33, 21, -42, + -38, -12, 55, -37, 54, -8, 1, 36, + 17, 0, 51, 31, 59, 7, -12, 53, + 4, 32, -14, 48, 5, -10, -16, -8, + 1, -16, -56, -24, -6, 18, -2, 23, + 6, 46, -6, -10, 20, 35, -44, -15, + -49, 36, 16, 5, -7, -79, -67, 12, + 70, -3, -79, -54, -85, -24, 47, -22, + 33, 21, 69, -1, 11, 22, 14, -16, + -16, -22, -28, -11, 11, -41, 31, -26, + -33, -19, -4, 27, 32, -50, 5, -10, + -38, -22, -8, 35, -31, 1, -41, -15, + -11, 44, 28, -17, -41, -23, 17, 2, + -23, -26, -13, -13, -17, 6, 14, -31, + -25, 9, -19, 39, -8, 4, 31, -1, + -45, -11, -28, -92, -46, -15, 21, 118, + -22, 45, -51, 11, -20, -20, -15, 13, + -21, -97, -29, -32, -23, -42, 94, 1, + 23, -8, 63, -3, -46, 19, -26, 32, + -40, -74, -26, 26, -4, -13, 30, -20, + -30, -25, -14, -31, -45, -43, 4, -60, + -48, -12, -34, 2, 2, 3, 13, 15, + 11, 16, 5, 46, -9, -55, -16, -57, + 29, 14, 38, -50, -2, -44, -11, -8, + 52, -27, -38, -7, 20, 47, 17, -59, + 0, 47, 46, -63, 35, -17, 19, 33, + 68, -19, 2, 15, -16, 28, -16, -103, + 26, -35, 47, -39, -60, 30, 31, -23, + -52, -13, 116, 47, -25, 30, 40, 30, + -22, 2, 12, -27, -18, 31, -10, 27, + -8, -66, 12, 14, 4, -26, -28, -13, + 3, 13, -26, -51, 37, 5, 2, -21, + 47, 3, 13, 25, -41, -27, -8, -4, + 5, -76, -33, 28, 10, 9, -46, -74, + 19, 28, 25, 31, 54, -55, 68, 38, + -24, -32, 2, 4, 68, 11, -1, 99, + 5, 16, -2, -74, 40, 26, -26, 33, + 31, -1, -68, 14, -6, 25, 9, 29, + 60, 61, 7, -7, 0, -24, 7, 77, + 4, -1, 16, -7, 13, -15, -19, 28, + -31, -24, -16, 37, 24, 13, 30, 10, + -30, 11, 11, -10, 22, 60, 28, 45, + -3, -40, -62, -5, -102, 9, -32, -27, + -54, 21, 15, -5, 37, -43, -11, 37, + -19, 47, -64, -128, -27, -114, 21, -66, + 59, 46, -3, -12, -87, -9, 4, 19, + -113, -36, 78, 57, -26, -38, -77, -10, + 6, 6, -75, 25, -97, -11, 33, -46, + 1, 13, -21, -33, -20, 16, -6, -3, + -11, -4, -27, 38, 8, -41, -2, -33, + 18, 19, -26, 1, -29, -22, -4, -14, + -55, -11, -80, -3, 11, 34, 90, 51, + 11, 17, 43, 36, 127, -32, 29, 103, + 9, 27, 13, 64, 56, 70, -14, 3, + -12, 10, 37, 3, 12, -22, -10, 46, + 28, 10, 20, 26, -24, 18, 9, 7, + 14, 34, -5, -7, 31, -14, -56, 11, + -18, -8, -17, -7, -10, -40, 10, -33, + -32, -43, 5, 9, 11, -4, 10, 50, + -12, -5, 46, 9, 7, 1, 11, 15, + 91, -17, 7, -50, 23, 6, -30, -99, + 0, -17, 14, 8, -10, -25, -30, -69, + -62, 31, 127, 114, -23, 101, -5, -54, + -6, -22, 7, -56, 39, 18, -29, 0, + 46, 8, -79, 4, -21, 18, -32, 62, + -12, -8, -12, -58, 31, -32, 17, 6, + -24, 25, 24, 9, -4, -19, 45, 6, + 17, -14, 5, -27, 16, -4, -41, 25, + -36, 5, 15, 12, 50, 27, 25, 23, + -44, -69, -9, -19, -48, -8, 4, 12, + -6, 13, -19, -30, -36, 26, 37, -1, + -3, -30, -42, -14, -10, -20, 26, -54, + -27, -44, 4, 73, -26, 90, 32, -69, + -29, -16, 3, 103, 15, -17, 37, 24, + -23, -31, 33, -37, -64, 25, 13, -81, + -28, -32, 27, 5, -35, -23, 15, -22, + 19, -7, 9, 30, 19, -23, 27, -13, + 43, 29, -29, -6, 9, -40, -33, -33, + -32, 9, 11, -48, -8, -23, -52, 46, + 17, -22, -42, 35, -15, -41, 16, 34, + 31, -42, -19, -11, 55, 7, -39, 89, + -11, -33, 20, -14, 22, 32, 3, -17, + -6, 14, 34, 1, 55, -21, -90, -8, + 18, 27, 13, -29, 21, 15, -33, -51, + -9, -11, 4, -16, -18, 23, -4, -4, + 48, 1, 7, 29, -14, -12, -16, 17, + 35, 8, 0, -7, -2, 9, 8, 17, + -6, 53, -32, -21, -50, 5, 99, -60, + -5, -53, 10, -31, 12, -5, 7, 80, + 36, 18, -31, 9, 98, 36, -63, -35, + 4, -13, -28, -24, 28, -13, 18, 16, + -1, -18, -34, 10, 20, 7, 4, 29, + 11, 25, -7, 36, 14, 45, 24, 1, + -16, 30, 6, 35, -6, -11, -24, 13, + -1, 27, 39, 20, 48, -11, -4, -13, + 28, 11, -31, -18, 31, -29, 22, -2, + -20, -16, 5, 30, -12, -28, -3, 93, + -16, 23, 18, -29, 6, -54, -37, 28, + -3, -3, -47, -3, -36, -55, -3, 41, + -10, 47, -2, 23, 42, -7, -71, -27, + 83, -64, 7, -24, 8, 26, -17, 15, + 12, 31, -30, -38, -13, -33, -56, 4, + -17, 20, 18, 1, -30, -5, -6, -31, + -14, -37, 0, 22, 10, -30, 37, -17, + 18, 6, 5, 23, -36, -32, 14, 18, + -13, -61, -52, -69, 44, -30, 16, 18, + -4, -25, 14, 81, 26, -8, -23, -59, + 52, -104, 17, 119, -32, 26, 17, 1, + 23, 45, 29, -64, -57, -14, 73, 21, + -13, -13, 9, -68, -7, -52, 3, 24, + -39, 44, -15, 27, 14, 19, -9, -28, + -11, 5, 3, -34, -2, 2, 22, -6, + -23, 4, 3, 13, -22, -13, -10, -18, + 29, 6, 44, -13, -24, -8, 2, 30, + 14, 43, 6, 17, -73, -6, -7, 20, + -80, -7, -7, -28, 15, -69, -38, -5, + -100, -35, 15, -79, 23, 29, -18, -27, + 21, -66, -37, 8, -22, -39, 48, 4, + -13, 1, -9, 11, -29, 22, 6, -49, + 32, -14, 47, -18, -4, 44, -52, -74, + 43, 30, 23, -14, 5, 0, -27, 4, + -7, 10, -4, 10, 1, -16, 11, -18, + -2, -5, 2, -11, 0, -20, -4, 38, + 74, 59, 39, 64, -10, 26, -3, -40, + -68, 3, -30, -51, 8, -19, -27, -46, + 51, 52, 54, 36, 90, 92, 14, 13, + -5, 0, 16, -62, 16, 11, -47, -37, + -6, -5, 21, 54, -57, 32, 42, -6, + 62, -9, 16, 21, 24, 9, -10, -4, + 33, 50, 13, -15, 1, -35, -48, 18, + -11, -17, -67, -13, 21, 38, -44, 36, + -16, 29, 17, 5, -10, 18, 17, -32, + 2, 8, 22, -56, -15, -32, 40, 43, + 19, 46, -7, -100, -96, 19, 53, 24, + 21, -26, -48, -101, -82, 61, 38, -85, + -28, -34, -1, 63, -5, -5, 39, 39, + -38, 32, -12, -28, 20, 40, -8, 2, + 31, 12, -35, -13, 20, -25, 30, 8, + 3, -13, -9, -20, 2, -13, 24, 37, + -10, 33, 6, 20, -16, -24, -6, -6, + -19, -5, 22, 21, 10, 11, -4, -39, + -1, 6, 49, 41, -15, -57, 21, -62, + 77, -69, -13, 0, -74, 1, -7, -38, + -8, 6, 63, 28, 4, 26, -52, 82, + 63, 13, 45, -33, 44, -52, -65, -21, + -46, -49, 64, -17, 32, 24, 68, -39, + -16, -5, -26, 28, 5, -61, -28, 2, + 24, 11, -12, -33, 9, -37, -3, -28, + 22, -37, -12, 19, 0, -18, -2, 14, + 1, 4, 8, -9, -2, 43, -17, -2, + -66, -31, 56, -40, -87, -36, -2, -4, + -42, -45, -1, 31, -43, -15, 27, 63, + -11, 32, -10, -33, 27, -19, 4, 15, + -26, -34, 29, -4, -39, -65, 14, -20, + -21, -17, -36, 13, 59, 47, -38, -33, + 13, -37, -8, -37, -7, -6, -76, -31, + -12, -46, 7, 24, -21, -30, -14, 9, + 15, -12, -13, 47, -27, -25, -1, -39, + 0, 20, -9, 6, 7, 4, 3, 7, + 39, 50, 22, -7, 14, -20, 1, 70, + -28, 29, -41, 10, -16, -5, -28, -2, + -37, 32, -18, 17, 62, -11, -20, -50, + 36, 21, -62, -12, -56, 52, 50, 17, + 3, 48, 44, -41, -25, 3, 16, -3, + 0, 33, -6, 15, 27, 34, -25, 22, + 9, 17, -11, 36, 16, -2, 12, 21, + -52, 45, -2, -10, 46, 21, -18, 67, + -28, -13, 30, 37, 42, 16, -9, 11, + 75, 7, -64, -40, -10, 29, 57, -23, + 5, 53, -77, 3, -17, -5, 47, -55, + -35, -36, -13, 52, -53, -71, 52, -111, + -23, -26, -28, 29, -43, 55, -19, 43, + -19, 54, -12, -33, -44, -39, -19, -10, + -31, -10, 21, 38, -57, -20, 2, -25, + 8, -6, 50, 12, 15, 25, -25, 15, + -30, -6, 9, 25, 37, 19, -4, 31, + -22, 2, 4, 2, 36, 7, 3, -34, + -80, 36, -10, -2, -5, 31, -36, 49, + -70, 20, -36, 21, 24, 25, -46, -51, + 36, -58, -48, -40, -10, 55, 71, 47, + 10, -1, 1, 2, -46, -68, 16, 13, + 0, -74, -29, 73, -52, -18, -11, 7, + -44, -82, -32, -70, -28, -1, -39, -68, + -6, -41, 12, -22, -16, 40, -11, -25, + 51, -9, 21, 4, 4, -34, 7, -78, + 16, 6, -38, -30, -2, -44, 32, 0, + 22, 64, 5, -72, -2, -14, -10, -16, + -8, -25, 12, 102, -58, 37, -10, -23, + 15, 49, 7, -7, 2, -20, -32, 45, + -6, 48, 28, 30, 33, -1, 22, -6, + 30, 65, -17, 29, 74, 37, -26, -10, + 15, -24, 19, -66, 22, -10, -31, -1, + -18, -9, 11, 37, -4, 45, 5, 41, + 17, 1, 1, 24, -58, 41, 5, -51, + 14, 8, 43, 16, -10, -1, 45, 32, + -64, 3, -33, -25, -3, -27, -68, 12, + 23, -11, -13, -37, -40, 4, -21, -12, + 32, -23, -19, 76, 41, -23, -24, -44, + -65, -1, -15, 1, 71, 63, 5, 20, + -3, 21, -23, 31, -32, 18, -2, 27, + 31, 46, -5, -39, -5, -35, 18, -18, + -40, -10, 3, 12, 2, -2, -22, 40, + 5, -6, 60, 36, 3, 29, -27, 10, + 25, -54, 5, 26, 39, 35, -24, -37, + 30, -91, 28, -4, -21, -27, -39, -6, + 5, 12, -128, 38, -16, 29, -95, -29, + 82, -2, 35, 2, 12, 8, -22, 10, + 80, -47, 2, -25, -73, -79, 16, -30, + -32, -66, 48, 21, -45, -11, -47, 14, + -27, -17, -7, 15, -44, -14, -44, -26, + -32, 26, -23, 17, -7, -28, 26, -6, + 28, 6, -26, 2, 13, -14, -23, -14, + 19, 46, 16, 2, -33, -21, 28, -17, + -42, 44, -37, 1, -39, 28, 84, -46, + 15, 10, 13, -44, 72, -26, 26, 32, + -28, -12, -83, 2, 10, -30, -44, -10, + -28, 53, 45, 65, 0, -25, 57, 36, + -33, 6, 29, 44, -53, 11, 19, -2, + -27, 35, 32, 49, 4, 23, 38, 36, + 24, 10, 51, -39, 4, -7, 26, 37, + -35, 11, -47, -18, 28, 16, -35, 42, + 17, -21, -41, 28, 14, -12, 11, -45, + 7, -43, -15, 18, -5, 38, -40, -50, + -30, -21, 9, -98, 13, 12, 23, 75, + -56, -7, -3, -4, -1, -34, 12, -49, + 11, 26, -18, -28, -17, 33, 13, -14, + 40, 24, -72, -37, 10, 17, -6, 22, + 16, 16, -6, -12, -30, -14, 10, 40, + -23, 12, 15, -3, -15, 13, -56, -4, + -30, 1, -3, -17, 27, 50, -5, 64, + -36, -19, 7, 29, 22, 25, 9, -16, + -58, -69, -40, -61, -71, -14, 42, 93, + 26, 11, -6, -58, -11, 70, -52, 19, + 9, -30, -33, 11, -37, -47, -21, -22, + -40, 10, 47, 4, -23, 17, 48, 41, + -48, 14, 10, 15, 34, -23, -2, -47, + 23, -32, -13, -10, -26, -26, -4, 16, + 38, -14, 0, -12, -7, -7, 20, 44, + -1, -32, -27, -16, 4, -6, -18, 14, + 5, 4, -29, 28, 7, -7, 15, -11, + -20, -45, -36, 16, 84, 34, -59, -30, + 22, 126, 8, 68, 79, -17, 21, -68, + 37, 5, 15, 63, 49, 127, -90, 85, + 43, 7, 16, 9, 6, -45, -57, -43, + 57, 11, -23, -11, -29, 60, -26, 0, + 7, 42, -24, 10, 23, -25, 8, -7, + -40, 19, -17, 35, 4, 27, -39, -91, + 27, -36, 34, 2, 16, -24, 25, 7, + -21, 5, 17, 10, -22, -30, 9, -17, + -61, -26, 33, 21, 58, -51, -14, 69, + -38, 20, 7, 80, -4, -65, -6, -27, + 53, -12, 47, -1, -15, 1, 60, 102, + -79, -4, 12, 9, 22, 37, -8, -4, + 37, 2, -3, -15, -16, -11, -5, 19, + -6, -43, 20, -25, -18, 10, -27, 0, + -28, -27, -11, 10, -18, -2, -4, -16, + 26, 14, -6, 7, -6, 1, 53, -2, + -29, 23, 9, -30, -6, -4, -6, 56, + 70, 0, -33, -20, -17, -9, -24, 46, + -5, -105, 47, -46, -51, 20, 20, -53, + -81, -1, -7, 75, -5, -21, -65, 12, + -52, 22, -50, -12, 49, 54, 76, -81, + 10, 45, -41, -59, 18, -19, 25, 14, + -31, -53, -5, 12, 31, 84, -23, 2, + 7, 2, 10, -32, 39, -2, -12, 1, + -9, 0, -10, -11, 9, 15, -8, -2, + 2, -1, 10, 14, -5, -40, 19, -7, + -7, 26, -4, 2, 1, -27, 35, 32, + 21, -31, 26, 43, -9, 4, -32, 40, + -62, -52, 36, 22, 38, 22, 36, -96, + 6, -10, -23, -49, 15, -33, -18, -3, + 0, 41, 21, -19, 21, 23, -39, -23, + -6, 6, 47, 56, 4, 74, 0, -98, + 29, -47, -14, -36, 21, -22, 22, 16, + 13, 12, 16, -5, 13, 17, -13, -15, + 1, -34, -26, 26, 12, 32, 27, 13, + -67, 27, 2, 8, 10, 18, 16, 20, + -17, -17, 57, -64, 5, 14, 19, 31, + -18, -44, -46, -16, 4, -25, 17, -126, + -24, 39, 4, 8, 55, -25, -34, 39, + -16, 3, 9, 71, 72, -31, -55, 6, + 10, -25, 32, -85, -21, 18, -8, 15, + 12, -27, -7, 1, -21, -2, -5, 48, + -16, 18, 1, -22, -26, 16, 14, -31, + 27, -6, -15, -21, 4, -14, 18, -36 +}; + +static const opus_int8 layer1_recur_weights[1728] = { + 20, 67, -99, 12, 41, -25, 49, -44, + 35, 81, 110, 47, 34, -66, -14, 14, + -60, 34, 29, -73, 10, 41, 35, 89, + 7, -35, 22, 7, 27, -20, -6, 56, + 26, 66, 6, 33, -55, 53, 1, -21, + 14, 17, 68, 55, 59, 0, 18, -9, + 5, -41, 6, -5, -114, -12, 29, 42, + -23, 10, 81, -27, 20, -53, -30, -62, + 40, 95, 25, -4, 3, 18, -8, -15, + -29, -82, 2, -57, -3, -61, -29, -29, + 49, 2, -55, 5, -69, -99, -49, -51, + 6, -25, 12, 89, 44, -33, 5, 41, + 1, 23, -37, -37, -28, -48, 3, 4, + -41, -30, -57, -35, -39, -1, -13, -56, + -5, 50, 49, 41, -4, -4, 33, -22, + -1, 33, 34, 18, 40, -42, 12, 1, + -6, -2, 18, 17, 39, 44, 11, 65, + -60, -45, 10, 91, 21, 9, -62, -11, + 8, 69, 37, 24, -30, 21, 26, -27, + 1, -28, 24, 66, -8, 6, -71, 34, + 24, 44, 58, -78, -19, 57, 17, -60, + 1, 12, -3, -1, -40, 22, 11, -5, + 25, 12, 1, 72, 79, 7, -50, 23, + 18, 13, 21, -11, -20, 5, 77, -94, + 24, 15, 57, -51, 3, 36, 53, -1, + 4, 14, 30, -31, 22, 40, 32, -11, + -34, -36, -59, 58, 25, 21, -54, -23, + 40, 46, 18, 0, 12, 54, -96, -99, + -59, 5, 119, -38, 50, 55, 12, -16, + 67, 0, 34, 35, 39, 35, -1, 69, + 24, 27, -30, -35, -4, -70, 2, -44, + -7, -6, 19, -9, 60, 44, -21, -10, + 37, 43, -16, -3, 30, -15, -65, 31, + -55, 18, -98, 76, 64, 25, 24, -18, + -7, -68, -10, 38, 27, -60, 36, 33, + 16, 30, 34, -39, -37, 31, 12, 53, + -54, 14, -26, -49, -128, -13, -5, -22, + -11, -85, 55, -8, -51, -11, -33, -10, + -31, -76, -41, 23, 44, -40, -54, -127, + -101, 19, -23, -15, 15, 27, 58, -60, + 8, 14, -33, 1, 48, -9, -11, -123, + 3, 53, 23, 4, -28, 22, 2, -29, + -67, 36, 12, 7, 55, -21, 88, 20, + -1, -21, -17, 3, 41, 32, -10, -14, + -5, -57, 67, 57, 21, 23, -2, -27, + -73, -24, 120, 21, 18, -35, 42, -7, + 3, -45, -25, 76, -34, 50, 11, -54, + -91, 3, -113, -20, -5, 47, 15, -47, + 17, 27, -3, -26, -7, 10, 7, 74, + -40, 64, -7, -5, -24, -49, -24, -3, + -10, 27, -17, -8, -3, 14, -27, 33, + 13, 39, 28, -7, -38, 29, 16, 44, + 19, 55, -3, 9, -13, -57, 43, 43, + 31, 0, -93, -17, 19, -56, 4, -12, + -25, 37, -85, -13, -118, 33, -17, 56, + 71, -80, -4, 6, -11, -18, 47, -52, + 25, 9, 48, -107, 1, 21, 20, -3, + 10, -16, -4, 24, 17, 31, -61, -18, + -50, 24, -10, 12, 71, 26, 11, -3, + 4, 1, 0, -7, -40, 18, 38, -34, + 38, 17, 8, -34, 2, 21, 123, -32, + -26, 43, 14, -34, -1, -9, 37, -16, + 6, -17, -62, 68, 22, 17, 11, -75, + 33, -80, 62, -9, -75, 76, 36, -41, + -8, -40, -11, -71, 40, -39, 62, -49, + -81, 16, -9, -52, 52, 61, 17, -103, + -27, -10, -8, -54, -57, 21, 23, -16, + -52, 36, 18, 10, -5, 8, 15, -29, + 5, -19, -37, 8, -53, 6, 19, -37, + 38, -17, 48, 10, 0, 81, 46, 70, + -29, 101, 11, 44, -44, -3, 24, 11, + 3, 14, -9, 11, 14, -45, 13, 46, + -3, -57, 68, 44, 63, 98, 25, -28, + -23, 15, 32, -10, 53, -6, -2, -9, + -6, 16, -107, -11, -11, -28, 59, 57, + -22, 38, 42, 83, 27, 5, 29, -30, + 12, -21, -13, 31, 38, -21, 58, -10, + -10, -15, -2, -5, 11, 12, -73, -28, + -38, 22, 2, -25, 73, -52, -12, -55, + 32, -63, 21, 51, 33, 52, -26, 55, + -26, -26, 57, -32, -4, -52, -61, 21, + -33, -91, -51, 69, -90, -53, -38, -44, + 12, -76, -20, 77, -45, -7, 86, 43, + -109, -33, -105, -40, -121, -10, 0, -72, + 45, -51, -75, -49, -38, -1, -62, 18, + -1, 30, -44, -14, -10, -67, 40, -10, + -34, 46, -64, -32, 29, -13, 33, 3, + -32, -5, 28, -27, -25, 93, 24, 68, + -40, 57, 23, -3, -21, -58, 17, -39, + -17, -22, -89, 11, 18, -46, 27, 24, + 46, 127, 61, 87, 31, 127, -36, 47, + -23, 47, 127, -24, 110, 122, 30, 100, + 0, 96, -12, 6, 50, 44, -13, 73, + 4, 55, -11, -15, 49, 42, -6, 20, + -35, 58, 18, 38, 42, 72, 19, -21, + 11, 9, -37, 7, 29, 31, 16, -17, + 13, -50, 19, 5, -23, 51, -16, -5, + 4, -24, 76, 10, -53, -28, -7, -65, + 74, 40, -16, -29, 32, -16, -49, -35, + -3, 59, -96, -50, -43, -43, -61, -15, + -8, -36, -34, -33, -14, 11, -3, -39, + 4, -114, -123, -11, -49, -21, 14, -56, + 1, 43, -63, 26, 40, 18, -10, -26, + -14, -15, -35, -35, -11, 32, -44, -67, + 2, 22, 7, 3, -9, -30, -51, -28, + 28, 6, -22, 16, 34, -25, -52, -54, + -8, -6, 5, 8, 20, -16, -17, -44, + 27, 3, 31, -5, -48, -1, -3, 116, + 11, 71, -31, -47, 109, 50, -22, -12, + -57, 32, 66, 8, -25, -93, -54, -10, + 19, -76, -34, 97, 48, -36, -18, -30, + -39, -26, -12, 28, 14, 12, -12, -31, + 38, 2, 10, 4, -40, 20, 16, -61, + 2, 64, 39, 5, 15, 33, 40, -61, + -49, 93, -10, 33, 28, -11, -27, -18, + 39, -62, -6, -6, 62, 11, -8, 38, + -67, 12, 27, 39, -27, 123, -18, -6, + -65, 83, -64, 20, 19, -11, 33, 24, + 17, 56, 78, 7, -15, 54, -101, -9, + 115, -96, 50, 51, 35, 34, 27, 37, + -40, -11, 8, -36, 42, -45, 2, -23, + 0, 67, -8, -9, -13, 50, -14, -27, + 4, 0, -8, -14, 30, -9, 29, 15, + 9, -38, 37, -8, 50, -46, 54, 41, + -11, -8, -11, -26, 39, 45, 14, -26, + -17, -27, 69, 38, 39, 98, 66, 0, + 42, 123, -101, -19, -83, 117, -32, 56, + 10, 12, -88, 79, -53, 56, 63, 95, + -62, 9, 36, -13, -79, -16, 37, -46, + 35, -34, 14, 17, -54, 5, 21, -7, + 7, 63, 56, 15, 27, -76, -25, 4, + -26, -63, 28, -67, -52, 43, -47, -70, + 40, -12, 40, -66, -37, 0, 35, 37, + -53, 4, -17, -51, 11, 21, 14, -34, + -4, 24, -42, 29, 22, 7, 28, 12, + 37, 39, -39, -19, 65, -60, -50, -2, + 1, 82, 39, 19, -23, -43, -22, -67, + -35, -34, 32, 102, 81, 127, 36, 67, + -45, 1, -67, -52, -4, 35, 20, 28, + 71, 86, -35, -9, -83, -34, 12, 9, + -23, 2, 14, 28, -23, 7, -25, 45, + 7, 17, -37, 0, -19, 31, 26, 40, + -27, -16, 17, 5, -21, 23, 24, 96, + -55, 52, -19, -14, -6, 1, 50, -34, + 86, -53, 38, 2, -52, -36, -13, 60, + -85, -120, 32, 7, -12, 22, 70, -7, + -94, 38, -76, -31, -20, 15, -28, 7, + 6, 40, 53, 88, 3, 38, 18, -8, + -22, -23, 51, 37, -9, 13, -32, 25, + -21, 27, 31, 20, 18, -9, -13, 1, + 21, -24, -13, 39, 15, -11, -29, -36, + 18, 15, 8, 27, 21, -94, -1, -22, + 49, 66, -1, 6, -3, -40, -18, 6, + 28, 12, 33, -59, 62, 60, -48, 90, + -1, 108, 9, 18, -2, 27, 77, -65, + 82, -48, -38, -19, -11, 127, 50, 66, + 18, -13, -22, 60, -38, 40, -14, -26, + -13, 38, 67, 57, 30, 33, 26, 36, + 38, -17, 27, -28, 20, 12, -64, 18, + 5, -33, -27, 13, -26, 32, 35, -5, + -48, -14, 92, 43, -47, -14, 40, 11, + 51, 66, 22, -63, -16, -61, 4, -28, + 27, 20, -33, -30, -21, -29, -53, 31, + -40, 24, 43, -4, -19, 21, 67, 20, + 100, -16, -93, 78, -6, -18, -52, -37, + -9, 66, -31, -8, 26, 18, 4, 24, + -22, 17, -2, -13, 27, 0, 8, -18, + -25, 5, -21, -24, -7, 18, -93, 21, + 7, 2, -75, 69, 50, -5, -15, -17, + 60, -42, 55, 1, -4, 3, 10, 46, + 16, -13, 45, -7, -10, -44, -108, 49, + 2, -15, -64, -12, -72, 32, -38, -45, + 10, -54, 13, -13, -27, -36, -64, 58, + -62, -101, 88, -86, -71, -39, -9, -128, + 32, 15, -4, 54, -16, -39, -26, -36, + 46, 48, -64, -10, 19, 30, -13, 34, + -8, 50, 60, -22, -6, -11, -30, 5, + 50, 32, 56, 0, 25, 6, 68, 11, + -29, 45, -9, -12, 4, 1, 18, -49, + 0, -38, -19, 90, 29, 35, 51, 8, + -48, 96, -1, -12, -9, -32, -63, -65, + -7, 38, 89, 28, -85, -28, -23, -25, + -128, 56, 79, -36, 99, -6, -37, 7, + -13, -69, -46, -29, 25, 64, -21, 17, + 1, 42, -66, 1, 80, 26, -32, 21, + 15, 15, 6, 6, -10, 15, 127, 5, + 38, 27, 87, -57, -25, 11, 72, -21, + -5, 11, -13, -66, 78, 36, -3, 41, + -21, 8, -33, 23, 73, 28, 57, -25, + -5, 4, -22, -47, 15, 4, -57, -72, + 33, 1, 18, 2, 53, -71, -99, -21, + -3, -111, 108, 71, -14, 82, 25, 61, + -48, 5, 9, -51, -20, -25, -3, 14, + -33, 14, -3, -34, 22, 12, -19, -38, + -16, 2, 21, 16, 26, -31, 75, 44, + -31, 16, 26, 66, 17, -9, -22, -22, + 22, -44, 22, 27, 2, 58, -14, 10, + -73, -42, 55, -25, -61, 72, -1, 30, + -58, -25, 63, 26, -48, -40, 26, -30, + 60, 8, -17, -1, -18, -20, 43, -20, + -4, -28, 127, -106, 29, 70, 64, -27, + 39, -33, -5, -88, -40, -52, 26, 44, + -17, 23, 2, -49, 22, -9, -8, 86, + 49, -43, -60, 1, 10, 45, 36, -53, + -4, 33, 38, 48, -72, 1, 19, 21, + -65, 4, -5, -62, 27, -25, 17, -6, + 6, -45, -39, -46, 4, 26, 127, -9, + 18, -33, -18, -3, 33, 2, -5, 15, + -26, -22, -117, -63, -17, -59, 61, -74, + 7, -47, -58, -128, -67, 15, -16, -128, + 12, 2, 20, 9, -48, -40, 43, 3, + -40, -16, -38, -6, -22, -28, -16, -59, + -22, 6, -5, 11, -12, -66, -40, 27, + -62, -44, -19, 38, -3, 39, -8, 40, + -24, 13, 21, 50, -60, -22, 53, -29, + -6, 1, 22, -59, 0, 17, -39, 115 +}; -static const float weights[422] = { +static const opus_int8 layer1_bias[72] = { + -42, 20, 16, 0, 105, 60, 1, -97, + 24, 60, 18, 13, 62, 25, 127, 34, + 79, 55, 118, 127, 95, 31, -4, 87, + 21, 12, 2, -14, 18, 23, 8, 17, + -1, -8, 5, 4, 24, 37, 21, 13, + 36, 13, 17, 18, 37, 30, 33, 1, + 8, -16, -11, -5, -31, -3, -5, 0, + 6, 3, 58, -7, -1, -16, 5, -13, + 16, 10, -2, -14, 11, -4, 3, -11 +}; -/* hidden layer */ --0.0941125f, -0.302976f, -0.603555f, -0.19393f, -0.185983f, --0.601617f, -0.0465317f, -0.114563f, -0.103599f, -0.618938f, --0.317859f, -0.169949f, -0.0702885f, 0.148065f, 0.409524f, -0.548432f, 0.367649f, -0.494393f, 0.764306f, -1.83957f, -0.170849f, 12.786f, -1.08848f, -1.27284f, -16.2606f, -24.1773f, -5.57454f, -0.17276f, -0.163388f, -0.224421f, --0.0948944f, -0.0728695f, -0.26557f, -0.100283f, -0.0515459f, --0.146142f, -0.120674f, -0.180655f, 0.12857f, 0.442138f, --0.493735f, 0.167767f, 0.206699f, -0.197567f, 0.417999f, -1.50364f, -0.773341f, -10.0401f, 0.401872f, 2.97966f, -15.2165f, -1.88905f, -1.19254f, 0.0285397f, -0.00405139f, -0.0707565f, 0.00825699f, -0.0927269f, -0.010393f, -0.00428882f, --0.00489743f, -0.0709731f, -0.00255992f, 0.0395619f, 0.226424f, -0.0325231f, 0.162175f, -0.100118f, 0.485789f, 0.12697f, -0.285937f, 0.0155637f, 0.10546f, 3.05558f, 1.15059f, --1.00904f, -1.83088f, 3.31766f, -3.42516f, -0.119135f, --0.0405654f, 0.00690068f, 0.0179877f, -0.0382487f, 0.00597941f, --0.0183611f, 0.00190395f, -0.144322f, -0.0435671f, 0.000990594f, -0.221087f, 0.142405f, 0.484066f, 0.404395f, 0.511955f, --0.237255f, 0.241742f, 0.35045f, -0.699428f, 10.3993f, -2.6507f, -2.43459f, -4.18838f, 1.05928f, 1.71067f, -0.00667811f, -0.0721335f, -0.0397346f, 0.0362704f, -0.11496f, --0.0235776f, 0.0082161f, -0.0141741f, -0.0329699f, -0.0354253f, -0.00277404f, -0.290654f, -1.14767f, -0.319157f, -0.686544f, -0.36897f, 0.478899f, 0.182579f, -0.411069f, 0.881104f, --4.60683f, 1.4697f, 0.335845f, -1.81905f, -30.1699f, -5.55225f, 0.0019508f, -0.123576f, -0.0727332f, -0.0641597f, --0.0534458f, -0.108166f, -0.0937368f, -0.0697883f, -0.0275475f, --0.192309f, -0.110074f, 0.285375f, -0.405597f, 0.0926724f, --0.287881f, -0.851193f, -0.099493f, -0.233764f, -1.2852f, -1.13611f, 3.12168f, -0.0699f, -1.86216f, 2.65292f, --7.31036f, 2.44776f, -0.00111802f, -0.0632786f, -0.0376296f, --0.149851f, 0.142963f, 0.184368f, 0.123433f, 0.0756158f, -0.117312f, 0.0933395f, 0.0692163f, 0.0842592f, 0.0704683f, -0.0589963f, 0.0942205f, -0.448862f, 0.0262677f, 0.270352f, --0.262317f, 0.172586f, 2.00227f, -0.159216f, 0.038422f, -10.2073f, 4.15536f, -2.3407f, -0.0550265f, 0.00964792f, --0.141336f, 0.0274501f, 0.0343921f, -0.0487428f, 0.0950172f, --0.00775017f, -0.0372492f, -0.00548121f, -0.0663695f, 0.0960506f, --0.200008f, -0.0412827f, 0.58728f, 0.0515787f, 0.337254f, -0.855024f, 0.668371f, -0.114904f, -3.62962f, -0.467477f, --0.215472f, 2.61537f, 0.406117f, -1.36373f, 0.0425394f, -0.12208f, 0.0934502f, 0.123055f, 0.0340935f, -0.142466f, -0.035037f, -0.0490666f, 0.0733208f, 0.0576672f, 0.123984f, --0.0517194f, -0.253018f, 0.590565f, 0.145849f, 0.315185f, -0.221534f, -0.149081f, 0.216161f, -0.349575f, 24.5664f, --0.994196f, 0.614289f, -18.7905f, -2.83277f, -0.716801f, --0.347201f, 0.479515f, -0.246027f, 0.0758683f, 0.137293f, --0.17781f, 0.118751f, -0.00108329f, -0.237334f, 0.355732f, --0.12991f, -0.0547627f, -0.318576f, -0.325524f, 0.180494f, --0.0625604f, 0.141219f, 0.344064f, 0.37658f, -0.591772f, -5.8427f, -0.38075f, 0.221894f, -1.41934f, -1.87943e+06f, -1.34114f, 0.0283355f, -0.0447856f, -0.0211466f, -0.0256927f, -0.0139618f, 0.0207934f, -0.0107666f, 0.0110969f, 0.0586069f, --0.0253545f, -0.0328433f, 0.11872f, -0.216943f, 0.145748f, -0.119808f, -0.0915211f, -0.120647f, -0.0787719f, -0.143644f, --0.595116f, -1.152f, -1.25335f, -1.17092f, 4.34023f, --975268.f, -1.37033f, -0.0401123f, 0.210602f, -0.136656f, -0.135962f, -0.0523293f, 0.0444604f, 0.0143928f, 0.00412666f, --0.0193003f, 0.218452f, -0.110204f, -2.02563f, 0.918238f, --2.45362f, 1.19542f, -0.061362f, -1.92243f, 0.308111f, -0.49764f, 0.912356f, 0.209272f, -2.34525f, 2.19326f, --6.47121f, 1.69771f, -0.725123f, 0.0118929f, 0.0377944f, -0.0554003f, 0.0226452f, -0.0704421f, -0.0300309f, 0.0122978f, --0.0041782f, -0.0686612f, 0.0313115f, 0.039111f, 0.364111f, --0.0945548f, 0.0229876f, -0.17414f, 0.329795f, 0.114714f, -0.30022f, 0.106997f, 0.132355f, 5.79932f, 0.908058f, --0.905324f, -3.3561f, 0.190647f, 0.184211f, -0.673648f, -0.231807f, -0.0586222f, 0.230752f, -0.438277f, 0.245857f, --0.17215f, 0.0876383f, -0.720512f, 0.162515f, 0.0170571f, -0.101781f, 0.388477f, 1.32931f, 1.08548f, -0.936301f, --2.36958f, -6.71988f, -3.44376f, 2.13818f, 14.2318f, -4.91459f, -3.09052f, -9.69191f, -0.768234f, 1.79604f, -0.0549653f, 0.163399f, 0.0797025f, 0.0343933f, -0.0555876f, --0.00505673f, 0.0187258f, 0.0326628f, 0.0231486f, 0.15573f, -0.0476223f, -0.254824f, 1.60155f, -0.801221f, 2.55496f, -0.737629f, -1.36249f, -0.695463f, -2.44301f, -1.73188f, -3.95279f, 1.89068f, 0.486087f, -11.3343f, 3.9416e+06f, +static const opus_int8 layer2_weights[48] = { + -113, -88, 31, -128, -126, -61, 85, -35, + 118, -128, -61, 127, -128, -17, -128, 127, + 104, -9, -128, 33, 45, 127, 5, 83, + 84, -128, -85, -128, -45, 48, -53, -128, + 46, 127, -17, 125, 117, -41, -117, -91, + -127, -68, -1, -89, -80, 32, 106, 7 +}; -/* output layer */ --0.381439f, 0.12115f, -0.906927f, 2.93878f, 1.6388f, -0.882811f, 0.874344f, 1.21726f, -0.874545f, 0.321706f, -0.785055f, 0.946558f, -0.575066f, -3.46553f, 0.884905f, -0.0924047f, -9.90712f, 0.391338f, 0.160103f, -2.04954f, -4.1455f, 0.0684029f, -0.144761f, -0.285282f, 0.379244f, --1.1584f, -0.0277241f, -9.85f, -4.82386f, 3.71333f, -3.87308f, 3.52558f}; +static const opus_int8 layer2_bias[2] = { + 14, 117 +}; -static const int topo[3] = {25, 15, 2}; +const DenseLayer layer0 = { + layer0_bias, + layer0_weights, + 25, 32, 0 +}; -const MLP net = { - 3, - topo, - weights +const GRULayer layer1 = { + layer1_bias, + layer1_weights, + layer1_recur_weights, + 32, 24 }; + +const DenseLayer layer2 = { + layer2_bias, + layer2_weights, + 24, 2, 1 +}; + diff --git a/thirdparty/opus/opus.c b/thirdparty/opus/opus.c index f76f125cfa..538b5ea74e 100644 --- a/thirdparty/opus/opus.c +++ b/thirdparty/opus/opus.c @@ -107,7 +107,7 @@ OPUS_EXPORT void opus_pcm_soft_clip(float *_x, int N, int C, float *declip_mem) /* Slightly boost "a" by 2^-22. This is just enough to ensure -ffast-math does not cause output values larger than +/-1, but small enough not to matter even for 24-bit output. */ - a += a*2.4e-7; + a += a*2.4e-7f; if (x[i*C]>0) a = -a; /* Apply soft clipping */ @@ -252,7 +252,7 @@ int opus_packet_parse_impl(const unsigned char *data, opus_int32 len, /* Number of frames encoded in bits 0 to 5 */ ch = *data++; count = ch&0x3F; - if (count <= 0 || framesize*count > 5760) + if (count <= 0 || framesize*(opus_int32)count > 5760) return OPUS_INVALID_PACKET; len--; /* Padding flag is bit 6 */ diff --git a/thirdparty/opus/opus/opus.h b/thirdparty/opus/opus/opus.h index 5be73ddf4e..d282f21d25 100644 --- a/thirdparty/opus/opus/opus.h +++ b/thirdparty/opus/opus/opus.h @@ -531,7 +531,7 @@ OPUS_EXPORT int opus_packet_parse( const unsigned char *frames[48], opus_int16 size[48], int *payload_offset -) OPUS_ARG_NONNULL(1) OPUS_ARG_NONNULL(4); +) OPUS_ARG_NONNULL(1) OPUS_ARG_NONNULL(5); /** Gets the bandwidth of an Opus packet. * @param [in] data <tt>char*</tt>: Opus packet diff --git a/thirdparty/opus/opus/opus_defines.h b/thirdparty/opus/opus/opus_defines.h index 315412dd1d..d141418b21 100644 --- a/thirdparty/opus/opus/opus_defines.h +++ b/thirdparty/opus/opus/opus_defines.h @@ -165,8 +165,13 @@ extern "C" { #define OPUS_GET_EXPERT_FRAME_DURATION_REQUEST 4041 #define OPUS_SET_PREDICTION_DISABLED_REQUEST 4042 #define OPUS_GET_PREDICTION_DISABLED_REQUEST 4043 - /* Don't use 4045, it's already taken by OPUS_GET_GAIN_REQUEST */ +#define OPUS_SET_PHASE_INVERSION_DISABLED_REQUEST 4046 +#define OPUS_GET_PHASE_INVERSION_DISABLED_REQUEST 4047 +#define OPUS_GET_IN_DTX_REQUEST 4049 + +/** Defines for the presence of extended APIs. */ +#define OPUS_HAVE_OPUS_PROJECTION_H /* Macros to trigger compilation errors when the wrong types are provided to a CTL */ #define __opus_check_int(x) (((void)((x) == (opus_int32)0)), (opus_int32)(x)) @@ -208,6 +213,9 @@ extern "C" { #define OPUS_FRAMESIZE_20_MS 5004 /**< Use 20 ms frames */ #define OPUS_FRAMESIZE_40_MS 5005 /**< Use 40 ms frames */ #define OPUS_FRAMESIZE_60_MS 5006 /**< Use 60 ms frames */ +#define OPUS_FRAMESIZE_80_MS 5007 /**< Use 80 ms frames */ +#define OPUS_FRAMESIZE_100_MS 5008 /**< Use 100 ms frames */ +#define OPUS_FRAMESIZE_120_MS 5009 /**< Use 120 ms frames */ /**@}*/ @@ -566,7 +574,9 @@ extern "C" { * <dt>OPUS_FRAMESIZE_20_MS</dt><dd>Use 20 ms frames.</dd> * <dt>OPUS_FRAMESIZE_40_MS</dt><dd>Use 40 ms frames.</dd> * <dt>OPUS_FRAMESIZE_60_MS</dt><dd>Use 60 ms frames.</dd> - * <dt>OPUS_FRAMESIZE_VARIABLE</dt><dd>Optimize the frame size dynamically.</dd> + * <dt>OPUS_FRAMESIZE_80_MS</dt><dd>Use 80 ms frames.</dd> + * <dt>OPUS_FRAMESIZE_100_MS</dt><dd>Use 100 ms frames.</dd> + * <dt>OPUS_FRAMESIZE_120_MS</dt><dd>Use 120 ms frames.</dd> * </dl> * @hideinitializer */ #define OPUS_SET_EXPERT_FRAME_DURATION(x) OPUS_SET_EXPERT_FRAME_DURATION_REQUEST, __opus_check_int(x) @@ -581,7 +591,9 @@ extern "C" { * <dt>OPUS_FRAMESIZE_20_MS</dt><dd>Use 20 ms frames.</dd> * <dt>OPUS_FRAMESIZE_40_MS</dt><dd>Use 40 ms frames.</dd> * <dt>OPUS_FRAMESIZE_60_MS</dt><dd>Use 60 ms frames.</dd> - * <dt>OPUS_FRAMESIZE_VARIABLE</dt><dd>Optimize the frame size dynamically.</dd> + * <dt>OPUS_FRAMESIZE_80_MS</dt><dd>Use 80 ms frames.</dd> + * <dt>OPUS_FRAMESIZE_100_MS</dt><dd>Use 100 ms frames.</dd> + * <dt>OPUS_FRAMESIZE_120_MS</dt><dd>Use 120 ms frames.</dd> * </dl> * @hideinitializer */ #define OPUS_GET_EXPERT_FRAME_DURATION(x) OPUS_GET_EXPERT_FRAME_DURATION_REQUEST, __opus_check_int_ptr(x) @@ -681,6 +693,40 @@ extern "C" { */ #define OPUS_GET_SAMPLE_RATE(x) OPUS_GET_SAMPLE_RATE_REQUEST, __opus_check_int_ptr(x) +/** If set to 1, disables the use of phase inversion for intensity stereo, + * improving the quality of mono downmixes, but slightly reducing normal + * stereo quality. Disabling phase inversion in the decoder does not comply + * with RFC 6716, although it does not cause any interoperability issue and + * is expected to become part of the Opus standard once RFC 6716 is updated + * by draft-ietf-codec-opus-update. + * @see OPUS_GET_PHASE_INVERSION_DISABLED + * @param[in] x <tt>opus_int32</tt>: Allowed values: + * <dl> + * <dt>0</dt><dd>Enable phase inversion (default).</dd> + * <dt>1</dt><dd>Disable phase inversion.</dd> + * </dl> + * @hideinitializer */ +#define OPUS_SET_PHASE_INVERSION_DISABLED(x) OPUS_SET_PHASE_INVERSION_DISABLED_REQUEST, __opus_check_int(x) +/** Gets the encoder's configured phase inversion status. + * @see OPUS_SET_PHASE_INVERSION_DISABLED + * @param[out] x <tt>opus_int32 *</tt>: Returns one of the following values: + * <dl> + * <dt>0</dt><dd>Stereo phase inversion enabled (default).</dd> + * <dt>1</dt><dd>Stereo phase inversion disabled.</dd> + * </dl> + * @hideinitializer */ +#define OPUS_GET_PHASE_INVERSION_DISABLED(x) OPUS_GET_PHASE_INVERSION_DISABLED_REQUEST, __opus_check_int_ptr(x) +/** Gets the DTX state of the encoder. + * Returns whether the last encoded frame was either a comfort noise update + * during DTX or not encoded because of DTX. + * @param[out] x <tt>opus_int32 *</tt>: Returns one of the following values: + * <dl> + * <dt>0</dt><dd>The encoder is not in DTX.</dd> + * <dt>1</dt><dd>The encoder is in DTX.</dd> + * </dl> + * @hideinitializer */ +#define OPUS_GET_IN_DTX(x) OPUS_GET_IN_DTX_REQUEST, __opus_check_int_ptr(x) + /**@}*/ /** @defgroup opus_decoderctls Decoder related CTLs diff --git a/thirdparty/opus/opus/opus_multistream.h b/thirdparty/opus/opus/opus_multistream.h index 3622e009fb..babcee6905 100644 --- a/thirdparty/opus/opus/opus_multistream.h +++ b/thirdparty/opus/opus/opus_multistream.h @@ -273,7 +273,7 @@ OPUS_EXPORT OPUS_WARN_UNUSED_RESULT OpusMSEncoder *opus_multistream_surround_enc unsigned char *mapping, int application, int *error -) OPUS_ARG_NONNULL(5); +) OPUS_ARG_NONNULL(4) OPUS_ARG_NONNULL(5) OPUS_ARG_NONNULL(6); /** Initialize a previously allocated multistream encoder state. * The memory pointed to by \a st must be at least the size returned by @@ -342,7 +342,7 @@ OPUS_EXPORT int opus_multistream_surround_encoder_init( int *coupled_streams, unsigned char *mapping, int application -) OPUS_ARG_NONNULL(1) OPUS_ARG_NONNULL(6); +) OPUS_ARG_NONNULL(1) OPUS_ARG_NONNULL(5) OPUS_ARG_NONNULL(6) OPUS_ARG_NONNULL(7); /** Encodes a multistream Opus frame. * @param st <tt>OpusMSEncoder*</tt>: Multistream encoder state. diff --git a/thirdparty/opus/opus/opus_projection.h b/thirdparty/opus/opus/opus_projection.h new file mode 100644 index 0000000000..9dabf4e85c --- /dev/null +++ b/thirdparty/opus/opus/opus_projection.h @@ -0,0 +1,568 @@ +/* Copyright (c) 2017 Google Inc. + Written by Andrew Allen */ +/* + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + - Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + - Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER + OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, + EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, + PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR + PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +/** + * @file opus_projection.h + * @brief Opus projection reference API + */ + +#ifndef OPUS_PROJECTION_H +#define OPUS_PROJECTION_H + +#include "opus_multistream.h" + +#ifdef __cplusplus +extern "C" { +#endif + +/** @cond OPUS_INTERNAL_DOC */ + +/** These are the actual encoder and decoder CTL ID numbers. + * They should not be used directly by applications.c + * In general, SETs should be even and GETs should be odd.*/ +/**@{*/ +#define OPUS_PROJECTION_GET_DEMIXING_MATRIX_GAIN_REQUEST 6001 +#define OPUS_PROJECTION_GET_DEMIXING_MATRIX_SIZE_REQUEST 6003 +#define OPUS_PROJECTION_GET_DEMIXING_MATRIX_REQUEST 6005 +/**@}*/ + + +/** @endcond */ + +/** @defgroup opus_projection_ctls Projection specific encoder and decoder CTLs + * + * These are convenience macros that are specific to the + * opus_projection_encoder_ctl() and opus_projection_decoder_ctl() + * interface. + * The CTLs from @ref opus_genericctls, @ref opus_encoderctls, + * @ref opus_decoderctls, and @ref opus_multistream_ctls may be applied to a + * projection encoder or decoder as well. + */ +/**@{*/ + +/** Gets the gain (in dB. S7.8-format) of the demixing matrix from the encoder. + * @param[out] x <tt>opus_int32 *</tt>: Returns the gain (in dB. S7.8-format) + * of the demixing matrix. + * @hideinitializer + */ +#define OPUS_PROJECTION_GET_DEMIXING_MATRIX_GAIN(x) OPUS_PROJECTION_GET_DEMIXING_MATRIX_GAIN_REQUEST, __opus_check_int_ptr(x) + + +/** Gets the size in bytes of the demixing matrix from the encoder. + * @param[out] x <tt>opus_int32 *</tt>: Returns the size in bytes of the + * demixing matrix. + * @hideinitializer + */ +#define OPUS_PROJECTION_GET_DEMIXING_MATRIX_SIZE(x) OPUS_PROJECTION_GET_DEMIXING_MATRIX_SIZE_REQUEST, __opus_check_int_ptr(x) + + +/** Copies the demixing matrix to the supplied pointer location. + * @param[out] x <tt>unsigned char *</tt>: Returns the demixing matrix to the + * supplied pointer location. + * @param y <tt>opus_int32</tt>: The size in bytes of the reserved memory at the + * pointer location. + * @hideinitializer + */ +#define OPUS_PROJECTION_GET_DEMIXING_MATRIX(x,y) OPUS_PROJECTION_GET_DEMIXING_MATRIX_REQUEST, x, __opus_check_int(y) + + +/**@}*/ + +/** Opus projection encoder state. + * This contains the complete state of a projection Opus encoder. + * It is position independent and can be freely copied. + * @see opus_projection_ambisonics_encoder_create + */ +typedef struct OpusProjectionEncoder OpusProjectionEncoder; + + +/** Opus projection decoder state. + * This contains the complete state of a projection Opus decoder. + * It is position independent and can be freely copied. + * @see opus_projection_decoder_create + * @see opus_projection_decoder_init + */ +typedef struct OpusProjectionDecoder OpusProjectionDecoder; + + +/**\name Projection encoder functions */ +/**@{*/ + +/** Gets the size of an OpusProjectionEncoder structure. + * @param channels <tt>int</tt>: The total number of input channels to encode. + * This must be no more than 255. + * @param mapping_family <tt>int</tt>: The mapping family to use for selecting + * the appropriate projection. + * @returns The size in bytes on success, or a negative error code + * (see @ref opus_errorcodes) on error. + */ +OPUS_EXPORT OPUS_WARN_UNUSED_RESULT opus_int32 opus_projection_ambisonics_encoder_get_size( + int channels, + int mapping_family +); + + +/** Allocates and initializes a projection encoder state. + * Call opus_projection_encoder_destroy() to release + * this object when finished. + * @param Fs <tt>opus_int32</tt>: Sampling rate of the input signal (in Hz). + * This must be one of 8000, 12000, 16000, + * 24000, or 48000. + * @param channels <tt>int</tt>: Number of channels in the input signal. + * This must be at most 255. + * It may be greater than the number of + * coded channels (<code>streams + + * coupled_streams</code>). + * @param mapping_family <tt>int</tt>: The mapping family to use for selecting + * the appropriate projection. + * @param[out] streams <tt>int *</tt>: The total number of streams that will + * be encoded from the input. + * @param[out] coupled_streams <tt>int *</tt>: Number of coupled (2 channel) + * streams that will be encoded from the input. + * @param application <tt>int</tt>: The target encoder application. + * This must be one of the following: + * <dl> + * <dt>#OPUS_APPLICATION_VOIP</dt> + * <dd>Process signal for improved speech intelligibility.</dd> + * <dt>#OPUS_APPLICATION_AUDIO</dt> + * <dd>Favor faithfulness to the original input.</dd> + * <dt>#OPUS_APPLICATION_RESTRICTED_LOWDELAY</dt> + * <dd>Configure the minimum possible coding delay by disabling certain modes + * of operation.</dd> + * </dl> + * @param[out] error <tt>int *</tt>: Returns #OPUS_OK on success, or an error + * code (see @ref opus_errorcodes) on + * failure. + */ +OPUS_EXPORT OPUS_WARN_UNUSED_RESULT OpusProjectionEncoder *opus_projection_ambisonics_encoder_create( + opus_int32 Fs, + int channels, + int mapping_family, + int *streams, + int *coupled_streams, + int application, + int *error +) OPUS_ARG_NONNULL(4) OPUS_ARG_NONNULL(5); + + +/** Initialize a previously allocated projection encoder state. + * The memory pointed to by \a st must be at least the size returned by + * opus_projection_ambisonics_encoder_get_size(). + * This is intended for applications which use their own allocator instead of + * malloc. + * To reset a previously initialized state, use the #OPUS_RESET_STATE CTL. + * @see opus_projection_ambisonics_encoder_create + * @see opus_projection_ambisonics_encoder_get_size + * @param st <tt>OpusProjectionEncoder*</tt>: Projection encoder state to initialize. + * @param Fs <tt>opus_int32</tt>: Sampling rate of the input signal (in Hz). + * This must be one of 8000, 12000, 16000, + * 24000, or 48000. + * @param channels <tt>int</tt>: Number of channels in the input signal. + * This must be at most 255. + * It may be greater than the number of + * coded channels (<code>streams + + * coupled_streams</code>). + * @param streams <tt>int</tt>: The total number of streams to encode from the + * input. + * This must be no more than the number of channels. + * @param coupled_streams <tt>int</tt>: Number of coupled (2 channel) streams + * to encode. + * This must be no larger than the total + * number of streams. + * Additionally, The total number of + * encoded channels (<code>streams + + * coupled_streams</code>) must be no + * more than the number of input channels. + * @param application <tt>int</tt>: The target encoder application. + * This must be one of the following: + * <dl> + * <dt>#OPUS_APPLICATION_VOIP</dt> + * <dd>Process signal for improved speech intelligibility.</dd> + * <dt>#OPUS_APPLICATION_AUDIO</dt> + * <dd>Favor faithfulness to the original input.</dd> + * <dt>#OPUS_APPLICATION_RESTRICTED_LOWDELAY</dt> + * <dd>Configure the minimum possible coding delay by disabling certain modes + * of operation.</dd> + * </dl> + * @returns #OPUS_OK on success, or an error code (see @ref opus_errorcodes) + * on failure. + */ +OPUS_EXPORT int opus_projection_ambisonics_encoder_init( + OpusProjectionEncoder *st, + opus_int32 Fs, + int channels, + int mapping_family, + int *streams, + int *coupled_streams, + int application +) OPUS_ARG_NONNULL(1) OPUS_ARG_NONNULL(5) OPUS_ARG_NONNULL(6); + + +/** Encodes a projection Opus frame. + * @param st <tt>OpusProjectionEncoder*</tt>: Projection encoder state. + * @param[in] pcm <tt>const opus_int16*</tt>: The input signal as interleaved + * samples. + * This must contain + * <code>frame_size*channels</code> + * samples. + * @param frame_size <tt>int</tt>: Number of samples per channel in the input + * signal. + * This must be an Opus frame size for the + * encoder's sampling rate. + * For example, at 48 kHz the permitted values + * are 120, 240, 480, 960, 1920, and 2880. + * Passing in a duration of less than 10 ms + * (480 samples at 48 kHz) will prevent the + * encoder from using the LPC or hybrid modes. + * @param[out] data <tt>unsigned char*</tt>: Output payload. + * This must contain storage for at + * least \a max_data_bytes. + * @param [in] max_data_bytes <tt>opus_int32</tt>: Size of the allocated + * memory for the output + * payload. This may be + * used to impose an upper limit on + * the instant bitrate, but should + * not be used as the only bitrate + * control. Use #OPUS_SET_BITRATE to + * control the bitrate. + * @returns The length of the encoded packet (in bytes) on success or a + * negative error code (see @ref opus_errorcodes) on failure. + */ +OPUS_EXPORT OPUS_WARN_UNUSED_RESULT int opus_projection_encode( + OpusProjectionEncoder *st, + const opus_int16 *pcm, + int frame_size, + unsigned char *data, + opus_int32 max_data_bytes +) OPUS_ARG_NONNULL(1) OPUS_ARG_NONNULL(2) OPUS_ARG_NONNULL(4); + + +/** Encodes a projection Opus frame from floating point input. + * @param st <tt>OpusProjectionEncoder*</tt>: Projection encoder state. + * @param[in] pcm <tt>const float*</tt>: The input signal as interleaved + * samples with a normal range of + * +/-1.0. + * Samples with a range beyond +/-1.0 + * are supported but will be clipped by + * decoders using the integer API and + * should only be used if it is known + * that the far end supports extended + * dynamic range. + * This must contain + * <code>frame_size*channels</code> + * samples. + * @param frame_size <tt>int</tt>: Number of samples per channel in the input + * signal. + * This must be an Opus frame size for the + * encoder's sampling rate. + * For example, at 48 kHz the permitted values + * are 120, 240, 480, 960, 1920, and 2880. + * Passing in a duration of less than 10 ms + * (480 samples at 48 kHz) will prevent the + * encoder from using the LPC or hybrid modes. + * @param[out] data <tt>unsigned char*</tt>: Output payload. + * This must contain storage for at + * least \a max_data_bytes. + * @param [in] max_data_bytes <tt>opus_int32</tt>: Size of the allocated + * memory for the output + * payload. This may be + * used to impose an upper limit on + * the instant bitrate, but should + * not be used as the only bitrate + * control. Use #OPUS_SET_BITRATE to + * control the bitrate. + * @returns The length of the encoded packet (in bytes) on success or a + * negative error code (see @ref opus_errorcodes) on failure. + */ +OPUS_EXPORT OPUS_WARN_UNUSED_RESULT int opus_projection_encode_float( + OpusProjectionEncoder *st, + const float *pcm, + int frame_size, + unsigned char *data, + opus_int32 max_data_bytes +) OPUS_ARG_NONNULL(1) OPUS_ARG_NONNULL(2) OPUS_ARG_NONNULL(4); + + +/** Frees an <code>OpusProjectionEncoder</code> allocated by + * opus_projection_ambisonics_encoder_create(). + * @param st <tt>OpusProjectionEncoder*</tt>: Projection encoder state to be freed. + */ +OPUS_EXPORT void opus_projection_encoder_destroy(OpusProjectionEncoder *st); + + +/** Perform a CTL function on a projection Opus encoder. + * + * Generally the request and subsequent arguments are generated by a + * convenience macro. + * @param st <tt>OpusProjectionEncoder*</tt>: Projection encoder state. + * @param request This and all remaining parameters should be replaced by one + * of the convenience macros in @ref opus_genericctls, + * @ref opus_encoderctls, @ref opus_multistream_ctls, or + * @ref opus_projection_ctls + * @see opus_genericctls + * @see opus_encoderctls + * @see opus_multistream_ctls + * @see opus_projection_ctls + */ +OPUS_EXPORT int opus_projection_encoder_ctl(OpusProjectionEncoder *st, int request, ...) OPUS_ARG_NONNULL(1); + + +/**@}*/ + +/**\name Projection decoder functions */ +/**@{*/ + +/** Gets the size of an <code>OpusProjectionDecoder</code> structure. + * @param channels <tt>int</tt>: The total number of output channels. + * This must be no more than 255. + * @param streams <tt>int</tt>: The total number of streams coded in the + * input. + * This must be no more than 255. + * @param coupled_streams <tt>int</tt>: Number streams to decode as coupled + * (2 channel) streams. + * This must be no larger than the total + * number of streams. + * Additionally, The total number of + * coded channels (<code>streams + + * coupled_streams</code>) must be no + * more than 255. + * @returns The size in bytes on success, or a negative error code + * (see @ref opus_errorcodes) on error. + */ +OPUS_EXPORT OPUS_WARN_UNUSED_RESULT opus_int32 opus_projection_decoder_get_size( + int channels, + int streams, + int coupled_streams +); + + +/** Allocates and initializes a projection decoder state. + * Call opus_projection_decoder_destroy() to release + * this object when finished. + * @param Fs <tt>opus_int32</tt>: Sampling rate to decode at (in Hz). + * This must be one of 8000, 12000, 16000, + * 24000, or 48000. + * @param channels <tt>int</tt>: Number of channels to output. + * This must be at most 255. + * It may be different from the number of coded + * channels (<code>streams + + * coupled_streams</code>). + * @param streams <tt>int</tt>: The total number of streams coded in the + * input. + * This must be no more than 255. + * @param coupled_streams <tt>int</tt>: Number of streams to decode as coupled + * (2 channel) streams. + * This must be no larger than the total + * number of streams. + * Additionally, The total number of + * coded channels (<code>streams + + * coupled_streams</code>) must be no + * more than 255. + * @param[in] demixing_matrix <tt>const unsigned char[demixing_matrix_size]</tt>: Demixing matrix + * that mapping from coded channels to output channels, + * as described in @ref opus_projection and + * @ref opus_projection_ctls. + * @param demixing_matrix_size <tt>opus_int32</tt>: The size in bytes of the + * demixing matrix, as + * described in @ref + * opus_projection_ctls. + * @param[out] error <tt>int *</tt>: Returns #OPUS_OK on success, or an error + * code (see @ref opus_errorcodes) on + * failure. + */ +OPUS_EXPORT OPUS_WARN_UNUSED_RESULT OpusProjectionDecoder *opus_projection_decoder_create( + opus_int32 Fs, + int channels, + int streams, + int coupled_streams, + unsigned char *demixing_matrix, + opus_int32 demixing_matrix_size, + int *error +) OPUS_ARG_NONNULL(5); + + +/** Intialize a previously allocated projection decoder state object. + * The memory pointed to by \a st must be at least the size returned by + * opus_projection_decoder_get_size(). + * This is intended for applications which use their own allocator instead of + * malloc. + * To reset a previously initialized state, use the #OPUS_RESET_STATE CTL. + * @see opus_projection_decoder_create + * @see opus_projection_deocder_get_size + * @param st <tt>OpusProjectionDecoder*</tt>: Projection encoder state to initialize. + * @param Fs <tt>opus_int32</tt>: Sampling rate to decode at (in Hz). + * This must be one of 8000, 12000, 16000, + * 24000, or 48000. + * @param channels <tt>int</tt>: Number of channels to output. + * This must be at most 255. + * It may be different from the number of coded + * channels (<code>streams + + * coupled_streams</code>). + * @param streams <tt>int</tt>: The total number of streams coded in the + * input. + * This must be no more than 255. + * @param coupled_streams <tt>int</tt>: Number of streams to decode as coupled + * (2 channel) streams. + * This must be no larger than the total + * number of streams. + * Additionally, The total number of + * coded channels (<code>streams + + * coupled_streams</code>) must be no + * more than 255. + * @param[in] demixing_matrix <tt>const unsigned char[demixing_matrix_size]</tt>: Demixing matrix + * that mapping from coded channels to output channels, + * as described in @ref opus_projection and + * @ref opus_projection_ctls. + * @param demixing_matrix_size <tt>opus_int32</tt>: The size in bytes of the + * demixing matrix, as + * described in @ref + * opus_projection_ctls. + * @returns #OPUS_OK on success, or an error code (see @ref opus_errorcodes) + * on failure. + */ +OPUS_EXPORT int opus_projection_decoder_init( + OpusProjectionDecoder *st, + opus_int32 Fs, + int channels, + int streams, + int coupled_streams, + unsigned char *demixing_matrix, + opus_int32 demixing_matrix_size +) OPUS_ARG_NONNULL(1) OPUS_ARG_NONNULL(6); + + +/** Decode a projection Opus packet. + * @param st <tt>OpusProjectionDecoder*</tt>: Projection decoder state. + * @param[in] data <tt>const unsigned char*</tt>: Input payload. + * Use a <code>NULL</code> + * pointer to indicate packet + * loss. + * @param len <tt>opus_int32</tt>: Number of bytes in payload. + * @param[out] pcm <tt>opus_int16*</tt>: Output signal, with interleaved + * samples. + * This must contain room for + * <code>frame_size*channels</code> + * samples. + * @param frame_size <tt>int</tt>: The number of samples per channel of + * available space in \a pcm. + * If this is less than the maximum packet duration + * (120 ms; 5760 for 48kHz), this function will not be capable + * of decoding some packets. In the case of PLC (data==NULL) + * or FEC (decode_fec=1), then frame_size needs to be exactly + * the duration of audio that is missing, otherwise the + * decoder will not be in the optimal state to decode the + * next incoming packet. For the PLC and FEC cases, frame_size + * <b>must</b> be a multiple of 2.5 ms. + * @param decode_fec <tt>int</tt>: Flag (0 or 1) to request that any in-band + * forward error correction data be decoded. + * If no such data is available, the frame is + * decoded as if it were lost. + * @returns Number of samples decoded on success or a negative error code + * (see @ref opus_errorcodes) on failure. + */ +OPUS_EXPORT OPUS_WARN_UNUSED_RESULT int opus_projection_decode( + OpusProjectionDecoder *st, + const unsigned char *data, + opus_int32 len, + opus_int16 *pcm, + int frame_size, + int decode_fec +) OPUS_ARG_NONNULL(1) OPUS_ARG_NONNULL(4); + + +/** Decode a projection Opus packet with floating point output. + * @param st <tt>OpusProjectionDecoder*</tt>: Projection decoder state. + * @param[in] data <tt>const unsigned char*</tt>: Input payload. + * Use a <code>NULL</code> + * pointer to indicate packet + * loss. + * @param len <tt>opus_int32</tt>: Number of bytes in payload. + * @param[out] pcm <tt>opus_int16*</tt>: Output signal, with interleaved + * samples. + * This must contain room for + * <code>frame_size*channels</code> + * samples. + * @param frame_size <tt>int</tt>: The number of samples per channel of + * available space in \a pcm. + * If this is less than the maximum packet duration + * (120 ms; 5760 for 48kHz), this function will not be capable + * of decoding some packets. In the case of PLC (data==NULL) + * or FEC (decode_fec=1), then frame_size needs to be exactly + * the duration of audio that is missing, otherwise the + * decoder will not be in the optimal state to decode the + * next incoming packet. For the PLC and FEC cases, frame_size + * <b>must</b> be a multiple of 2.5 ms. + * @param decode_fec <tt>int</tt>: Flag (0 or 1) to request that any in-band + * forward error correction data be decoded. + * If no such data is available, the frame is + * decoded as if it were lost. + * @returns Number of samples decoded on success or a negative error code + * (see @ref opus_errorcodes) on failure. + */ +OPUS_EXPORT OPUS_WARN_UNUSED_RESULT int opus_projection_decode_float( + OpusProjectionDecoder *st, + const unsigned char *data, + opus_int32 len, + float *pcm, + int frame_size, + int decode_fec +) OPUS_ARG_NONNULL(1) OPUS_ARG_NONNULL(4); + + +/** Perform a CTL function on a projection Opus decoder. + * + * Generally the request and subsequent arguments are generated by a + * convenience macro. + * @param st <tt>OpusProjectionDecoder*</tt>: Projection decoder state. + * @param request This and all remaining parameters should be replaced by one + * of the convenience macros in @ref opus_genericctls, + * @ref opus_decoderctls, @ref opus_multistream_ctls, or + * @ref opus_projection_ctls. + * @see opus_genericctls + * @see opus_decoderctls + * @see opus_multistream_ctls + * @see opus_projection_ctls + */ +OPUS_EXPORT int opus_projection_decoder_ctl(OpusProjectionDecoder *st, int request, ...) OPUS_ARG_NONNULL(1); + + +/** Frees an <code>OpusProjectionDecoder</code> allocated by + * opus_projection_decoder_create(). + * @param st <tt>OpusProjectionDecoder</tt>: Projection decoder state to be freed. + */ +OPUS_EXPORT void opus_projection_decoder_destroy(OpusProjectionDecoder *st); + + +/**@}*/ + +/**@}*/ + +#ifdef __cplusplus +} +#endif + +#endif /* OPUS_PROJECTION_H */ diff --git a/thirdparty/opus/opus/opus_types.h b/thirdparty/opus/opus/opus_types.h index b28e03aea2..7cf675580f 100644 --- a/thirdparty/opus/opus/opus_types.h +++ b/thirdparty/opus/opus/opus_types.h @@ -33,14 +33,29 @@ #ifndef OPUS_TYPES_H #define OPUS_TYPES_H +#define opus_int int /* used for counters etc; at least 16 bits */ +#define opus_int64 long long +#define opus_int8 signed char + +#define opus_uint unsigned int /* used for counters etc; at least 16 bits */ +#define opus_uint64 unsigned long long +#define opus_uint8 unsigned char + /* Use the real stdint.h if it's there (taken from Paul Hsieh's pstdint.h) */ -#if (defined(__STDC__) && __STDC__ && __STDC_VERSION__ >= 199901L) || (defined(__GNUC__) && (defined(_STDINT_H) || defined(_STDINT_H_)) || defined (HAVE_STDINT_H)) +#if (defined(__STDC__) && __STDC__ && defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L) || (defined(__GNUC__) && (defined(_STDINT_H) || defined(_STDINT_H_)) || defined (HAVE_STDINT_H)) #include <stdint.h> - +# undef opus_int64 +# undef opus_int8 +# undef opus_uint64 +# undef opus_uint8 + typedef int8_t opus_int8; + typedef uint8_t opus_uint8; typedef int16_t opus_int16; typedef uint16_t opus_uint16; typedef int32_t opus_int32; typedef uint32_t opus_uint32; + typedef int64_t opus_int64; + typedef uint64_t opus_uint64; #elif defined(_WIN32) # if defined(__CYGWIN__) @@ -148,12 +163,4 @@ #endif -#define opus_int int /* used for counters etc; at least 16 bits */ -#define opus_int64 long long -#define opus_int8 signed char - -#define opus_uint unsigned int /* used for counters etc; at least 16 bits */ -#define opus_uint64 unsigned long long -#define opus_uint8 unsigned char - #endif /* OPUS_TYPES_H */ diff --git a/thirdparty/opus/opus/opusfile.h b/thirdparty/opus/opus/opusfile.h index 4bf2fba926..e3a3dc8389 100644 --- a/thirdparty/opus/opus/opusfile.h +++ b/thirdparty/opus/opus/opusfile.h @@ -239,7 +239,8 @@ struct OpusHead{ -32768...32767. The <tt>libopusfile</tt> API will automatically apply this gain to the decoded output before returning it, scaling it by - <code>pow(10,output_gain/(20.0*256))</code>.*/ + <code>pow(10,output_gain/(20.0*256))</code>. + You can adjust this behavior with op_set_gain_offset().*/ int output_gain; /**The channel mapping family, in the range 0...255. Channel mapping family 0 covers mono or stereo in a single stream. @@ -1154,16 +1155,18 @@ OP_WARN_UNUSED_RESULT OggOpusFile *op_open_url(const char *_url, int *_error,...) OP_ARG_NONNULL(1); /**Open a stream using the given set of callbacks to access it. - \param _source The stream to read from (e.g., a <code>FILE *</code>). + \param _stream The stream to read from (e.g., a <code>FILE *</code>). + This value will be passed verbatim as the first + argument to all of the callbacks. \param _cb The callbacks with which to access the stream. <code><a href="#op_read_func">read()</a></code> must be implemented. <code><a href="#op_seek_func">seek()</a></code> and <code><a href="#op_tell_func">tell()</a></code> may be <code>NULL</code>, or may always return -1 to - indicate a source is unseekable, but if + indicate a stream is unseekable, but if <code><a href="#op_seek_func">seek()</a></code> is - implemented and succeeds on a particular source, then + implemented and succeeds on a particular stream, then <code><a href="#op_tell_func">tell()</a></code> must also. <code><a href="#op_close_func">close()</a></code> may @@ -1226,11 +1229,11 @@ OP_WARN_UNUSED_RESULT OggOpusFile *op_open_url(const char *_url, basic validity checks.</dd> </dl> \return A freshly opened \c OggOpusFile, or <code>NULL</code> on error. - <tt>libopusfile</tt> does <em>not</em> take ownership of the source + <tt>libopusfile</tt> does <em>not</em> take ownership of the stream if the call fails. - The calling application is responsible for closing the source if + The calling application is responsible for closing the stream if this call returns an error.*/ -OP_WARN_UNUSED_RESULT OggOpusFile *op_open_callbacks(void *_source, +OP_WARN_UNUSED_RESULT OggOpusFile *op_open_callbacks(void *_stream, const OpusFileCallbacks *_cb,const unsigned char *_initial_data, size_t _initial_bytes,int *_error) OP_ARG_NONNULL(2); @@ -1332,18 +1335,20 @@ OP_WARN_UNUSED_RESULT OggOpusFile *op_test_url(const char *_url, For new code, you are likely better off using op_test() instead, which is less resource-intensive, requires less data to succeed, and imposes a hard limit on the amount of data it examines (important for unseekable - sources, where all such data must be buffered until you are sure of the + streams, where all such data must be buffered until you are sure of the stream type). - \param _source The stream to read from (e.g., a <code>FILE *</code>). + \param _stream The stream to read from (e.g., a <code>FILE *</code>). + This value will be passed verbatim as the first + argument to all of the callbacks. \param _cb The callbacks with which to access the stream. <code><a href="#op_read_func">read()</a></code> must be implemented. <code><a href="#op_seek_func">seek()</a></code> and <code><a href="#op_tell_func">tell()</a></code> may be <code>NULL</code>, or may always return -1 to - indicate a source is unseekable, but if + indicate a stream is unseekable, but if <code><a href="#op_seek_func">seek()</a></code> is - implemented and succeeds on a particular source, then + implemented and succeeds on a particular stream, then <code><a href="#op_tell_func">tell()</a></code> must also. <code><a href="#op_close_func">close()</a></code> may @@ -1373,11 +1378,11 @@ OP_WARN_UNUSED_RESULT OggOpusFile *op_test_url(const char *_url, See op_open_callbacks() for a full list of failure codes. \return A partially opened \c OggOpusFile, or <code>NULL</code> on error. - <tt>libopusfile</tt> does <em>not</em> take ownership of the source + <tt>libopusfile</tt> does <em>not</em> take ownership of the stream if the call fails. - The calling application is responsible for closing the source if + The calling application is responsible for closing the stream if this call returns an error.*/ -OP_WARN_UNUSED_RESULT OggOpusFile *op_test_callbacks(void *_source, +OP_WARN_UNUSED_RESULT OggOpusFile *op_test_callbacks(void *_stream, const OpusFileCallbacks *_cb,const unsigned char *_initial_data, size_t _initial_bytes,int *_error) OP_ARG_NONNULL(2); @@ -1434,7 +1439,7 @@ void op_free(OggOpusFile *_of); Their documention will indicate so explicitly.*/ /*@{*/ -/**Returns whether or not the data source being read is seekable. +/**Returns whether or not the stream being read is seekable. This is true if <ol> <li>The <code><a href="#op_seek_func">seek()</a></code> and @@ -1455,9 +1460,9 @@ int op_seekable(const OggOpusFile *_of) OP_ARG_NONNULL(1); return 1. The actual number of links is not known until the stream is fully opened. \param _of The \c OggOpusFile from which to retrieve the link count. - \return For fully-open seekable sources, this returns the total number of + \return For fully-open seekable streams, this returns the total number of links in the whole stream, which will be at least 1. - For partially-open or unseekable sources, this always returns 1.*/ + For partially-open or unseekable streams, this always returns 1.*/ int op_link_count(const OggOpusFile *_of) OP_ARG_NONNULL(1); /**Get the serial number of the given link in a (possibly-chained) Ogg Opus @@ -1471,7 +1476,7 @@ int op_link_count(const OggOpusFile *_of) OP_ARG_NONNULL(1); \return The serial number of the given link. If \a _li is greater than the total number of links, this returns the serial number of the last link. - If the source is not seekable, this always returns the serial number + If the stream is not seekable, this always returns the serial number of the current link.*/ opus_uint32 op_serialno(const OggOpusFile *_of,int _li) OP_ARG_NONNULL(1); @@ -1488,7 +1493,7 @@ opus_uint32 op_serialno(const OggOpusFile *_of,int _li) OP_ARG_NONNULL(1); \return The channel count of the given link. If \a _li is greater than the total number of links, this returns the channel count of the last link. - If the source is not seekable, this always returns the channel count + If the stream is not seekable, this always returns the channel count of the current link.*/ int op_channel_count(const OggOpusFile *_of,int _li) OP_ARG_NONNULL(1); @@ -1507,9 +1512,9 @@ int op_channel_count(const OggOpusFile *_of,int _li) OP_ARG_NONNULL(1); compressed size of link \a _li if it is non-negative, or a negative value on error. The compressed size of the entire stream may be smaller than that - of the underlying source if trailing garbage was detected in the + of the underlying stream if trailing garbage was detected in the file. - \retval #OP_EINVAL The source is not seekable (so we can't know the length), + \retval #OP_EINVAL The stream is not seekable (so we can't know the length), \a _li wasn't less than the total number of links in the stream, or the stream was only partially open.*/ opus_int64 op_raw_total(const OggOpusFile *_of,int _li) OP_ARG_NONNULL(1); @@ -1527,7 +1532,7 @@ opus_int64 op_raw_total(const OggOpusFile *_of,int _li) OP_ARG_NONNULL(1); \return The PCM length of the entire stream if \a _li is negative, the PCM length of link \a _li if it is non-negative, or a negative value on error. - \retval #OP_EINVAL The source is not seekable (so we can't know the length), + \retval #OP_EINVAL The stream is not seekable (so we can't know the length), \a _li wasn't less than the total number of links in the stream, or the stream was only partially open.*/ ogg_int64_t op_pcm_total(const OggOpusFile *_of,int _li) OP_ARG_NONNULL(1); @@ -1575,8 +1580,8 @@ const OpusTags *op_tags(const OggOpusFile *_of,int _li) OP_ARG_NONNULL(1); \param _of The \c OggOpusFile from which to retrieve the current link index. \return The index of the current link on success, or a negative value on failure. - For seekable streams, this is a number between 0 and the value - returned by op_link_count(). + For seekable streams, this is a number between 0 (inclusive) and the + value returned by op_link_count() (exclusive). For unseekable streams, this value starts at 0 and increments by one each time a new link is encountered (even though op_link_count() always returns 1). @@ -1640,10 +1645,10 @@ ogg_int64_t op_pcm_tell(const OggOpusFile *_of) OP_ARG_NONNULL(1); /*@{*/ /**\name Functions for seeking in Opus streams - These functions let you seek in Opus streams, if the underlying source + These functions let you seek in Opus streams, if the underlying stream support it. Seeking is implemented for all built-in stream I/O routines, though some - individual sources may not be seekable (pipes, live HTTP streams, or HTTP + individual streams may not be seekable (pipes, live HTTP streams, or HTTP streams from a server that does not support <code>Range</code> requests). op_raw_seek() is the fastest: it is guaranteed to perform at most one @@ -1670,6 +1675,8 @@ ogg_int64_t op_pcm_tell(const OggOpusFile *_of) OP_ARG_NONNULL(1); packets out of the tail of the link to which it seeks. \param _of The \c OggOpusFile in which to seek. \param _byte_offset The byte position to seek to. + This must be between 0 and #op_raw_total(\a _of,\c -1) + (inclusive). \return 0 on success, or a negative error code on failure. \retval #OP_EREAD The underlying seek operation failed. \retval #OP_EINVAL The stream was only partially open, or the target was diff --git a/thirdparty/opus/opus_compare.c b/thirdparty/opus/opus_compare.c index 06c67d752f..1956e08fa5 100644 --- a/thirdparty/opus/opus_compare.c +++ b/thirdparty/opus/opus_compare.c @@ -363,6 +363,9 @@ int main(int _argc,const char **_argv){ Ef*=Ef; err+=Ef*Ef; } + free(xb); + free(X); + free(Y); err=pow(err/nframes,1.0/16); Q=100*(1-0.5*log(1+err)/log(1.13)); if(Q<0){ diff --git a/thirdparty/opus/opus_decoder.c b/thirdparty/opus/opus_decoder.c index 080bec5072..9113638a00 100644 --- a/thirdparty/opus/opus_decoder.c +++ b/thirdparty/opus/opus_decoder.c @@ -78,6 +78,26 @@ struct OpusDecoder { opus_uint32 rangeFinal; }; +#if defined(ENABLE_HARDENING) || defined(ENABLE_ASSERTIONS) +static void validate_opus_decoder(OpusDecoder *st) +{ + celt_assert(st->channels == 1 || st->channels == 2); + celt_assert(st->Fs == 48000 || st->Fs == 24000 || st->Fs == 16000 || st->Fs == 12000 || st->Fs == 8000); + celt_assert(st->DecControl.API_sampleRate == st->Fs); + celt_assert(st->DecControl.internalSampleRate == 0 || st->DecControl.internalSampleRate == 16000 || st->DecControl.internalSampleRate == 12000 || st->DecControl.internalSampleRate == 8000); + celt_assert(st->DecControl.nChannelsAPI == st->channels); + celt_assert(st->DecControl.nChannelsInternal == 0 || st->DecControl.nChannelsInternal == 1 || st->DecControl.nChannelsInternal == 2); + celt_assert(st->DecControl.payloadSize_ms == 0 || st->DecControl.payloadSize_ms == 10 || st->DecControl.payloadSize_ms == 20 || st->DecControl.payloadSize_ms == 40 || st->DecControl.payloadSize_ms == 60); +#ifdef OPUS_ARCHMASK + celt_assert(st->arch >= 0); + celt_assert(st->arch <= OPUS_ARCHMASK); +#endif + celt_assert(st->stream_channels == 1 || st->stream_channels == 2); +} +#define VALIDATE_OPUS_DECODER(st) validate_opus_decoder(st) +#else +#define VALIDATE_OPUS_DECODER(st) +#endif int opus_decoder_get_size(int channels) { @@ -104,7 +124,7 @@ int opus_decoder_init(OpusDecoder *st, opus_int32 Fs, int channels) return OPUS_BAD_ARG; OPUS_CLEAR((char*)st, opus_decoder_get_size(channels)); - /* Initialize SILK encoder */ + /* Initialize SILK decoder */ ret = silk_Get_Decoder_Size(&silkDecSizeBytes); if (ret) return OPUS_INTERNAL_ERROR; @@ -217,6 +237,7 @@ static int opus_decode_frame(OpusDecoder *st, const unsigned char *data, int audiosize; int mode; + int bandwidth; int transition=0; int start_band; int redundancy=0; @@ -253,10 +274,12 @@ static int opus_decode_frame(OpusDecoder *st, const unsigned char *data, { audiosize = st->frame_size; mode = st->mode; + bandwidth = st->bandwidth; ec_dec_init(&dec,(unsigned char*)data,len); } else { audiosize = frame_size; mode = st->prev_mode; + bandwidth = 0; if (mode == 0) { @@ -355,15 +378,15 @@ static int opus_decode_frame(OpusDecoder *st, const unsigned char *data, { st->DecControl.nChannelsInternal = st->stream_channels; if( mode == MODE_SILK_ONLY ) { - if( st->bandwidth == OPUS_BANDWIDTH_NARROWBAND ) { + if( bandwidth == OPUS_BANDWIDTH_NARROWBAND ) { st->DecControl.internalSampleRate = 8000; - } else if( st->bandwidth == OPUS_BANDWIDTH_MEDIUMBAND ) { + } else if( bandwidth == OPUS_BANDWIDTH_MEDIUMBAND ) { st->DecControl.internalSampleRate = 12000; - } else if( st->bandwidth == OPUS_BANDWIDTH_WIDEBAND ) { + } else if( bandwidth == OPUS_BANDWIDTH_WIDEBAND ) { st->DecControl.internalSampleRate = 16000; } else { st->DecControl.internalSampleRate = 16000; - silk_assert( 0 ); + celt_assert( 0 ); } } else { /* Hybrid mode */ @@ -427,10 +450,26 @@ static int opus_decode_frame(OpusDecoder *st, const unsigned char *data, if (mode != MODE_CELT_ONLY) start_band = 17; + if (redundancy) + { + transition = 0; + pcm_transition_silk_size=ALLOC_NONE; + } + + ALLOC(pcm_transition_silk, pcm_transition_silk_size, opus_val16); + + if (transition && mode != MODE_CELT_ONLY) + { + pcm_transition = pcm_transition_silk; + opus_decode_frame(st, NULL, 0, pcm_transition, IMIN(F5, audiosize), 0); + } + + + if (bandwidth) { int endband=21; - switch(st->bandwidth) + switch(bandwidth) { case OPUS_BANDWIDTH_NARROWBAND: endband = 13; @@ -445,24 +484,13 @@ static int opus_decode_frame(OpusDecoder *st, const unsigned char *data, case OPUS_BANDWIDTH_FULLBAND: endband = 21; break; + default: + celt_assert(0); + break; } - celt_decoder_ctl(celt_dec, CELT_SET_END_BAND(endband)); - celt_decoder_ctl(celt_dec, CELT_SET_CHANNELS(st->stream_channels)); - } - - if (redundancy) - { - transition = 0; - pcm_transition_silk_size=ALLOC_NONE; - } - - ALLOC(pcm_transition_silk, pcm_transition_silk_size, opus_val16); - - if (transition && mode != MODE_CELT_ONLY) - { - pcm_transition = pcm_transition_silk; - opus_decode_frame(st, NULL, 0, pcm_transition, IMIN(F5, audiosize), 0); + MUST_SUCCEED(celt_decoder_ctl(celt_dec, CELT_SET_END_BAND(endband))); } + MUST_SUCCEED(celt_decoder_ctl(celt_dec, CELT_SET_CHANNELS(st->stream_channels))); /* Only allocation memory for redundancy if/when needed */ redundant_audio_size = redundancy ? F5*st->channels : ALLOC_NONE; @@ -471,21 +499,21 @@ static int opus_decode_frame(OpusDecoder *st, const unsigned char *data, /* 5 ms redundant frame for CELT->SILK*/ if (redundancy && celt_to_silk) { - celt_decoder_ctl(celt_dec, CELT_SET_START_BAND(0)); + MUST_SUCCEED(celt_decoder_ctl(celt_dec, CELT_SET_START_BAND(0))); celt_decode_with_ec(celt_dec, data+len, redundancy_bytes, redundant_audio, F5, NULL, 0); - celt_decoder_ctl(celt_dec, OPUS_GET_FINAL_RANGE(&redundant_rng)); + MUST_SUCCEED(celt_decoder_ctl(celt_dec, OPUS_GET_FINAL_RANGE(&redundant_rng))); } /* MUST be after PLC */ - celt_decoder_ctl(celt_dec, CELT_SET_START_BAND(start_band)); + MUST_SUCCEED(celt_decoder_ctl(celt_dec, CELT_SET_START_BAND(start_band))); if (mode != MODE_SILK_ONLY) { int celt_frame_size = IMIN(F20, frame_size); /* Make sure to discard any previous CELT state */ if (mode != st->prev_mode && st->prev_mode > 0 && !st->prev_redundancy) - celt_decoder_ctl(celt_dec, OPUS_RESET_STATE); + MUST_SUCCEED(celt_decoder_ctl(celt_dec, OPUS_RESET_STATE)); /* Decode CELT */ celt_ret = celt_decode_with_ec(celt_dec, decode_fec ? NULL : data, len, pcm, celt_frame_size, &dec, celt_accum); @@ -500,7 +528,7 @@ static int opus_decode_frame(OpusDecoder *st, const unsigned char *data, do a fade-out by decoding a silence frame */ if (st->prev_mode == MODE_HYBRID && !(redundancy && celt_to_silk && st->prev_redundancy) ) { - celt_decoder_ctl(celt_dec, CELT_SET_START_BAND(0)); + MUST_SUCCEED(celt_decoder_ctl(celt_dec, CELT_SET_START_BAND(0))); celt_decode_with_ec(celt_dec, silence, 2, pcm, F2_5, NULL, celt_accum); } } @@ -518,18 +546,18 @@ static int opus_decode_frame(OpusDecoder *st, const unsigned char *data, { const CELTMode *celt_mode; - celt_decoder_ctl(celt_dec, CELT_GET_MODE(&celt_mode)); + MUST_SUCCEED(celt_decoder_ctl(celt_dec, CELT_GET_MODE(&celt_mode))); window = celt_mode->window; } /* 5 ms redundant frame for SILK->CELT */ if (redundancy && !celt_to_silk) { - celt_decoder_ctl(celt_dec, OPUS_RESET_STATE); - celt_decoder_ctl(celt_dec, CELT_SET_START_BAND(0)); + MUST_SUCCEED(celt_decoder_ctl(celt_dec, OPUS_RESET_STATE)); + MUST_SUCCEED(celt_decoder_ctl(celt_dec, CELT_SET_START_BAND(0))); celt_decode_with_ec(celt_dec, data+len, redundancy_bytes, redundant_audio, F5, NULL, 0); - celt_decoder_ctl(celt_dec, OPUS_GET_FINAL_RANGE(&redundant_rng)); + MUST_SUCCEED(celt_decoder_ctl(celt_dec, OPUS_GET_FINAL_RANGE(&redundant_rng))); smooth_fade(pcm+st->channels*(frame_size-F2_5), redundant_audio+st->channels*F2_5, pcm+st->channels*(frame_size-F2_5), F2_5, st->channels, window, st->Fs); } @@ -605,6 +633,7 @@ int opus_decode_native(OpusDecoder *st, const unsigned char *data, int packet_frame_size, packet_bandwidth, packet_mode, packet_stream_channels; /* 48 x 2.5 ms = 120 ms */ opus_int16 size[48]; + VALIDATE_OPUS_DECODER(st); if (decode_fec<0 || decode_fec>1) return OPUS_BAD_ARG; /* For FEC/PLC, frame_size has to be to have a multiple of 2.5 ms */ @@ -740,6 +769,7 @@ int opus_decode_float(OpusDecoder *st, const unsigned char *data, else return OPUS_INVALID_PACKET; } + celt_assert(st->channels == 1 || st->channels == 2); ALLOC(out, frame_size*st->channels, opus_int16); ret = opus_decode_native(st, data, len, out, frame_size, decode_fec, 0, NULL, 0); @@ -777,6 +807,7 @@ int opus_decode(OpusDecoder *st, const unsigned char *data, else return OPUS_INVALID_PACKET; } + celt_assert(st->channels == 1 || st->channels == 2); ALLOC(out, frame_size*st->channels, float); ret = opus_decode_native(st, data, len, out, frame_size, decode_fec, 0, NULL, 1); @@ -864,7 +895,7 @@ int opus_decoder_ctl(OpusDecoder *st, int request, ...) goto bad_arg; } if (st->prev_mode == MODE_CELT_ONLY) - celt_decoder_ctl(celt_dec, OPUS_GET_PITCH(value)); + ret = celt_decoder_ctl(celt_dec, OPUS_GET_PITCH(value)); else *value = st->DecControl.prevPitchLag; } @@ -891,7 +922,7 @@ int opus_decoder_ctl(OpusDecoder *st, int request, ...) break; case OPUS_GET_LAST_PACKET_DURATION_REQUEST: { - opus_uint32 *value = va_arg(ap, opus_uint32*); + opus_int32 *value = va_arg(ap, opus_int32*); if (!value) { goto bad_arg; @@ -899,6 +930,26 @@ int opus_decoder_ctl(OpusDecoder *st, int request, ...) *value = st->last_packet_duration; } break; + case OPUS_SET_PHASE_INVERSION_DISABLED_REQUEST: + { + opus_int32 value = va_arg(ap, opus_int32); + if(value<0 || value>1) + { + goto bad_arg; + } + ret = celt_decoder_ctl(celt_dec, OPUS_SET_PHASE_INVERSION_DISABLED(value)); + } + break; + case OPUS_GET_PHASE_INVERSION_DISABLED_REQUEST: + { + opus_int32 *value = va_arg(ap, opus_int32*); + if (!value) + { + goto bad_arg; + } + ret = celt_decoder_ctl(celt_dec, OPUS_GET_PHASE_INVERSION_DISABLED(value)); + } + break; default: /*fprintf(stderr, "unknown opus_decoder_ctl() request: %d", request);*/ ret = OPUS_UNIMPLEMENTED; diff --git a/thirdparty/opus/opus_encoder.c b/thirdparty/opus/opus_encoder.c index 9a516a884a..e98ac5b8d0 100644 --- a/thirdparty/opus/opus_encoder.c +++ b/thirdparty/opus/opus_encoder.c @@ -53,6 +53,10 @@ #define MAX_ENCODER_BUFFER 480 +#ifndef DISABLE_FLOAT_API +#define PSEUDO_SNR_THRESHOLD 316.23f /* 10^(25/10) */ +#endif + typedef struct { opus_val32 XX, XY, YY; opus_val16 smoothed_width; @@ -82,6 +86,7 @@ struct OpusEncoder { int encoder_buffer; int lfe; int arch; + int use_dtx; /* general DTX for both SILK and CELT */ #ifndef DISABLE_FLOAT_API TonalityAnalysisState analysis; #endif @@ -97,6 +102,8 @@ struct OpusEncoder { int prev_channels; int prev_framesize; int bandwidth; + /* Bandwidth determined automatically from the rate (before any other adjustment) */ + int auto_bandwidth; int silk_bw_switch; /* Sampling rate (at the API level) */ int first; @@ -105,7 +112,10 @@ struct OpusEncoder { opus_val16 delay_buffer[MAX_ENCODER_BUFFER*2]; #ifndef DISABLE_FLOAT_API int detected_bandwidth; + int nb_no_activity_frames; + opus_val32 peak_signal_energy; #endif + int nonfinal_frame; /* current frame is not the final in a packet */ opus_uint32 rangeFinal; }; @@ -113,38 +123,46 @@ struct OpusEncoder { middle (memoriless) threshold. The second column is the hysteresis (difference with the middle) */ static const opus_int32 mono_voice_bandwidth_thresholds[8] = { - 11000, 1000, /* NB<->MB */ - 14000, 1000, /* MB<->WB */ - 17000, 1000, /* WB<->SWB */ - 21000, 2000, /* SWB<->FB */ + 9000, 700, /* NB<->MB */ + 9000, 700, /* MB<->WB */ + 13500, 1000, /* WB<->SWB */ + 14000, 2000, /* SWB<->FB */ }; static const opus_int32 mono_music_bandwidth_thresholds[8] = { - 12000, 1000, /* NB<->MB */ - 15000, 1000, /* MB<->WB */ - 18000, 2000, /* WB<->SWB */ - 22000, 2000, /* SWB<->FB */ + 9000, 700, /* NB<->MB */ + 9000, 700, /* MB<->WB */ + 11000, 1000, /* WB<->SWB */ + 12000, 2000, /* SWB<->FB */ }; static const opus_int32 stereo_voice_bandwidth_thresholds[8] = { - 11000, 1000, /* NB<->MB */ - 14000, 1000, /* MB<->WB */ - 21000, 2000, /* WB<->SWB */ - 28000, 2000, /* SWB<->FB */ + 9000, 700, /* NB<->MB */ + 9000, 700, /* MB<->WB */ + 13500, 1000, /* WB<->SWB */ + 14000, 2000, /* SWB<->FB */ }; static const opus_int32 stereo_music_bandwidth_thresholds[8] = { - 12000, 1000, /* NB<->MB */ - 18000, 2000, /* MB<->WB */ - 21000, 2000, /* WB<->SWB */ - 30000, 2000, /* SWB<->FB */ + 9000, 700, /* NB<->MB */ + 9000, 700, /* MB<->WB */ + 11000, 1000, /* WB<->SWB */ + 12000, 2000, /* SWB<->FB */ }; /* Threshold bit-rates for switching between mono and stereo */ -static const opus_int32 stereo_voice_threshold = 30000; -static const opus_int32 stereo_music_threshold = 30000; +static const opus_int32 stereo_voice_threshold = 19000; +static const opus_int32 stereo_music_threshold = 17000; /* Threshold bit-rate for switching between SILK/hybrid and CELT-only */ static const opus_int32 mode_thresholds[2][2] = { /* voice */ /* music */ - { 64000, 16000}, /* mono */ - { 36000, 16000}, /* stereo */ + { 64000, 10000}, /* mono */ + { 44000, 10000}, /* stereo */ +}; + +static const opus_int32 fec_thresholds[] = { + 12000, 1000, /* NB */ + 14000, 1000, /* MB */ + 16000, 1000, /* WB */ + 20000, 1000, /* SWB */ + 22000, 1000, /* FB */ }; int opus_encoder_get_size(int channels) @@ -245,7 +263,8 @@ int opus_encoder_init(OpusEncoder* st, opus_int32 Fs, int channels, int applicat st->bandwidth = OPUS_BANDWIDTH_FULLBAND; #ifndef DISABLE_FLOAT_API - tonality_analysis_init(&st->analysis); + tonality_analysis_init(&st->analysis, st->Fs); + st->analysis.application = st->application; #endif return OPUS_OK; @@ -323,10 +342,11 @@ static void silk_biquad_float( } #endif -static void hp_cutoff(const opus_val16 *in, opus_int32 cutoff_Hz, opus_val16 *out, opus_val32 *hp_mem, int len, int channels, opus_int32 Fs) +static void hp_cutoff(const opus_val16 *in, opus_int32 cutoff_Hz, opus_val16 *out, opus_val32 *hp_mem, int len, int channels, opus_int32 Fs, int arch) { opus_int32 B_Q28[ 3 ], A_Q28[ 2 ]; opus_int32 Fc_Q19, r_Q28, r_Q22; + (void)arch; silk_assert( cutoff_Hz <= silk_int32_MAX / SILK_FIX_CONST( 1.5 * 3.14159 / 1000, 19 ) ); Fc_Q19 = silk_DIV32_16( silk_SMULBB( SILK_FIX_CONST( 1.5 * 3.14159 / 1000, 19 ), cutoff_Hz ), Fs/1000 ); @@ -346,9 +366,10 @@ static void hp_cutoff(const opus_val16 *in, opus_int32 cutoff_Hz, opus_val16 *ou A_Q28[ 1 ] = silk_SMULWW( r_Q22, r_Q22 ); #ifdef FIXED_POINT - silk_biquad_alt( in, B_Q28, A_Q28, hp_mem, out, len, channels ); - if( channels == 2 ) { - silk_biquad_alt( in+1, B_Q28, A_Q28, hp_mem+2, out+1, len, channels ); + if( channels == 1 ) { + silk_biquad_alt_stride1( in, B_Q28, A_Q28, hp_mem, out, len ); + } else { + silk_biquad_alt_stride2( in, B_Q28, A_Q28, hp_mem, out, len, arch ); } #else silk_biquad_float( in, B_Q28, A_Q28, hp_mem, out, len, channels ); @@ -364,21 +385,17 @@ static void dc_reject(const opus_val16 *in, opus_int32 cutoff_Hz, opus_val16 *ou int c, i; int shift; - /* Approximates -round(log2(4.*cutoff_Hz/Fs)) */ - shift=celt_ilog2(Fs/(cutoff_Hz*3)); + /* Approximates -round(log2(6.3*cutoff_Hz/Fs)) */ + shift=celt_ilog2(Fs/(cutoff_Hz*4)); for (c=0;c<channels;c++) { for (i=0;i<len;i++) { - opus_val32 x, tmp, y; - x = SHL32(EXTEND32(in[channels*i+c]), 15); - /* First stage */ - tmp = x-hp_mem[2*c]; + opus_val32 x, y; + x = SHL32(EXTEND32(in[channels*i+c]), 14); + y = x-hp_mem[2*c]; hp_mem[2*c] = hp_mem[2*c] + PSHR32(x - hp_mem[2*c], shift); - /* Second stage */ - y = tmp - hp_mem[2*c+1]; - hp_mem[2*c+1] = hp_mem[2*c+1] + PSHR32(tmp - hp_mem[2*c+1], shift); - out[channels*i+c] = EXTRACT16(SATURATE(PSHR32(y, 15), 32767)); + out[channels*i+c] = EXTRACT16(SATURATE(PSHR32(y, 14), 32767)); } } } @@ -386,24 +403,41 @@ static void dc_reject(const opus_val16 *in, opus_int32 cutoff_Hz, opus_val16 *ou #else static void dc_reject(const opus_val16 *in, opus_int32 cutoff_Hz, opus_val16 *out, opus_val32 *hp_mem, int len, int channels, opus_int32 Fs) { - int c, i; - float coef; - - coef = 4.0f*cutoff_Hz/Fs; - for (c=0;c<channels;c++) + int i; + float coef, coef2; + coef = 6.3f*cutoff_Hz/Fs; + coef2 = 1-coef; + if (channels==2) { + float m0, m2; + m0 = hp_mem[0]; + m2 = hp_mem[2]; for (i=0;i<len;i++) { - opus_val32 x, tmp, y; - x = in[channels*i+c]; - /* First stage */ - tmp = x-hp_mem[2*c]; - hp_mem[2*c] = hp_mem[2*c] + coef*(x - hp_mem[2*c]) + VERY_SMALL; - /* Second stage */ - y = tmp - hp_mem[2*c+1]; - hp_mem[2*c+1] = hp_mem[2*c+1] + coef*(tmp - hp_mem[2*c+1]) + VERY_SMALL; - out[channels*i+c] = y; + opus_val32 x0, x1, out0, out1; + x0 = in[2*i+0]; + x1 = in[2*i+1]; + out0 = x0-m0; + out1 = x1-m2; + m0 = coef*x0 + VERY_SMALL + coef2*m0; + m2 = coef*x1 + VERY_SMALL + coef2*m2; + out[2*i+0] = out0; + out[2*i+1] = out1; } + hp_mem[0] = m0; + hp_mem[2] = m2; + } else { + float m0; + m0 = hp_mem[0]; + for (i=0;i<len;i++) + { + opus_val32 x, y; + x = in[i]; + y = x-m0; + m0 = coef*x + VERY_SMALL + coef2*m0; + out[i] = y; + } + hp_mem[0] = m0; } } #endif @@ -521,287 +555,57 @@ static opus_int32 user_bitrate_to_bitrate(OpusEncoder *st, int frame_size, int m } #ifndef DISABLE_FLOAT_API -/* Don't use more than 60 ms for the frame size analysis */ -#define MAX_DYNAMIC_FRAMESIZE 24 -/* Estimates how much the bitrate will be boosted based on the sub-frame energy */ -static float transient_boost(const float *E, const float *E_1, int LM, int maxM) -{ - int i; - int M; - float sumE=0, sumE_1=0; - float metric; - - M = IMIN(maxM, (1<<LM)+1); - for (i=0;i<M;i++) - { - sumE += E[i]; - sumE_1 += E_1[i]; - } - metric = sumE*sumE_1/(M*M); - /*if (LM==3) - printf("%f\n", metric);*/ - /*return metric>10 ? 1 : 0;*/ - /*return MAX16(0,1-exp(-.25*(metric-2.)));*/ - return MIN16(1,(float)sqrt(MAX16(0,.05f*(metric-2)))); -} - -/* Viterbi decoding trying to find the best frame size combination using look-ahead - - State numbering: - 0: unused - 1: 2.5 ms - 2: 5 ms (#1) - 3: 5 ms (#2) - 4: 10 ms (#1) - 5: 10 ms (#2) - 6: 10 ms (#3) - 7: 10 ms (#4) - 8: 20 ms (#1) - 9: 20 ms (#2) - 10: 20 ms (#3) - 11: 20 ms (#4) - 12: 20 ms (#5) - 13: 20 ms (#6) - 14: 20 ms (#7) - 15: 20 ms (#8) -*/ -static int transient_viterbi(const float *E, const float *E_1, int N, int frame_cost, int rate) -{ - int i; - float cost[MAX_DYNAMIC_FRAMESIZE][16]; - int states[MAX_DYNAMIC_FRAMESIZE][16]; - float best_cost; - int best_state; - float factor; - /* Take into account that we damp VBR in the 32 kb/s to 64 kb/s range. */ - if (rate<80) - factor=0; - else if (rate>160) - factor=1; - else - factor = (rate-80.f)/80.f; - /* Makes variable framesize less aggressive at lower bitrates, but I can't - find any valid theoretical justification for this (other than it seems - to help) */ - for (i=0;i<16;i++) - { - /* Impossible state */ - states[0][i] = -1; - cost[0][i] = 1e10; - } - for (i=0;i<4;i++) - { - cost[0][1<<i] = (frame_cost + rate*(1<<i))*(1+factor*transient_boost(E, E_1, i, N+1)); - states[0][1<<i] = i; - } - for (i=1;i<N;i++) - { - int j; - - /* Follow continuations */ - for (j=2;j<16;j++) - { - cost[i][j] = cost[i-1][j-1]; - states[i][j] = j-1; - } - - /* New frames */ - for(j=0;j<4;j++) - { - int k; - float min_cost; - float curr_cost; - states[i][1<<j] = 1; - min_cost = cost[i-1][1]; - for(k=1;k<4;k++) - { - float tmp = cost[i-1][(1<<(k+1))-1]; - if (tmp < min_cost) - { - states[i][1<<j] = (1<<(k+1))-1; - min_cost = tmp; - } - } - curr_cost = (frame_cost + rate*(1<<j))*(1+factor*transient_boost(E+i, E_1+i, j, N-i+1)); - cost[i][1<<j] = min_cost; - /* If part of the frame is outside the analysis window, only count part of the cost */ - if (N-i < (1<<j)) - cost[i][1<<j] += curr_cost*(float)(N-i)/(1<<j); - else - cost[i][1<<j] += curr_cost; - } - } - - best_state=1; - best_cost = cost[N-1][1]; - /* Find best end state (doesn't force a frame to end at N-1) */ - for (i=2;i<16;i++) - { - if (cost[N-1][i]<best_cost) - { - best_cost = cost[N-1][i]; - best_state = i; - } - } - - /* Follow transitions back */ - for (i=N-1;i>=0;i--) - { - /*printf("%d ", best_state);*/ - best_state = states[i][best_state]; - } - /*printf("%d\n", best_state);*/ - return best_state; -} - -static int optimize_framesize(const void *x, int len, int C, opus_int32 Fs, - int bitrate, opus_val16 tonality, float *mem, int buffering, - downmix_func downmix) -{ - int N; - int i; - float e[MAX_DYNAMIC_FRAMESIZE+4]; - float e_1[MAX_DYNAMIC_FRAMESIZE+3]; - opus_val32 memx; - int bestLM=0; - int subframe; - int pos; - int offset; - VARDECL(opus_val32, sub); - - subframe = Fs/400; - ALLOC(sub, subframe, opus_val32); - e[0]=mem[0]; - e_1[0]=1.f/(EPSILON+mem[0]); - if (buffering) - { - /* Consider the CELT delay when not in restricted-lowdelay */ - /* We assume the buffering is between 2.5 and 5 ms */ - offset = 2*subframe - buffering; - celt_assert(offset>=0 && offset <= subframe); - len -= offset; - e[1]=mem[1]; - e_1[1]=1.f/(EPSILON+mem[1]); - e[2]=mem[2]; - e_1[2]=1.f/(EPSILON+mem[2]); - pos = 3; - } else { - pos=1; - offset=0; - } - N=IMIN(len/subframe, MAX_DYNAMIC_FRAMESIZE); - /* Just silencing a warning, it's really initialized later */ - memx = 0; - for (i=0;i<N;i++) - { - float tmp; - opus_val32 tmpx; - int j; - tmp=EPSILON; - - downmix(x, sub, subframe, i*subframe+offset, 0, -2, C); - if (i==0) - memx = sub[0]; - for (j=0;j<subframe;j++) - { - tmpx = sub[j]; - tmp += (tmpx-memx)*(float)(tmpx-memx); - memx = tmpx; - } - e[i+pos] = tmp; - e_1[i+pos] = 1.f/tmp; - } - /* Hack to get 20 ms working with APPLICATION_AUDIO - The real problem is that the corresponding memory needs to use 1.5 ms - from this frame and 1 ms from the next frame */ - e[i+pos] = e[i+pos-1]; - if (buffering) - N=IMIN(MAX_DYNAMIC_FRAMESIZE, N+2); - bestLM = transient_viterbi(e, e_1, N, (int)((1.f+.5f*tonality)*(60*C+40)), bitrate/400); - mem[0] = e[1<<bestLM]; - if (buffering) - { - mem[1] = e[(1<<bestLM)+1]; - mem[2] = e[(1<<bestLM)+2]; - } - return bestLM; -} - -#endif - -#ifndef DISABLE_FLOAT_API #ifdef FIXED_POINT #define PCM2VAL(x) FLOAT2INT16(x) #else #define PCM2VAL(x) SCALEIN(x) #endif -void downmix_float(const void *_x, opus_val32 *sub, int subframe, int offset, int c1, int c2, int C) + +void downmix_float(const void *_x, opus_val32 *y, int subframe, int offset, int c1, int c2, int C) { const float *x; - opus_val32 scale; int j; + x = (const float *)_x; for (j=0;j<subframe;j++) - sub[j] = PCM2VAL(x[(j+offset)*C+c1]); + y[j] = PCM2VAL(x[(j+offset)*C+c1]); if (c2>-1) { for (j=0;j<subframe;j++) - sub[j] += PCM2VAL(x[(j+offset)*C+c2]); + y[j] += PCM2VAL(x[(j+offset)*C+c2]); } else if (c2==-2) { int c; for (c=1;c<C;c++) { for (j=0;j<subframe;j++) - sub[j] += PCM2VAL(x[(j+offset)*C+c]); + y[j] += PCM2VAL(x[(j+offset)*C+c]); } } -#ifdef FIXED_POINT - scale = (1<<SIG_SHIFT); -#else - scale = 1.f; -#endif - if (C==-2) - scale /= C; - else - scale /= 2; - for (j=0;j<subframe;j++) - sub[j] *= scale; } #endif -void downmix_int(const void *_x, opus_val32 *sub, int subframe, int offset, int c1, int c2, int C) +void downmix_int(const void *_x, opus_val32 *y, int subframe, int offset, int c1, int c2, int C) { const opus_int16 *x; - opus_val32 scale; int j; + x = (const opus_int16 *)_x; for (j=0;j<subframe;j++) - sub[j] = x[(j+offset)*C+c1]; + y[j] = x[(j+offset)*C+c1]; if (c2>-1) { for (j=0;j<subframe;j++) - sub[j] += x[(j+offset)*C+c2]; + y[j] += x[(j+offset)*C+c2]; } else if (c2==-2) { int c; for (c=1;c<C;c++) { for (j=0;j<subframe;j++) - sub[j] += x[(j+offset)*C+c]; + y[j] += x[(j+offset)*C+c]; } } -#ifdef FIXED_POINT - scale = (1<<SIG_SHIFT); -#else - scale = 1.f/32768; -#endif - if (C==-2) - scale /= C; - else - scale /= 2; - for (j=0;j<subframe;j++) - sub[j] *= scale; } opus_int32 frame_size_select(opus_int32 frame_size, int variable_duration, opus_int32 Fs) @@ -811,53 +615,24 @@ opus_int32 frame_size_select(opus_int32 frame_size, int variable_duration, opus_ return -1; if (variable_duration == OPUS_FRAMESIZE_ARG) new_size = frame_size; - else if (variable_duration == OPUS_FRAMESIZE_VARIABLE) - new_size = Fs/50; - else if (variable_duration >= OPUS_FRAMESIZE_2_5_MS && variable_duration <= OPUS_FRAMESIZE_60_MS) - new_size = IMIN(3*Fs/50, (Fs/400)<<(variable_duration-OPUS_FRAMESIZE_2_5_MS)); + else if (variable_duration >= OPUS_FRAMESIZE_2_5_MS && variable_duration <= OPUS_FRAMESIZE_120_MS) + { + if (variable_duration <= OPUS_FRAMESIZE_40_MS) + new_size = (Fs/400)<<(variable_duration-OPUS_FRAMESIZE_2_5_MS); + else + new_size = (variable_duration-OPUS_FRAMESIZE_2_5_MS-2)*Fs/50; + } else return -1; if (new_size>frame_size) return -1; - if (400*new_size!=Fs && 200*new_size!=Fs && 100*new_size!=Fs && - 50*new_size!=Fs && 25*new_size!=Fs && 50*new_size!=3*Fs) + if (400*new_size!=Fs && 200*new_size!=Fs && 100*new_size!=Fs && + 50*new_size!=Fs && 25*new_size!=Fs && 50*new_size!=3*Fs && + 50*new_size!=4*Fs && 50*new_size!=5*Fs && 50*new_size!=6*Fs) return -1; return new_size; } -opus_int32 compute_frame_size(const void *analysis_pcm, int frame_size, - int variable_duration, int C, opus_int32 Fs, int bitrate_bps, - int delay_compensation, downmix_func downmix -#ifndef DISABLE_FLOAT_API - , float *subframe_mem -#endif - ) -{ -#ifndef DISABLE_FLOAT_API - if (variable_duration == OPUS_FRAMESIZE_VARIABLE && frame_size >= Fs/200) - { - int LM = 3; - LM = optimize_framesize(analysis_pcm, frame_size, C, Fs, bitrate_bps, - 0, subframe_mem, delay_compensation, downmix); - while ((Fs/400<<LM)>frame_size) - LM--; - frame_size = (Fs/400<<LM); - } else -#else - (void)analysis_pcm; - (void)C; - (void)bitrate_bps; - (void)delay_compensation; - (void)downmix; -#endif - { - frame_size = frame_size_select(frame_size, variable_duration, Fs); - } - if (frame_size<0) - return -1; - return frame_size; -} - opus_val16 compute_stereo_width(const opus_val16 *pcm, int frame_size, opus_int32 Fs, StereoWidthState *mem) { opus_val32 xx, xy, yy; @@ -904,6 +679,12 @@ opus_val16 compute_stereo_width(const opus_val16 *pcm, int frame_size, opus_int3 xy += SHR32(pxy, 10); yy += SHR32(pyy, 10); } +#ifndef FIXED_POINT + if (!(xx < 1e9f) || celt_isnan(xx) || !(yy < 1e9f) || celt_isnan(yy)) + { + xy = xx = yy = 0; + } +#endif mem->XX += MULT16_32_Q15(short_alpha, xx-mem->XX); mem->XY += MULT16_32_Q15(short_alpha, xy-mem->XY); mem->YY += MULT16_32_Q15(short_alpha, yy-mem->YY); @@ -934,6 +715,354 @@ opus_val16 compute_stereo_width(const opus_val16 *pcm, int frame_size, opus_int3 return EXTRACT16(MIN32(Q15ONE, MULT16_16(20, mem->max_follower))); } +static int decide_fec(int useInBandFEC, int PacketLoss_perc, int last_fec, int mode, int *bandwidth, opus_int32 rate) +{ + int orig_bandwidth; + if (!useInBandFEC || PacketLoss_perc == 0 || mode == MODE_CELT_ONLY) + return 0; + orig_bandwidth = *bandwidth; + for (;;) + { + opus_int32 hysteresis; + opus_int32 LBRR_rate_thres_bps; + /* Compute threshold for using FEC at the current bandwidth setting */ + LBRR_rate_thres_bps = fec_thresholds[2*(*bandwidth - OPUS_BANDWIDTH_NARROWBAND)]; + hysteresis = fec_thresholds[2*(*bandwidth - OPUS_BANDWIDTH_NARROWBAND) + 1]; + if (last_fec == 1) LBRR_rate_thres_bps -= hysteresis; + if (last_fec == 0) LBRR_rate_thres_bps += hysteresis; + LBRR_rate_thres_bps = silk_SMULWB( silk_MUL( LBRR_rate_thres_bps, + 125 - silk_min( PacketLoss_perc, 25 ) ), SILK_FIX_CONST( 0.01, 16 ) ); + /* If loss <= 5%, we look at whether we have enough rate to enable FEC. + If loss > 5%, we decrease the bandwidth until we can enable FEC. */ + if (rate > LBRR_rate_thres_bps) + return 1; + else if (PacketLoss_perc <= 5) + return 0; + else if (*bandwidth > OPUS_BANDWIDTH_NARROWBAND) + (*bandwidth)--; + else + break; + } + /* Couldn't find any bandwidth to enable FEC, keep original bandwidth. */ + *bandwidth = orig_bandwidth; + return 0; +} + +static int compute_silk_rate_for_hybrid(int rate, int bandwidth, int frame20ms, int vbr, int fec, int channels) { + int entry; + int i; + int N; + int silk_rate; + static int rate_table[][5] = { + /* |total| |-------- SILK------------| + |-- No FEC -| |--- FEC ---| + 10ms 20ms 10ms 20ms */ + { 0, 0, 0, 0, 0}, + {12000, 10000, 10000, 11000, 11000}, + {16000, 13500, 13500, 15000, 15000}, + {20000, 16000, 16000, 18000, 18000}, + {24000, 18000, 18000, 21000, 21000}, + {32000, 22000, 22000, 28000, 28000}, + {64000, 38000, 38000, 50000, 50000} + }; + /* Do the allocation per-channel. */ + rate /= channels; + entry = 1 + frame20ms + 2*fec; + N = sizeof(rate_table)/sizeof(rate_table[0]); + for (i=1;i<N;i++) + { + if (rate_table[i][0] > rate) break; + } + if (i == N) + { + silk_rate = rate_table[i-1][entry]; + /* For now, just give 50% of the extra bits to SILK. */ + silk_rate += (rate-rate_table[i-1][0])/2; + } else { + opus_int32 lo, hi, x0, x1; + lo = rate_table[i-1][entry]; + hi = rate_table[i][entry]; + x0 = rate_table[i-1][0]; + x1 = rate_table[i][0]; + silk_rate = (lo*(x1-rate) + hi*(rate-x0))/(x1-x0); + } + if (!vbr) + { + /* Tiny boost to SILK for CBR. We should probably tune this better. */ + silk_rate += 100; + } + if (bandwidth==OPUS_BANDWIDTH_SUPERWIDEBAND) + silk_rate += 300; + silk_rate *= channels; + /* Small adjustment for stereo (calibrated for 32 kb/s, haven't tried other bitrates). */ + if (channels == 2 && rate >= 12000) + silk_rate -= 1000; + return silk_rate; +} + +/* Returns the equivalent bitrate corresponding to 20 ms frames, + complexity 10 VBR operation. */ +static opus_int32 compute_equiv_rate(opus_int32 bitrate, int channels, + int frame_rate, int vbr, int mode, int complexity, int loss) +{ + opus_int32 equiv; + equiv = bitrate; + /* Take into account overhead from smaller frames. */ + if (frame_rate > 50) + equiv -= (40*channels+20)*(frame_rate - 50); + /* CBR is about a 8% penalty for both SILK and CELT. */ + if (!vbr) + equiv -= equiv/12; + /* Complexity makes about 10% difference (from 0 to 10) in general. */ + equiv = equiv * (90+complexity)/100; + if (mode == MODE_SILK_ONLY || mode == MODE_HYBRID) + { + /* SILK complexity 0-1 uses the non-delayed-decision NSQ, which + costs about 20%. */ + if (complexity<2) + equiv = equiv*4/5; + equiv -= equiv*loss/(6*loss + 10); + } else if (mode == MODE_CELT_ONLY) { + /* CELT complexity 0-4 doesn't have the pitch filter, which costs + about 10%. */ + if (complexity<5) + equiv = equiv*9/10; + } else { + /* Mode not known yet */ + /* Half the SILK loss*/ + equiv -= equiv*loss/(12*loss + 20); + } + return equiv; +} + +#ifndef DISABLE_FLOAT_API + +int is_digital_silence(const opus_val16* pcm, int frame_size, int channels, int lsb_depth) +{ + int silence = 0; + opus_val32 sample_max = 0; +#ifdef MLP_TRAINING + return 0; +#endif + sample_max = celt_maxabs16(pcm, frame_size*channels); + +#ifdef FIXED_POINT + silence = (sample_max == 0); + (void)lsb_depth; +#else + silence = (sample_max <= (opus_val16) 1 / (1 << lsb_depth)); +#endif + + return silence; +} + +#ifdef FIXED_POINT +static opus_val32 compute_frame_energy(const opus_val16 *pcm, int frame_size, int channels, int arch) +{ + int i; + opus_val32 sample_max; + int max_shift; + int shift; + opus_val32 energy = 0; + int len = frame_size*channels; + (void)arch; + /* Max amplitude in the signal */ + sample_max = celt_maxabs16(pcm, len); + + /* Compute the right shift required in the MAC to avoid an overflow */ + max_shift = celt_ilog2(len); + shift = IMAX(0, (celt_ilog2(sample_max) << 1) + max_shift - 28); + + /* Compute the energy */ + for (i=0; i<len; i++) + energy += SHR32(MULT16_16(pcm[i], pcm[i]), shift); + + /* Normalize energy by the frame size and left-shift back to the original position */ + energy /= len; + energy = SHL32(energy, shift); + + return energy; +} +#else +static opus_val32 compute_frame_energy(const opus_val16 *pcm, int frame_size, int channels, int arch) +{ + int len = frame_size*channels; + return celt_inner_prod(pcm, pcm, len, arch)/len; +} +#endif + +/* Decides if DTX should be turned on (=1) or off (=0) */ +static int decide_dtx_mode(float activity_probability, /* probability that current frame contains speech/music */ + int *nb_no_activity_frames, /* number of consecutive frames with no activity */ + opus_val32 peak_signal_energy, /* peak energy of desired signal detected so far */ + const opus_val16 *pcm, /* input pcm signal */ + int frame_size, /* frame size */ + int channels, + int is_silence, /* only digital silence detected in this frame */ + int arch + ) +{ + opus_val32 noise_energy; + + if (!is_silence) + { + if (activity_probability < DTX_ACTIVITY_THRESHOLD) /* is noise */ + { + noise_energy = compute_frame_energy(pcm, frame_size, channels, arch); + + /* but is sufficiently quiet */ + is_silence = peak_signal_energy >= (PSEUDO_SNR_THRESHOLD * noise_energy); + } + } + + if (is_silence) + { + /* The number of consecutive DTX frames should be within the allowed bounds */ + (*nb_no_activity_frames)++; + + if (*nb_no_activity_frames > NB_SPEECH_FRAMES_BEFORE_DTX) + { + if (*nb_no_activity_frames <= (NB_SPEECH_FRAMES_BEFORE_DTX + MAX_CONSECUTIVE_DTX)) + /* Valid frame for DTX! */ + return 1; + else + (*nb_no_activity_frames) = NB_SPEECH_FRAMES_BEFORE_DTX; + } + } else + (*nb_no_activity_frames) = 0; + + return 0; +} + +#endif + +static opus_int32 encode_multiframe_packet(OpusEncoder *st, + const opus_val16 *pcm, + int nb_frames, + int frame_size, + unsigned char *data, + opus_int32 out_data_bytes, + int to_celt, + int lsb_depth, + int float_api) +{ + int i; + int ret = 0; + VARDECL(unsigned char, tmp_data); + int bak_mode, bak_bandwidth, bak_channels, bak_to_mono; + VARDECL(OpusRepacketizer, rp); + int max_header_bytes; + opus_int32 bytes_per_frame; + opus_int32 cbr_bytes; + opus_int32 repacketize_len; + int tmp_len; + ALLOC_STACK; + + /* Worst cases: + * 2 frames: Code 2 with different compressed sizes + * >2 frames: Code 3 VBR */ + max_header_bytes = nb_frames == 2 ? 3 : (2+(nb_frames-1)*2); + + if (st->use_vbr || st->user_bitrate_bps==OPUS_BITRATE_MAX) + repacketize_len = out_data_bytes; + else { + cbr_bytes = 3*st->bitrate_bps/(3*8*st->Fs/(frame_size*nb_frames)); + repacketize_len = IMIN(cbr_bytes, out_data_bytes); + } + bytes_per_frame = IMIN(1276, 1+(repacketize_len-max_header_bytes)/nb_frames); + + ALLOC(tmp_data, nb_frames*bytes_per_frame, unsigned char); + ALLOC(rp, 1, OpusRepacketizer); + opus_repacketizer_init(rp); + + bak_mode = st->user_forced_mode; + bak_bandwidth = st->user_bandwidth; + bak_channels = st->force_channels; + + st->user_forced_mode = st->mode; + st->user_bandwidth = st->bandwidth; + st->force_channels = st->stream_channels; + + bak_to_mono = st->silk_mode.toMono; + if (bak_to_mono) + st->force_channels = 1; + else + st->prev_channels = st->stream_channels; + + for (i=0;i<nb_frames;i++) + { + st->silk_mode.toMono = 0; + st->nonfinal_frame = i<(nb_frames-1); + + /* When switching from SILK/Hybrid to CELT, only ask for a switch at the last frame */ + if (to_celt && i==nb_frames-1) + st->user_forced_mode = MODE_CELT_ONLY; + + tmp_len = opus_encode_native(st, pcm+i*(st->channels*frame_size), frame_size, + tmp_data+i*bytes_per_frame, bytes_per_frame, lsb_depth, NULL, 0, 0, 0, 0, + NULL, float_api); + + if (tmp_len<0) + { + RESTORE_STACK; + return OPUS_INTERNAL_ERROR; + } + + ret = opus_repacketizer_cat(rp, tmp_data+i*bytes_per_frame, tmp_len); + + if (ret<0) + { + RESTORE_STACK; + return OPUS_INTERNAL_ERROR; + } + } + + ret = opus_repacketizer_out_range_impl(rp, 0, nb_frames, data, repacketize_len, 0, !st->use_vbr); + + if (ret<0) + { + RESTORE_STACK; + return OPUS_INTERNAL_ERROR; + } + + /* Discard configs that were forced locally for the purpose of repacketization */ + st->user_forced_mode = bak_mode; + st->user_bandwidth = bak_bandwidth; + st->force_channels = bak_channels; + st->silk_mode.toMono = bak_to_mono; + + RESTORE_STACK; + return ret; +} + +static int compute_redundancy_bytes(opus_int32 max_data_bytes, opus_int32 bitrate_bps, int frame_rate, int channels) +{ + int redundancy_bytes_cap; + int redundancy_bytes; + opus_int32 redundancy_rate; + int base_bits; + opus_int32 available_bits; + base_bits = (40*channels+20); + + /* Equivalent rate for 5 ms frames. */ + redundancy_rate = bitrate_bps + base_bits*(200 - frame_rate); + /* For VBR, further increase the bitrate if we can afford it. It's pretty short + and we'll avoid artefacts. */ + redundancy_rate = 3*redundancy_rate/2; + redundancy_bytes = redundancy_rate/1600; + + /* Compute the max rate we can use given CBR or VBR with cap. */ + available_bits = max_data_bytes*8 - 2*base_bits; + redundancy_bytes_cap = (available_bits*240/(240+48000/frame_rate) + base_bits)/8; + redundancy_bytes = IMIN(redundancy_bytes, redundancy_bytes_cap); + /* It we can't get enough bits for redundancy to be worth it, rely on the decoder PLC. */ + if (redundancy_bytes > 4 + 8*channels) + redundancy_bytes = IMIN(257, redundancy_bytes); + else + redundancy_bytes = 0; + return redundancy_bytes; +} + opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_size, unsigned char *data, opus_int32 out_data_bytes, int lsb_depth, const void *analysis_pcm, opus_int32 analysis_size, int c1, int c2, @@ -971,6 +1100,7 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ AnalysisInfo analysis_info; int analysis_read_pos_bak=-1; int analysis_read_subframe_bak=-1; + int is_silence = 0; #endif VARDECL(opus_val16, tmp_prefill); @@ -979,15 +1109,19 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ max_data_bytes = IMIN(1276, out_data_bytes); st->rangeFinal = 0; - if ((!st->variable_duration && 400*frame_size != st->Fs && 200*frame_size != st->Fs && 100*frame_size != st->Fs && - 50*frame_size != st->Fs && 25*frame_size != st->Fs && 50*frame_size != 3*st->Fs) - || (400*frame_size < st->Fs) - || max_data_bytes<=0 - ) + if (frame_size <= 0 || max_data_bytes <= 0) { RESTORE_STACK; return OPUS_BAD_ARG; } + + /* Cannot encode 100 ms in 1 byte */ + if (max_data_bytes==1 && st->Fs==(frame_size*10)) + { + RESTORE_STACK; + return OPUS_BUFFER_TOO_SMALL; + } + silk_enc = (char*)st+st->silk_enc_offset; celt_enc = (CELTEncoder*)((char*)st+st->celt_enc_offset); if (st->application == OPUS_APPLICATION_RESTRICTED_LOWDELAY) @@ -1001,31 +1135,55 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ #ifndef DISABLE_FLOAT_API analysis_info.valid = 0; #ifdef FIXED_POINT - if (st->silk_mode.complexity >= 10 && st->Fs==48000) + if (st->silk_mode.complexity >= 10 && st->Fs>=16000) #else - if (st->silk_mode.complexity >= 7 && st->Fs==48000) + if (st->silk_mode.complexity >= 7 && st->Fs>=16000) #endif { + is_silence = is_digital_silence(pcm, frame_size, st->channels, lsb_depth); analysis_read_pos_bak = st->analysis.read_pos; analysis_read_subframe_bak = st->analysis.read_subframe; run_analysis(&st->analysis, celt_mode, analysis_pcm, analysis_size, frame_size, c1, c2, analysis_channels, st->Fs, lsb_depth, downmix, &analysis_info); + + /* Track the peak signal energy */ + if (!is_silence && analysis_info.activity_probability > DTX_ACTIVITY_THRESHOLD) + st->peak_signal_energy = MAX32(MULT16_32_Q15(QCONST16(0.999f, 15), st->peak_signal_energy), + compute_frame_energy(pcm, frame_size, st->channels, st->arch)); + } else if (st->analysis.initialized) { + tonality_analysis_reset(&st->analysis); } #else (void)analysis_pcm; (void)analysis_size; + (void)c1; + (void)c2; + (void)analysis_channels; + (void)downmix; #endif - st->voice_ratio = -1; - #ifndef DISABLE_FLOAT_API + /* Reset voice_ratio if this frame is not silent or if analysis is disabled. + * Otherwise, preserve voice_ratio from the last non-silent frame */ + if (!is_silence) + st->voice_ratio = -1; + st->detected_bandwidth = 0; if (analysis_info.valid) { int analysis_bandwidth; if (st->signal_type == OPUS_AUTO) - st->voice_ratio = (int)floor(.5+100*(1-analysis_info.music_prob)); + { + float prob; + if (st->prev_mode == 0) + prob = analysis_info.music_prob; + else if (st->prev_mode == MODE_CELT_ONLY) + prob = analysis_info.music_prob_max; + else + prob = analysis_info.music_prob_min; + st->voice_ratio = (int)floor(.5+100*(1-prob)); + } analysis_bandwidth = analysis_info.bandwidth; if (analysis_bandwidth<=12) @@ -1039,6 +1197,8 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ else st->detected_bandwidth = OPUS_BANDWIDTH_FULLBAND; } +#else + st->voice_ratio = -1; #endif if (st->channels==2 && st->force_channels!=1) @@ -1052,12 +1212,13 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ if (!st->use_vbr) { int cbrBytes; - /* Multiply by 3 to make sure the division is exact. */ - int frame_rate3 = 3*st->Fs/frame_size; + /* Multiply by 12 to make sure the division is exact. */ + int frame_rate12 = 12*st->Fs/frame_size; /* We need to make sure that "int" values always fit in 16 bits. */ - cbrBytes = IMIN( (3*st->bitrate_bps/8 + frame_rate3/2)/frame_rate3, max_data_bytes); - st->bitrate_bps = cbrBytes*(opus_int32)frame_rate3*8/3; - max_data_bytes = cbrBytes; + cbrBytes = IMIN( (12*st->bitrate_bps/8 + frame_rate12/2)/frame_rate12, max_data_bytes); + st->bitrate_bps = cbrBytes*(opus_int32)frame_rate12*8/12; + /* Make sure we provide at least one byte to avoid failing. */ + max_data_bytes = IMAX(1, cbrBytes); } if (max_data_bytes<3 || st->bitrate_bps < 3*frame_rate*8 || (frame_rate<50 && (max_data_bytes*frame_rate<300 || st->bitrate_bps < 2400))) @@ -1065,25 +1226,63 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ /*If the space is too low to do something useful, emit 'PLC' frames.*/ int tocmode = st->mode; int bw = st->bandwidth == 0 ? OPUS_BANDWIDTH_NARROWBAND : st->bandwidth; + int packet_code = 0; + int num_multiframes = 0; + if (tocmode==0) tocmode = MODE_SILK_ONLY; if (frame_rate>100) tocmode = MODE_CELT_ONLY; - if (frame_rate < 50) - tocmode = MODE_SILK_ONLY; + /* 40 ms -> 2 x 20 ms if in CELT_ONLY or HYBRID mode */ + if (frame_rate==25 && tocmode!=MODE_SILK_ONLY) + { + frame_rate = 50; + packet_code = 1; + } + + /* >= 60 ms frames */ + if (frame_rate<=16) + { + /* 1 x 60 ms, 2 x 40 ms, 2 x 60 ms */ + if (out_data_bytes==1 || (tocmode==MODE_SILK_ONLY && frame_rate!=10)) + { + tocmode = MODE_SILK_ONLY; + + packet_code = frame_rate <= 12; + frame_rate = frame_rate == 12 ? 25 : 16; + } + else + { + num_multiframes = 50/frame_rate; + frame_rate = 50; + packet_code = 3; + } + } + if(tocmode==MODE_SILK_ONLY&&bw>OPUS_BANDWIDTH_WIDEBAND) bw=OPUS_BANDWIDTH_WIDEBAND; else if (tocmode==MODE_CELT_ONLY&&bw==OPUS_BANDWIDTH_MEDIUMBAND) bw=OPUS_BANDWIDTH_NARROWBAND; else if (tocmode==MODE_HYBRID&&bw<=OPUS_BANDWIDTH_SUPERWIDEBAND) bw=OPUS_BANDWIDTH_SUPERWIDEBAND; + data[0] = gen_toc(tocmode, frame_rate, bw, st->stream_channels); - ret = 1; + data[0] |= packet_code; + + ret = packet_code <= 1 ? 1 : 2; + + max_data_bytes = IMAX(max_data_bytes, ret); + + if (packet_code==3) + data[1] = num_multiframes; + if (!st->use_vbr) { ret = opus_packet_pad(data, ret, max_data_bytes); if (ret == OPUS_OK) ret = max_data_bytes; + else + ret = OPUS_INTERNAL_ERROR; } RESTORE_STACK; return ret; @@ -1091,7 +1290,8 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ max_rate = frame_rate*max_data_bytes*8; /* Equivalent 20-ms rate for mode/channel/bandwidth decisions */ - equiv_rate = st->bitrate_bps - (40*st->channels+20)*(st->Fs/frame_size - 50); + equiv_rate = compute_equiv_rate(st->bitrate_bps, st->channels, st->Fs/frame_size, + st->use_vbr, 0, st->silk_mode.complexity, st->silk_mode.packetLossPercentage); if (st->signal_type == OPUS_SIGNAL_VOICE) voice_est = 127; @@ -1132,7 +1332,17 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ } #endif } - equiv_rate = st->bitrate_bps - (40*st->stream_channels+20)*(st->Fs/frame_size - 50); + /* Update equivalent rate for channels decision. */ + equiv_rate = compute_equiv_rate(st->bitrate_bps, st->stream_channels, st->Fs/frame_size, + st->use_vbr, 0, st->silk_mode.complexity, st->silk_mode.packetLossPercentage); + + /* Allow SILK DTX if DTX is enabled but the generalized DTX cannot be used, + e.g. because of the complexity setting or sample rate. */ +#ifndef DISABLE_FLOAT_API + st->silk_mode.useDTX = st->use_dtx && !(analysis_info.valid || is_silence); +#else + st->silk_mode.useDTX = st->use_dtx; +#endif /* Mode selection depending on application and signal type */ if (st->application == OPUS_APPLICATION_RESTRICTED_LOWDELAY) @@ -1181,10 +1391,15 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ /* When FEC is enabled and there's enough packet loss, use SILK */ if (st->silk_mode.useInBandFEC && st->silk_mode.packetLossPercentage > (128-voice_est)>>4) st->mode = MODE_SILK_ONLY; - /* When encoding voice and DTX is enabled, set the encoder to SILK mode (at least for now) */ + /* When encoding voice and DTX is enabled but the generalized DTX cannot be used, + use SILK in order to make use of its DTX. */ if (st->silk_mode.useDTX && voice_est > 100) st->mode = MODE_SILK_ONLY; #endif + + /* If max_data_bytes represents less than 6 kb/s, switch to CELT-only mode */ + if (max_data_bytes < (frame_rate > 50 ? 9000 : 6000)*frame_size / (st->Fs * 8)) + st->mode = MODE_CELT_ONLY; } else { st->mode = st->user_forced_mode; } @@ -1194,19 +1409,6 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ st->mode = MODE_CELT_ONLY; if (st->lfe) st->mode = MODE_CELT_ONLY; - /* If max_data_bytes represents less than 8 kb/s, switch to CELT-only mode */ - if (max_data_bytes < (frame_rate > 50 ? 12000 : 8000)*frame_size / (st->Fs * 8)) - st->mode = MODE_CELT_ONLY; - - if (st->stream_channels == 1 && st->prev_channels ==2 && st->silk_mode.toMono==0 - && st->mode != MODE_CELT_ONLY && st->prev_mode != MODE_CELT_ONLY) - { - /* Delay stereo->mono transition by two frames so that SILK can do a smooth downmix */ - st->silk_mode.toMono = 1; - st->stream_channels = 2; - } else { - st->silk_mode.toMono = 0; - } if (st->prev_mode > 0 && ((st->mode != MODE_CELT_ONLY && st->prev_mode == MODE_CELT_ONLY) || @@ -1226,24 +1428,23 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ } } } - /* For the first frame at a new SILK bandwidth */ - if (st->silk_bw_switch) - { - redundancy = 1; - celt_to_silk = 1; - st->silk_bw_switch = 0; - prefill=1; - } - if (redundancy) + /* When encoding multiframes, we can ask for a switch to CELT only in the last frame. This switch + * is processed above as the requested mode shouldn't interrupt stereo->mono transition. */ + if (st->stream_channels == 1 && st->prev_channels ==2 && st->silk_mode.toMono==0 + && st->mode != MODE_CELT_ONLY && st->prev_mode != MODE_CELT_ONLY) { - /* Fair share of the max size allowed */ - redundancy_bytes = IMIN(257, max_data_bytes*(opus_int32)(st->Fs/200)/(frame_size+st->Fs/200)); - /* For VBR, target the actual bitrate (subject to the limit above) */ - if (st->use_vbr) - redundancy_bytes = IMIN(redundancy_bytes, st->bitrate_bps/1600); + /* Delay stereo->mono transition by two frames so that SILK can do a smooth downmix */ + st->silk_mode.toMono = 1; + st->stream_channels = 2; + } else { + st->silk_mode.toMono = 0; } + /* Update equivalent rate with mode decision. */ + equiv_rate = compute_equiv_rate(st->bitrate_bps, st->stream_channels, st->Fs/frame_size, + st->use_vbr, st->mode, st->silk_mode.complexity, st->silk_mode.packetLossPercentage); + if (st->mode != MODE_CELT_ONLY && st->prev_mode == MODE_CELT_ONLY) { silk_EncControlStruct dummy; @@ -1257,17 +1458,7 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ const opus_int32 *voice_bandwidth_thresholds, *music_bandwidth_thresholds; opus_int32 bandwidth_thresholds[8]; int bandwidth = OPUS_BANDWIDTH_FULLBAND; - opus_int32 equiv_rate2; - equiv_rate2 = equiv_rate; - if (st->mode != MODE_CELT_ONLY) - { - /* Adjust the threshold +/- 10% depending on complexity */ - equiv_rate2 = equiv_rate2 * (45+st->silk_mode.complexity)/50; - /* CBR is less efficient by ~1 kb/s */ - if (!st->use_vbr) - equiv_rate2 -= 1000; - } if (st->channels==2 && st->force_channels!=1) { voice_bandwidth_thresholds = stereo_voice_bandwidth_thresholds; @@ -1288,15 +1479,19 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ hysteresis = bandwidth_thresholds[2*(bandwidth-OPUS_BANDWIDTH_MEDIUMBAND)+1]; if (!st->first) { - if (st->bandwidth >= bandwidth) + if (st->auto_bandwidth >= bandwidth) threshold -= hysteresis; else threshold += hysteresis; } - if (equiv_rate2 >= threshold) + if (equiv_rate >= threshold) break; } while (--bandwidth>OPUS_BANDWIDTH_NARROWBAND); - st->bandwidth = bandwidth; + /* We don't use mediumband anymore, except when explicitly requested or during + mode transitions. */ + if (bandwidth == OPUS_BANDWIDTH_MEDIUMBAND) + bandwidth = OPUS_BANDWIDTH_WIDEBAND; + st->bandwidth = st->auto_bandwidth = bandwidth; /* Prevents any transition to SWB/FB until the SILK layer has fully switched to WB mode and turned the variable LP filter off */ if (!st->first && st->mode != MODE_CELT_ONLY && !st->silk_mode.inWBmodeWithoutVariableLP && st->bandwidth > OPUS_BANDWIDTH_WIDEBAND) @@ -1349,6 +1544,8 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ st->bandwidth = IMIN(st->bandwidth, st->detected_bandwidth); } #endif + st->silk_mode.LBRR_coded = decide_fec(st->silk_mode.useInBandFEC, st->silk_mode.packetLossPercentage, + st->silk_mode.LBRR_coded, st->mode, &st->bandwidth, equiv_rate); celt_encoder_ctl(celt_enc, OPUS_SET_LSB_DEPTH(lsb_depth)); /* CELT mode doesn't support mediumband, use wideband instead */ @@ -1357,15 +1554,34 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ if (st->lfe) st->bandwidth = OPUS_BANDWIDTH_NARROWBAND; - /* Can't support higher than wideband for >20 ms frames */ - if (frame_size > st->Fs/50 && (st->mode == MODE_CELT_ONLY || st->bandwidth > OPUS_BANDWIDTH_WIDEBAND)) + curr_bandwidth = st->bandwidth; + + /* Chooses the appropriate mode for speech + *NEVER* switch to/from CELT-only mode here as this will invalidate some assumptions */ + if (st->mode == MODE_SILK_ONLY && curr_bandwidth > OPUS_BANDWIDTH_WIDEBAND) + st->mode = MODE_HYBRID; + if (st->mode == MODE_HYBRID && curr_bandwidth <= OPUS_BANDWIDTH_WIDEBAND) + st->mode = MODE_SILK_ONLY; + + /* Can't support higher than >60 ms frames, and >20 ms when in Hybrid or CELT-only modes */ + if ((frame_size > st->Fs/50 && (st->mode != MODE_SILK_ONLY)) || frame_size > 3*st->Fs/50) { - VARDECL(unsigned char, tmp_data); + int enc_frame_size; int nb_frames; - int bak_mode, bak_bandwidth, bak_channels, bak_to_mono; - VARDECL(OpusRepacketizer, rp); - opus_int32 bytes_per_frame; - opus_int32 repacketize_len; + + if (st->mode == MODE_SILK_ONLY) + { + if (frame_size == 2*st->Fs/25) /* 80 ms -> 2x 40 ms */ + enc_frame_size = st->Fs/25; + else if (frame_size == 3*st->Fs/25) /* 120 ms -> 2x 60 ms */ + enc_frame_size = 3*st->Fs/50; + else /* 100 ms -> 5x 20 ms */ + enc_frame_size = st->Fs/50; + } + else + enc_frame_size = st->Fs/50; + + nb_frames = frame_size/enc_frame_size; #ifndef DISABLE_FLOAT_API if (analysis_read_pos_bak!= -1) @@ -1375,74 +1591,34 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ } #endif - nb_frames = frame_size > st->Fs/25 ? 3 : 2; - bytes_per_frame = IMIN(1276,(out_data_bytes-3)/nb_frames); - - ALLOC(tmp_data, nb_frames*bytes_per_frame, unsigned char); - - ALLOC(rp, 1, OpusRepacketizer); - opus_repacketizer_init(rp); - - bak_mode = st->user_forced_mode; - bak_bandwidth = st->user_bandwidth; - bak_channels = st->force_channels; + ret = encode_multiframe_packet(st, pcm, nb_frames, enc_frame_size, data, + out_data_bytes, to_celt, lsb_depth, float_api); - st->user_forced_mode = st->mode; - st->user_bandwidth = st->bandwidth; - st->force_channels = st->stream_channels; - bak_to_mono = st->silk_mode.toMono; - - if (bak_to_mono) - st->force_channels = 1; - else - st->prev_channels = st->stream_channels; - for (i=0;i<nb_frames;i++) - { - int tmp_len; - st->silk_mode.toMono = 0; - /* When switching from SILK/Hybrid to CELT, only ask for a switch at the last frame */ - if (to_celt && i==nb_frames-1) - st->user_forced_mode = MODE_CELT_ONLY; - tmp_len = opus_encode_native(st, pcm+i*(st->channels*st->Fs/50), st->Fs/50, - tmp_data+i*bytes_per_frame, bytes_per_frame, lsb_depth, - NULL, 0, c1, c2, analysis_channels, downmix, float_api); - if (tmp_len<0) - { - RESTORE_STACK; - return OPUS_INTERNAL_ERROR; - } - ret = opus_repacketizer_cat(rp, tmp_data+i*bytes_per_frame, tmp_len); - if (ret<0) - { - RESTORE_STACK; - return OPUS_INTERNAL_ERROR; - } - } - if (st->use_vbr) - repacketize_len = out_data_bytes; - else - repacketize_len = IMIN(3*st->bitrate_bps/(3*8*50/nb_frames), out_data_bytes); - ret = opus_repacketizer_out_range_impl(rp, 0, nb_frames, data, repacketize_len, 0, !st->use_vbr); - if (ret<0) - { - RESTORE_STACK; - return OPUS_INTERNAL_ERROR; - } - st->user_forced_mode = bak_mode; - st->user_bandwidth = bak_bandwidth; - st->force_channels = bak_channels; - st->silk_mode.toMono = bak_to_mono; RESTORE_STACK; return ret; } - curr_bandwidth = st->bandwidth; - /* Chooses the appropriate mode for speech - *NEVER* switch to/from CELT-only mode here as this will invalidate some assumptions */ - if (st->mode == MODE_SILK_ONLY && curr_bandwidth > OPUS_BANDWIDTH_WIDEBAND) - st->mode = MODE_HYBRID; - if (st->mode == MODE_HYBRID && curr_bandwidth <= OPUS_BANDWIDTH_WIDEBAND) - st->mode = MODE_SILK_ONLY; + /* For the first frame at a new SILK bandwidth */ + if (st->silk_bw_switch) + { + redundancy = 1; + celt_to_silk = 1; + st->silk_bw_switch = 0; + /* Do a prefill without reseting the sampling rate control. */ + prefill=2; + } + + /* If we decided to go with CELT, make sure redundancy is off, no matter what + we decided earlier. */ + if (st->mode == MODE_CELT_ONLY) + redundancy = 0; + + if (redundancy) + { + redundancy_bytes = compute_redundancy_bytes(max_data_bytes, st->bitrate_bps, frame_rate, st->stream_channels); + if (redundancy_bytes == 0) + redundancy = 0; + } /* printf("%d %d %d %d\n", st->bitrate_bps, st->stream_channels, st->mode, curr_bandwidth); */ bytes_target = IMIN(max_data_bytes-redundancy_bytes, st->bitrate_bps * frame_size / (st->Fs * 8)) - 1; @@ -1467,7 +1643,7 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ if (st->application == OPUS_APPLICATION_VOIP) { - hp_cutoff(pcm, cutoff_Hz, &pcm_buf[total_buffer*st->channels], st->hp_mem, frame_size, st->channels, st->Fs); + hp_cutoff(pcm, cutoff_Hz, &pcm_buf[total_buffer*st->channels], st->hp_mem, frame_size, st->channels, st->Fs, st->arch); } else { dc_reject(pcm, 3, &pcm_buf[total_buffer*st->channels], st->hp_mem, frame_size, st->channels, st->Fs); } @@ -1492,6 +1668,7 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ if (st->mode != MODE_CELT_ONLY) { opus_int32 total_bitRate, celt_rate; + opus_int activity; #ifdef FIXED_POINT const opus_int16 *pcm_silk; #else @@ -1499,30 +1676,26 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ ALLOC(pcm_silk, st->channels*frame_size, opus_int16); #endif + activity = VAD_NO_DECISION; +#ifndef DISABLE_FLOAT_API + if( analysis_info.valid ) { + /* Inform SILK about the Opus VAD decision */ + activity = ( analysis_info.activity_probability >= DTX_ACTIVITY_THRESHOLD ); + } +#endif + /* Distribute bits between SILK and CELT */ total_bitRate = 8 * bytes_target * frame_rate; if( st->mode == MODE_HYBRID ) { - int HB_gain_ref; /* Base rate for SILK */ - st->silk_mode.bitRate = st->stream_channels * ( 5000 + 1000 * ( st->Fs == 100 * frame_size ) ); - if( curr_bandwidth == OPUS_BANDWIDTH_SUPERWIDEBAND ) { - /* SILK gets 2/3 of the remaining bits */ - st->silk_mode.bitRate += ( total_bitRate - st->silk_mode.bitRate ) * 2 / 3; - } else { /* FULLBAND */ - /* SILK gets 3/5 of the remaining bits */ - st->silk_mode.bitRate += ( total_bitRate - st->silk_mode.bitRate ) * 3 / 5; - } - /* Don't let SILK use more than 80% */ - if( st->silk_mode.bitRate > total_bitRate * 4/5 ) { - st->silk_mode.bitRate = total_bitRate * 4/5; - } + st->silk_mode.bitRate = compute_silk_rate_for_hybrid(total_bitRate, + curr_bandwidth, st->Fs == 50 * frame_size, st->use_vbr, st->silk_mode.LBRR_coded, + st->stream_channels); if (!st->energy_masking) { /* Increasingly attenuate high band when it gets allocated fewer bits */ celt_rate = total_bitRate - st->silk_mode.bitRate; - HB_gain_ref = (curr_bandwidth == OPUS_BANDWIDTH_SUPERWIDEBAND) ? 3000 : 3600; - HB_gain = SHL32((opus_val32)celt_rate, 9) / SHR32((opus_val32)celt_rate + st->stream_channels * HB_gain_ref, 6); - HB_gain = HB_gain < (opus_val32)Q15ONE*6/7 ? HB_gain + Q15ONE/7 : Q15ONE; + HB_gain = Q15ONE - SHR32(celt_exp2(-celt_rate * QCONST16(1.f/1024, 10)), 1); } } else { /* SILK gets all bits */ @@ -1569,7 +1742,6 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ st->silk_mode.bitRate += 3*rate_offset/5; else st->silk_mode.bitRate += rate_offset; - bytes_target += rate_offset * frame_size / (8 * st->Fs); } st->silk_mode.payloadSize_ms = 1000 * frame_size / st->Fs; @@ -1580,7 +1752,7 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ } else if (curr_bandwidth == OPUS_BANDWIDTH_MEDIUMBAND) { st->silk_mode.desiredInternalSampleRate = 12000; } else { - silk_assert( st->mode == MODE_HYBRID || curr_bandwidth == OPUS_BANDWIDTH_WIDEBAND ); + celt_assert( st->mode == MODE_HYBRID || curr_bandwidth == OPUS_BANDWIDTH_WIDEBAND ); st->silk_mode.desiredInternalSampleRate = 16000; } if( st->mode == MODE_HYBRID ) { @@ -1590,40 +1762,53 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ st->silk_mode.minInternalSampleRate = 8000; } + st->silk_mode.maxInternalSampleRate = 16000; if (st->mode == MODE_SILK_ONLY) { opus_int32 effective_max_rate = max_rate; - st->silk_mode.maxInternalSampleRate = 16000; if (frame_rate > 50) effective_max_rate = effective_max_rate*2/3; - if (effective_max_rate < 13000) + if (effective_max_rate < 8000) { st->silk_mode.maxInternalSampleRate = 12000; st->silk_mode.desiredInternalSampleRate = IMIN(12000, st->silk_mode.desiredInternalSampleRate); } - if (effective_max_rate < 9600) + if (effective_max_rate < 7000) { st->silk_mode.maxInternalSampleRate = 8000; st->silk_mode.desiredInternalSampleRate = IMIN(8000, st->silk_mode.desiredInternalSampleRate); } - } else { - st->silk_mode.maxInternalSampleRate = 16000; } st->silk_mode.useCBR = !st->use_vbr; /* Call SILK encoder for the low band */ - nBytes = IMIN(1275, max_data_bytes-1-redundancy_bytes); - st->silk_mode.maxBits = nBytes*8; - /* Only allow up to 90% of the bits for hybrid mode*/ - if (st->mode == MODE_HYBRID) - st->silk_mode.maxBits = (opus_int32)st->silk_mode.maxBits*9/10; + /* Max bits for SILK, counting ToC, redundancy bytes, and optionally redundancy. */ + st->silk_mode.maxBits = (max_data_bytes-1)*8; + if (redundancy && redundancy_bytes >= 2) + { + /* Counting 1 bit for redundancy position and 20 bits for flag+size (only for hybrid). */ + st->silk_mode.maxBits -= redundancy_bytes*8 + 1; + if (st->mode == MODE_HYBRID) + st->silk_mode.maxBits -= 20; + } if (st->silk_mode.useCBR) { - st->silk_mode.maxBits = (st->silk_mode.bitRate * frame_size / (st->Fs * 8))*8; - /* Reduce the initial target to make it easier to reach the CBR rate */ - st->silk_mode.bitRate = IMAX(1, st->silk_mode.bitRate-2000); + if (st->mode == MODE_HYBRID) + { + st->silk_mode.maxBits = IMIN(st->silk_mode.maxBits, st->silk_mode.bitRate * frame_size / st->Fs); + } + } else { + /* Constrained VBR. */ + if (st->mode == MODE_HYBRID) + { + /* Compute SILK bitrate corresponding to the max total bits available */ + opus_int32 maxBitRate = compute_silk_rate_for_hybrid(st->silk_mode.maxBits*st->Fs / frame_size, + curr_bandwidth, st->Fs == 50 * frame_size, st->use_vbr, st->silk_mode.LBRR_coded, + st->stream_channels); + st->silk_mode.maxBits = maxBitRate * frame_size / st->Fs; + } } if (prefill) @@ -1646,7 +1831,9 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ for (i=0;i<st->encoder_buffer*st->channels;i++) pcm_silk[i] = FLOAT2INT16(st->delay_buffer[i]); #endif - silk_Encode( silk_enc, &st->silk_mode, pcm_silk, st->encoder_buffer, NULL, &zero, 1 ); + silk_Encode( silk_enc, &st->silk_mode, pcm_silk, st->encoder_buffer, NULL, &zero, prefill, activity ); + /* Prevent a second switch in the real encode call. */ + st->silk_mode.opusCanSwitch = 0; } #ifdef FIXED_POINT @@ -1655,20 +1842,14 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ for (i=0;i<frame_size*st->channels;i++) pcm_silk[i] = FLOAT2INT16(pcm_buf[total_buffer*st->channels + i]); #endif - ret = silk_Encode( silk_enc, &st->silk_mode, pcm_silk, frame_size, &enc, &nBytes, 0 ); + ret = silk_Encode( silk_enc, &st->silk_mode, pcm_silk, frame_size, &enc, &nBytes, 0, activity ); if( ret ) { /*fprintf (stderr, "SILK encode error: %d\n", ret);*/ /* Handle error */ RESTORE_STACK; return OPUS_INTERNAL_ERROR; } - if (nBytes==0) - { - st->rangeFinal = 0; - data[-1] = gen_toc(st->mode, st->Fs/frame_size, curr_bandwidth, st->stream_channels); - RESTORE_STACK; - return 1; - } + /* Extract SILK internal bandwidth for signaling in first byte */ if( st->mode == MODE_SILK_ONLY ) { if( st->silk_mode.internalSampleRate == 8000 ) { @@ -1679,14 +1860,24 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ curr_bandwidth = OPUS_BANDWIDTH_WIDEBAND; } } else { - silk_assert( st->silk_mode.internalSampleRate == 16000 ); + celt_assert( st->silk_mode.internalSampleRate == 16000 ); + } + + st->silk_mode.opusCanSwitch = st->silk_mode.switchReady && !st->nonfinal_frame; + + if (nBytes==0) + { + st->rangeFinal = 0; + data[-1] = gen_toc(st->mode, st->Fs/frame_size, curr_bandwidth, st->stream_channels); + RESTORE_STACK; + return 1; } - st->silk_mode.opusCanSwitch = st->silk_mode.switchReady; /* FIXME: How do we allocate the redundancy for CBR? */ if (st->silk_mode.opusCanSwitch) { - redundancy = 1; + redundancy_bytes = compute_redundancy_bytes(max_data_bytes, st->bitrate_bps, frame_rate, st->stream_channels); + redundancy = (redundancy_bytes != 0); celt_to_silk = 0; st->silk_bw_switch = 1; } @@ -1727,40 +1918,18 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ if (st->mode == MODE_HYBRID) { - int len; - - len = (ec_tell(&enc)+7)>>3; - if (redundancy) - len += st->mode == MODE_HYBRID ? 3 : 1; if( st->use_vbr ) { - nb_compr_bytes = len + bytes_target - (st->silk_mode.bitRate * frame_size) / (8 * st->Fs); - } else { - /* check if SILK used up too much */ - nb_compr_bytes = len > bytes_target ? len : bytes_target; + celt_encoder_ctl(celt_enc, OPUS_SET_BITRATE(st->bitrate_bps-st->silk_mode.bitRate)); + celt_encoder_ctl(celt_enc, OPUS_SET_VBR_CONSTRAINT(0)); } } else { if (st->use_vbr) { - opus_int32 bonus=0; -#ifndef DISABLE_FLOAT_API - if (st->variable_duration==OPUS_FRAMESIZE_VARIABLE && frame_size != st->Fs/50) - { - bonus = (60*st->stream_channels+40)*(st->Fs/frame_size-50); - if (analysis_info.valid) - bonus = (opus_int32)(bonus*(1.f+.5f*analysis_info.tonality)); - } -#endif celt_encoder_ctl(celt_enc, OPUS_SET_VBR(1)); celt_encoder_ctl(celt_enc, OPUS_SET_VBR_CONSTRAINT(st->vbr_constraint)); - celt_encoder_ctl(celt_enc, OPUS_SET_BITRATE(st->bitrate_bps+bonus)); - nb_compr_bytes = max_data_bytes-1-redundancy_bytes; - } else { - nb_compr_bytes = bytes_target; + celt_encoder_ctl(celt_enc, OPUS_SET_BITRATE(st->bitrate_bps)); } } - - } else { - nb_compr_bytes = 0; } ALLOC(tmp_prefill, st->channels*st->Fs/400, opus_val16); @@ -1786,7 +1955,14 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ } st->prev_HB_gain = HB_gain; if (st->mode != MODE_HYBRID || st->stream_channels==1) - st->silk_mode.stereoWidth_Q14 = IMIN((1<<14),2*IMAX(0,equiv_rate-30000)); + { + if (equiv_rate > 32000) + st->silk_mode.stereoWidth_Q14 = 16384; + else if (equiv_rate < 16000) + st->silk_mode.stereoWidth_Q14 = 0; + else + st->silk_mode.stereoWidth_Q14 = 16384 - 2048*(opus_int32)(32000-equiv_rate)/(equiv_rate-14000); + } if( !st->energy_masking && st->channels == 2 ) { /* Apply stereo width reduction (at low bitrates) */ if( st->hybrid_stereo_width_Q14 < (1 << 14) || st->silk_mode.stereoWidth_Q14 < (1 << 14) ) { @@ -1809,19 +1985,23 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ if ( st->mode != MODE_CELT_ONLY && ec_tell(&enc)+17+20*(st->mode == MODE_HYBRID) <= 8*(max_data_bytes-1)) { /* For SILK mode, the redundancy is inferred from the length */ - if (st->mode == MODE_HYBRID && (redundancy || ec_tell(&enc)+37 <= 8*nb_compr_bytes)) + if (st->mode == MODE_HYBRID) ec_enc_bit_logp(&enc, redundancy, 12); if (redundancy) { int max_redundancy; ec_enc_bit_logp(&enc, celt_to_silk, 1); if (st->mode == MODE_HYBRID) - max_redundancy = (max_data_bytes-1)-nb_compr_bytes; + { + /* Reserve the 8 bits needed for the redundancy length, + and at least a few bits for CELT if possible */ + max_redundancy = (max_data_bytes-1)-((ec_tell(&enc)+8+3+7)>>3); + } else max_redundancy = (max_data_bytes-1)-((ec_tell(&enc)+7)>>3); /* Target the same bit-rate for redundancy as for the rest, up to a max of 257 bytes */ - redundancy_bytes = IMIN(max_redundancy, st->bitrate_bps/1600); + redundancy_bytes = IMIN(max_redundancy, redundancy_bytes); redundancy_bytes = IMIN(257, IMAX(2, redundancy_bytes)); if (st->mode == MODE_HYBRID) ec_enc_uint(&enc, redundancy_bytes-2, 256); @@ -1843,7 +2023,7 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ ec_enc_done(&enc); nb_compr_bytes = ret; } else { - nb_compr_bytes = IMIN((max_data_bytes-1)-redundancy_bytes, nb_compr_bytes); + nb_compr_bytes = (max_data_bytes-1)-redundancy_bytes; ec_enc_shrink(&enc, nb_compr_bytes); } @@ -1851,6 +2031,12 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ if (redundancy || st->mode != MODE_SILK_ONLY) celt_encoder_ctl(celt_enc, CELT_SET_ANALYSIS(&analysis_info)); #endif + if (st->mode == MODE_HYBRID) { + SILKInfo info; + info.signalType = st->silk_mode.signalType; + info.offset = st->silk_mode.offset; + celt_encoder_ctl(celt_enc, CELT_SET_SILK_INFO(&info)); + } /* 5 ms redundant frame for CELT->SILK */ if (redundancy && celt_to_silk) @@ -1858,6 +2044,7 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ int err; celt_encoder_ctl(celt_enc, CELT_SET_START_BAND(0)); celt_encoder_ctl(celt_enc, OPUS_SET_VBR(0)); + celt_encoder_ctl(celt_enc, OPUS_SET_BITRATE(OPUS_BITRATE_MAX)); err = celt_encode_with_ec(celt_enc, pcm_buf, st->Fs/200, data+nb_compr_bytes, redundancy_bytes, NULL); if (err < 0) { @@ -1881,15 +2068,25 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ celt_encode_with_ec(celt_enc, tmp_prefill, st->Fs/400, dummy, 2, NULL); celt_encoder_ctl(celt_enc, CELT_SET_PREDICTION(0)); } - /* If false, we already busted the budget and we'll end up with a "PLC packet" */ + /* If false, we already busted the budget and we'll end up with a "PLC frame" */ if (ec_tell(&enc) <= 8*nb_compr_bytes) { + /* Set the bitrate again if it was overridden in the redundancy code above*/ + if (redundancy && celt_to_silk && st->mode==MODE_HYBRID && st->use_vbr) + celt_encoder_ctl(celt_enc, OPUS_SET_BITRATE(st->bitrate_bps-st->silk_mode.bitRate)); + celt_encoder_ctl(celt_enc, OPUS_SET_VBR(st->use_vbr)); ret = celt_encode_with_ec(celt_enc, pcm_buf, frame_size, NULL, nb_compr_bytes, &enc); if (ret < 0) { RESTORE_STACK; return OPUS_INTERNAL_ERROR; } + /* Put CELT->SILK redundancy data in the right place. */ + if (redundancy && celt_to_silk && st->mode==MODE_HYBRID && st->use_vbr) + { + OPUS_MOVE(data+ret, data+nb_compr_bytes, redundancy_bytes); + nb_compr_bytes = nb_compr_bytes+redundancy_bytes; + } } } @@ -1905,7 +2102,15 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ celt_encoder_ctl(celt_enc, OPUS_RESET_STATE); celt_encoder_ctl(celt_enc, CELT_SET_START_BAND(0)); celt_encoder_ctl(celt_enc, CELT_SET_PREDICTION(0)); + celt_encoder_ctl(celt_enc, OPUS_SET_VBR(0)); + celt_encoder_ctl(celt_enc, OPUS_SET_BITRATE(OPUS_BITRATE_MAX)); + if (st->mode == MODE_HYBRID) + { + /* Shrink packet to what the encoder actually used. */ + nb_compr_bytes = ret; + ec_enc_shrink(&enc, nb_compr_bytes); + } /* NOTE: We could speed this up slightly (at the expense of code size) by just adding a function that prefills the buffer */ celt_encode_with_ec(celt_enc, pcm_buf+st->channels*(frame_size-N2-N4), N4, dummy, 2, NULL); @@ -1935,6 +2140,23 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ st->first = 0; + /* DTX decision */ +#ifndef DISABLE_FLOAT_API + if (st->use_dtx && (analysis_info.valid || is_silence)) + { + if (decide_dtx_mode(analysis_info.activity_probability, &st->nb_no_activity_frames, + st->peak_signal_energy, pcm, frame_size, st->channels, is_silence, st->arch)) + { + st->rangeFinal = 0; + data[0] = gen_toc(st->mode, st->Fs/frame_size, curr_bandwidth, st->stream_channels); + RESTORE_STACK; + return 1; + } + } else { + st->nb_no_activity_frames = 0; + } +#endif + /* In the unlikely case that the SILK encoder busted its target, tell the decoder to call the PLC */ if (ec_tell(&enc) > (max_data_bytes-1)*8) @@ -1962,7 +2184,6 @@ opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_ if (!st->use_vbr) { if (opus_packet_pad(data, ret, max_data_bytes) != OPUS_OK) - { RESTORE_STACK; return OPUS_INTERNAL_ERROR; @@ -1981,18 +2202,15 @@ opus_int32 opus_encode_float(OpusEncoder *st, const float *pcm, int analysis_fra { int i, ret; int frame_size; - int delay_compensation; VARDECL(opus_int16, in); ALLOC_STACK; - if (st->application == OPUS_APPLICATION_RESTRICTED_LOWDELAY) - delay_compensation = 0; - else - delay_compensation = st->delay_compensation; - frame_size = compute_frame_size(pcm, analysis_frame_size, - st->variable_duration, st->channels, st->Fs, st->bitrate_bps, - delay_compensation, downmix_float, st->analysis.subframe_mem); - + frame_size = frame_size_select(analysis_frame_size, st->variable_duration, st->Fs); + if (frame_size <= 0) + { + RESTORE_STACK; + return OPUS_BAD_ARG; + } ALLOC(in, frame_size*st->channels, opus_int16); for (i=0;i<frame_size*st->channels;i++) @@ -2008,18 +2226,7 @@ opus_int32 opus_encode(OpusEncoder *st, const opus_int16 *pcm, int analysis_fram unsigned char *data, opus_int32 out_data_bytes) { int frame_size; - int delay_compensation; - if (st->application == OPUS_APPLICATION_RESTRICTED_LOWDELAY) - delay_compensation = 0; - else - delay_compensation = st->delay_compensation; - frame_size = compute_frame_size(pcm, analysis_frame_size, - st->variable_duration, st->channels, st->Fs, st->bitrate_bps, - delay_compensation, downmix_int -#ifndef DISABLE_FLOAT_API - , st->analysis.subframe_mem -#endif - ); + frame_size = frame_size_select(analysis_frame_size, st->variable_duration, st->Fs); return opus_encode_native(st, pcm, frame_size, data, out_data_bytes, 16, pcm, analysis_frame_size, 0, -2, st->channels, downmix_int, 0); } @@ -2030,18 +2237,15 @@ opus_int32 opus_encode(OpusEncoder *st, const opus_int16 *pcm, int analysis_fram { int i, ret; int frame_size; - int delay_compensation; VARDECL(float, in); ALLOC_STACK; - if (st->application == OPUS_APPLICATION_RESTRICTED_LOWDELAY) - delay_compensation = 0; - else - delay_compensation = st->delay_compensation; - frame_size = compute_frame_size(pcm, analysis_frame_size, - st->variable_duration, st->channels, st->Fs, st->bitrate_bps, - delay_compensation, downmix_int, st->analysis.subframe_mem); - + frame_size = frame_size_select(analysis_frame_size, st->variable_duration, st->Fs); + if (frame_size <= 0) + { + RESTORE_STACK; + return OPUS_BAD_ARG; + } ALLOC(in, frame_size*st->channels, float); for (i=0;i<frame_size*st->channels;i++) @@ -2055,14 +2259,7 @@ opus_int32 opus_encode_float(OpusEncoder *st, const float *pcm, int analysis_fra unsigned char *data, opus_int32 out_data_bytes) { int frame_size; - int delay_compensation; - if (st->application == OPUS_APPLICATION_RESTRICTED_LOWDELAY) - delay_compensation = 0; - else - delay_compensation = st->delay_compensation; - frame_size = compute_frame_size(pcm, analysis_frame_size, - st->variable_duration, st->channels, st->Fs, st->bitrate_bps, - delay_compensation, downmix_float, st->analysis.subframe_mem); + frame_size = frame_size_select(analysis_frame_size, st->variable_duration, st->Fs); return opus_encode_native(st, pcm, frame_size, data, out_data_bytes, 24, pcm, analysis_frame_size, 0, -2, st->channels, downmix_float, 1); } @@ -2093,6 +2290,9 @@ int opus_encoder_ctl(OpusEncoder *st, int request, ...) break; } st->application = value; +#ifndef DISABLE_FLOAT_API + st->analysis.application = value; +#endif } break; case OPUS_GET_APPLICATION_REQUEST: @@ -2211,7 +2411,7 @@ int opus_encoder_ctl(OpusEncoder *st, int request, ...) { goto bad_arg; } - st->silk_mode.useDTX = value; + st->use_dtx = value; } break; case OPUS_GET_DTX_REQUEST: @@ -2221,7 +2421,7 @@ int opus_encoder_ctl(OpusEncoder *st, int request, ...) { goto bad_arg; } - *value = st->silk_mode.useDTX; + *value = st->use_dtx; } break; case OPUS_SET_COMPLEXITY_REQUEST: @@ -2422,15 +2622,15 @@ int opus_encoder_ctl(OpusEncoder *st, int request, ...) case OPUS_SET_EXPERT_FRAME_DURATION_REQUEST: { opus_int32 value = va_arg(ap, opus_int32); - if (value != OPUS_FRAMESIZE_ARG && value != OPUS_FRAMESIZE_2_5_MS && - value != OPUS_FRAMESIZE_5_MS && value != OPUS_FRAMESIZE_10_MS && - value != OPUS_FRAMESIZE_20_MS && value != OPUS_FRAMESIZE_40_MS && - value != OPUS_FRAMESIZE_60_MS && value != OPUS_FRAMESIZE_VARIABLE) + if (value != OPUS_FRAMESIZE_ARG && value != OPUS_FRAMESIZE_2_5_MS && + value != OPUS_FRAMESIZE_5_MS && value != OPUS_FRAMESIZE_10_MS && + value != OPUS_FRAMESIZE_20_MS && value != OPUS_FRAMESIZE_40_MS && + value != OPUS_FRAMESIZE_60_MS && value != OPUS_FRAMESIZE_80_MS && + value != OPUS_FRAMESIZE_100_MS && value != OPUS_FRAMESIZE_120_MS) { goto bad_arg; } st->variable_duration = value; - celt_encoder_ctl(celt_enc, OPUS_SET_EXPERT_FRAME_DURATION(value)); } break; case OPUS_GET_EXPERT_FRAME_DURATION_REQUEST: @@ -2459,6 +2659,26 @@ int opus_encoder_ctl(OpusEncoder *st, int request, ...) *value = st->silk_mode.reducedDependency; } break; + case OPUS_SET_PHASE_INVERSION_DISABLED_REQUEST: + { + opus_int32 value = va_arg(ap, opus_int32); + if(value<0 || value>1) + { + goto bad_arg; + } + celt_encoder_ctl(celt_enc, OPUS_SET_PHASE_INVERSION_DISABLED(value)); + } + break; + case OPUS_GET_PHASE_INVERSION_DISABLED_REQUEST: + { + opus_int32 *value = va_arg(ap, opus_int32*); + if (!value) + { + goto bad_arg; + } + celt_encoder_ctl(celt_enc, OPUS_GET_PHASE_INVERSION_DISABLED(value)); + } + break; case OPUS_RESET_STATE: { void *silk_enc; @@ -2507,6 +2727,33 @@ int opus_encoder_ctl(OpusEncoder *st, int request, ...) ret = celt_encoder_ctl(celt_enc, OPUS_SET_ENERGY_MASK(value)); } break; + case OPUS_GET_IN_DTX_REQUEST: + { + opus_int32 *value = va_arg(ap, opus_int32*); + if (!value) + { + goto bad_arg; + } + if (st->silk_mode.useDTX && (st->prev_mode == MODE_SILK_ONLY || st->prev_mode == MODE_HYBRID)) { + /* DTX determined by Silk. */ + int n; + void *silk_enc = (char*)st+st->silk_enc_offset; + *value = 1; + for (n=0;n<st->silk_mode.nChannelsInternal;n++) { + *value = *value && ((silk_encoder*)silk_enc)->state_Fxx[n].sCmn.noSpeechCounter >= NB_SPEECH_FRAMES_BEFORE_DTX; + } + } +#ifndef DISABLE_FLOAT_API + else if (st->use_dtx) { + /* DTX determined by Opus. */ + *value = st->nb_no_activity_frames >= NB_SPEECH_FRAMES_BEFORE_DTX; + } +#endif + else { + *value = 0; + } + } + break; case CELT_GET_MODE_REQUEST: { diff --git a/thirdparty/opus/opus_multistream_decoder.c b/thirdparty/opus/opus_multistream_decoder.c index b95eaa6eac..0018517aeb 100644 --- a/thirdparty/opus/opus_multistream_decoder.c +++ b/thirdparty/opus/opus_multistream_decoder.c @@ -37,15 +37,18 @@ #include "float_cast.h" #include "os_support.h" -struct OpusMSDecoder { - ChannelLayout layout; - /* Decoder states go here */ -}; - - +/* DECODER */ +#if defined(ENABLE_HARDENING) || defined(ENABLE_ASSERTIONS) +static void validate_ms_decoder(OpusMSDecoder *st) +{ + validate_layout(&st->layout); +} +#define VALIDATE_MS_DECODER(st) validate_ms_decoder(st) +#else +#define VALIDATE_MS_DECODER(st) +#endif -/* DECODER */ opus_int32 opus_multistream_decoder_get_size(int nb_streams, int nb_coupled_streams) { @@ -143,15 +146,6 @@ OpusMSDecoder *opus_multistream_decoder_create( return st; } -typedef void (*opus_copy_channel_out_func)( - void *dst, - int dst_stride, - int dst_channel, - const opus_val16 *src, - int src_stride, - int frame_size -); - static int opus_multistream_packet_validate(const unsigned char *data, opus_int32 len, int nb_streams, opus_int32 Fs) { @@ -181,7 +175,7 @@ static int opus_multistream_packet_validate(const unsigned char *data, return samples; } -static int opus_multistream_decode_native( +int opus_multistream_decode_native( OpusMSDecoder *st, const unsigned char *data, opus_int32 len, @@ -189,7 +183,8 @@ static int opus_multistream_decode_native( opus_copy_channel_out_func copy_channel_out, int frame_size, int decode_fec, - int soft_clip + int soft_clip, + void *user_data ) { opus_int32 Fs; @@ -201,8 +196,14 @@ static int opus_multistream_decode_native( VARDECL(opus_val16, buf); ALLOC_STACK; + VALIDATE_MS_DECODER(st); + if (frame_size <= 0) + { + RESTORE_STACK; + return OPUS_BAD_ARG; + } /* Limit frame_size to avoid excessive stack allocations. */ - opus_multistream_decoder_ctl(st, OPUS_GET_SAMPLE_RATE(&Fs)); + MUST_SUCCEED(opus_multistream_decoder_ctl(st, OPUS_GET_SAMPLE_RATE(&Fs))); frame_size = IMIN(frame_size, Fs/25*3); ALLOC(buf, 2*frame_size, opus_val16); ptr = (char*)st + align(sizeof(OpusMSDecoder)); @@ -237,7 +238,8 @@ static int opus_multistream_decode_native( for (s=0;s<st->layout.nb_streams;s++) { OpusDecoder *dec; - int packet_offset, ret; + opus_int32 packet_offset; + int ret; dec = (OpusDecoder*)ptr; ptr += (s < st->layout.nb_coupled_streams) ? align(coupled_size) : align(mono_size); @@ -265,7 +267,7 @@ static int opus_multistream_decode_native( while ( (chan = get_left_channel(&st->layout, s, prev)) != -1) { (*copy_channel_out)(pcm, st->layout.nb_channels, chan, - buf, 2, frame_size); + buf, 2, frame_size, user_data); prev = chan; } prev = -1; @@ -273,7 +275,7 @@ static int opus_multistream_decode_native( while ( (chan = get_right_channel(&st->layout, s, prev)) != -1) { (*copy_channel_out)(pcm, st->layout.nb_channels, chan, - buf+1, 2, frame_size); + buf+1, 2, frame_size, user_data); prev = chan; } } else { @@ -283,7 +285,7 @@ static int opus_multistream_decode_native( while ( (chan = get_mono_channel(&st->layout, s, prev)) != -1) { (*copy_channel_out)(pcm, st->layout.nb_channels, chan, - buf, 1, frame_size); + buf, 1, frame_size, user_data); prev = chan; } } @@ -294,7 +296,7 @@ static int opus_multistream_decode_native( if (st->layout.mapping[c] == 255) { (*copy_channel_out)(pcm, st->layout.nb_channels, c, - NULL, 0, frame_size); + NULL, 0, frame_size, user_data); } } RESTORE_STACK; @@ -308,11 +310,13 @@ static void opus_copy_channel_out_float( int dst_channel, const opus_val16 *src, int src_stride, - int frame_size + int frame_size, + void *user_data ) { float *float_dst; opus_int32 i; + (void)user_data; float_dst = (float*)dst; if (src != NULL) { @@ -337,11 +341,13 @@ static void opus_copy_channel_out_short( int dst_channel, const opus_val16 *src, int src_stride, - int frame_size + int frame_size, + void *user_data ) { opus_int16 *short_dst; opus_int32 i; + (void)user_data; short_dst = (opus_int16*)dst; if (src != NULL) { @@ -372,7 +378,7 @@ int opus_multistream_decode( ) { return opus_multistream_decode_native(st, data, len, - pcm, opus_copy_channel_out_short, frame_size, decode_fec, 0); + pcm, opus_copy_channel_out_short, frame_size, decode_fec, 0, NULL); } #ifndef DISABLE_FLOAT_API @@ -380,7 +386,7 @@ int opus_multistream_decode_float(OpusMSDecoder *st, const unsigned char *data, opus_int32 len, float *pcm, int frame_size, int decode_fec) { return opus_multistream_decode_native(st, data, len, - pcm, opus_copy_channel_out_float, frame_size, decode_fec, 0); + pcm, opus_copy_channel_out_float, frame_size, decode_fec, 0, NULL); } #endif @@ -390,32 +396,30 @@ int opus_multistream_decode(OpusMSDecoder *st, const unsigned char *data, opus_int32 len, opus_int16 *pcm, int frame_size, int decode_fec) { return opus_multistream_decode_native(st, data, len, - pcm, opus_copy_channel_out_short, frame_size, decode_fec, 1); + pcm, opus_copy_channel_out_short, frame_size, decode_fec, 1, NULL); } int opus_multistream_decode_float( OpusMSDecoder *st, const unsigned char *data, opus_int32 len, - float *pcm, + opus_val16 *pcm, int frame_size, int decode_fec ) { return opus_multistream_decode_native(st, data, len, - pcm, opus_copy_channel_out_float, frame_size, decode_fec, 0); + pcm, opus_copy_channel_out_float, frame_size, decode_fec, 0, NULL); } #endif -int opus_multistream_decoder_ctl(OpusMSDecoder *st, int request, ...) +int opus_multistream_decoder_ctl_va_list(OpusMSDecoder *st, int request, + va_list ap) { - va_list ap; int coupled_size, mono_size; char *ptr; int ret = OPUS_OK; - va_start(ap, request); - coupled_size = opus_decoder_get_size(2); mono_size = opus_decoder_get_size(1); ptr = (char*)st + align(sizeof(OpusMSDecoder)); @@ -425,6 +429,7 @@ int opus_multistream_decoder_ctl(OpusMSDecoder *st, int request, ...) case OPUS_GET_SAMPLE_RATE_REQUEST: case OPUS_GET_GAIN_REQUEST: case OPUS_GET_LAST_PACKET_DURATION_REQUEST: + case OPUS_GET_PHASE_INVERSION_DISABLED_REQUEST: { OpusDecoder *dec; /* For int32* GET params, just query the first stream */ @@ -482,7 +487,7 @@ int opus_multistream_decoder_ctl(OpusMSDecoder *st, int request, ...) OpusDecoder **value; stream_id = va_arg(ap, opus_int32); if (stream_id<0 || stream_id >= st->layout.nb_streams) - ret = OPUS_BAD_ARG; + goto bad_arg; value = va_arg(ap, OpusDecoder**); if (!value) { @@ -499,6 +504,7 @@ int opus_multistream_decoder_ctl(OpusMSDecoder *st, int request, ...) } break; case OPUS_SET_GAIN_REQUEST: + case OPUS_SET_PHASE_INVERSION_DISABLED_REQUEST: { int s; /* This works for int32 params */ @@ -522,14 +528,20 @@ int opus_multistream_decoder_ctl(OpusMSDecoder *st, int request, ...) ret = OPUS_UNIMPLEMENTED; break; } - - va_end(ap); return ret; bad_arg: - va_end(ap); return OPUS_BAD_ARG; } +int opus_multistream_decoder_ctl(OpusMSDecoder *st, int request, ...) +{ + int ret; + va_list ap; + va_start(ap, request); + ret = opus_multistream_decoder_ctl_va_list(st, request, ap); + va_end(ap); + return ret; +} void opus_multistream_decoder_destroy(OpusMSDecoder *st) { diff --git a/thirdparty/opus/opus_multistream_encoder.c b/thirdparty/opus/opus_multistream_encoder.c index 1698223a16..93204a14c1 100644 --- a/thirdparty/opus/opus_multistream_encoder.c +++ b/thirdparty/opus/opus_multistream_encoder.c @@ -61,38 +61,6 @@ static const VorbisLayout vorbis_mappings[8] = { {5, 3, {0, 6, 1, 2, 3, 4, 5, 7}}, /* 8: 7.1 surround */ }; -typedef void (*opus_copy_channel_in_func)( - opus_val16 *dst, - int dst_stride, - const void *src, - int src_stride, - int src_channel, - int frame_size -); - -typedef enum { - MAPPING_TYPE_NONE, - MAPPING_TYPE_SURROUND -#ifdef ENABLE_EXPERIMENTAL_AMBISONICS - , /* Do not include comma at end of enumerator list */ - MAPPING_TYPE_AMBISONICS -#endif -} MappingType; - -struct OpusMSEncoder { - ChannelLayout layout; - int arch; - int lfe_stream; - int application; - int variable_duration; - MappingType mapping_type; - opus_int32 bitrate_bps; - float subframe_mem[3]; - /* Encoder states go here */ - /* then opus_val32 window_mem[channels*120]; */ - /* then opus_val32 preemph_mem[channels]; */ -}; - static opus_val32 *ms_get_preemph_mem(OpusMSEncoder *st) { int s; @@ -133,6 +101,29 @@ static opus_val32 *ms_get_window_mem(OpusMSEncoder *st) return (opus_val32*)(void*)ptr; } +static int validate_ambisonics(int nb_channels, int *nb_streams, int *nb_coupled_streams) +{ + int order_plus_one; + int acn_channels; + int nondiegetic_channels; + + if (nb_channels < 1 || nb_channels > 227) + return 0; + + order_plus_one = isqrt32(nb_channels); + acn_channels = order_plus_one * order_plus_one; + nondiegetic_channels = nb_channels - acn_channels; + + if (nondiegetic_channels != 0 && nondiegetic_channels != 2) + return 0; + + if (nb_streams) + *nb_streams = acn_channels + (nondiegetic_channels != 0); + if (nb_coupled_streams) + *nb_coupled_streams = nondiegetic_channels != 0; + return 1; +} + static int validate_encoder_layout(const ChannelLayout *layout) { int s; @@ -240,6 +231,7 @@ void surround_analysis(const CELTMode *celt_mode, const void *pcm, opus_val16 *b int pos[8] = {0}; int upsample; int frame_size; + int freq_size; opus_val16 channel_offset; opus_val32 bandE[21]; opus_val16 maskLogE[3][21]; @@ -250,6 +242,7 @@ void surround_analysis(const CELTMode *celt_mode, const void *pcm, opus_val16 *b upsample = resampling_factor(rate); frame_size = len*upsample; + freq_size = IMIN(960, frame_size); /* LM = log2(frame_size / 120) */ for (LM=0;LM<celt_mode->maxLM;LM++) @@ -258,7 +251,7 @@ void surround_analysis(const CELTMode *celt_mode, const void *pcm, opus_val16 *b ALLOC(in, frame_size+overlap, opus_val32); ALLOC(x, len, opus_val16); - ALLOC(freq, frame_size, opus_val32); + ALLOC(freq, freq_size, opus_val32); channel_pos(channels, pos); @@ -268,8 +261,11 @@ void surround_analysis(const CELTMode *celt_mode, const void *pcm, opus_val16 *b for (c=0;c<channels;c++) { + int frame; + int nb_frames = frame_size/freq_size; + celt_assert(nb_frames*freq_size == frame_size); OPUS_COPY(in, mem+c*overlap, overlap); - (*copy_channel_in)(x, 1, pcm, channels, c, len); + (*copy_channel_in)(x, 1, pcm, channels, c, len, NULL); celt_preemphasis(x, in+overlap, frame_size, 1, upsample, celt_mode->preemph, preemph_mem+c, 0); #ifndef FIXED_POINT { @@ -284,18 +280,26 @@ void surround_analysis(const CELTMode *celt_mode, const void *pcm, opus_val16 *b } } #endif - clt_mdct_forward(&celt_mode->mdct, in, freq, celt_mode->window, - overlap, celt_mode->maxLM-LM, 1, arch); - if (upsample != 1) + OPUS_CLEAR(bandE, 21); + for (frame=0;frame<nb_frames;frame++) { - int bound = len; - for (i=0;i<bound;i++) - freq[i] *= upsample; - for (;i<frame_size;i++) - freq[i] = 0; - } + opus_val32 tmpE[21]; + clt_mdct_forward(&celt_mode->mdct, in+960*frame, freq, celt_mode->window, + overlap, celt_mode->maxLM-LM, 1, arch); + if (upsample != 1) + { + int bound = freq_size/upsample; + for (i=0;i<bound;i++) + freq[i] *= upsample; + for (;i<freq_size;i++) + freq[i] = 0; + } - compute_band_energies(celt_mode, freq, bandE, 21, 1, LM); + compute_band_energies(celt_mode, freq, tmpE, 21, 1, LM, arch); + /* If we have multiple frames, take the max energy. */ + for (i=0;i<21;i++) + bandE[i] = MAX32(bandE[i], tmpE[i]); + } amp2Log2(celt_mode, 21, 21, bandE, bandLogE+21*c, 1); /* Apply spreading function with -6 dB/band going up and -12 dB/band going down. */ for (i=1;i<21;i++) @@ -408,12 +412,10 @@ opus_int32 opus_multistream_surround_encoder_get_size(int channels, int mapping_ { nb_streams=channels; nb_coupled_streams=0; -#ifdef ENABLE_EXPERIMENTAL_AMBISONICS - } else if (mapping_family==254) + } else if (mapping_family==2) { - nb_streams=channels; - nb_coupled_streams=0; -#endif + if (!validate_ambisonics(channels, &nb_streams, &nb_coupled_streams)) + return 0; } else return 0; size = opus_multistream_encoder_get_size(nb_streams, nb_coupled_streams); @@ -448,7 +450,6 @@ static int opus_multistream_encoder_init_impl( st->layout.nb_channels = channels; st->layout.nb_streams = streams; st->layout.nb_coupled_streams = coupled_streams; - st->subframe_mem[0]=st->subframe_mem[1]=st->subframe_mem[2]=0; if (mapping_type != MAPPING_TYPE_SURROUND) st->lfe_stream = -1; st->bitrate_bps = OPUS_AUTO; @@ -456,7 +457,13 @@ static int opus_multistream_encoder_init_impl( st->variable_duration = OPUS_FRAMESIZE_ARG; for (i=0;i<st->layout.nb_channels;i++) st->layout.mapping[i] = mapping[i]; - if (!validate_layout(&st->layout) || !validate_encoder_layout(&st->layout)) + if (!validate_layout(&st->layout)) + return OPUS_BAD_ARG; + if (mapping_type == MAPPING_TYPE_SURROUND && + !validate_encoder_layout(&st->layout)) + return OPUS_BAD_ARG; + if (mapping_type == MAPPING_TYPE_AMBISONICS && + !validate_ambisonics(st->layout.nb_channels, NULL, NULL)) return OPUS_BAD_ARG; ptr = (char*)st + align(sizeof(OpusMSEncoder)); coupled_size = opus_encoder_get_size(2); @@ -549,25 +556,23 @@ int opus_multistream_surround_encoder_init( *coupled_streams=0; for(i=0;i<channels;i++) mapping[i] = i; -#ifdef ENABLE_EXPERIMENTAL_AMBISONICS - } else if (mapping_family==254) + } else if (mapping_family==2) { int i; - *streams=channels; - *coupled_streams=0; - for(i=0;i<channels;i++) - mapping[i] = i; -#endif + if (!validate_ambisonics(channels, streams, coupled_streams)) + return OPUS_BAD_ARG; + for(i = 0; i < (*streams - *coupled_streams); i++) + mapping[i] = i + (*coupled_streams * 2); + for(i = 0; i < *coupled_streams * 2; i++) + mapping[i + (*streams - *coupled_streams)] = i; } else return OPUS_UNIMPLEMENTED; if (channels>2 && mapping_family==1) { mapping_type = MAPPING_TYPE_SURROUND; -#ifdef ENABLE_EXPERIMENTAL_AMBISONICS - } else if (mapping_family==254) + } else if (mapping_family==2) { mapping_type = MAPPING_TYPE_AMBISONICS; -#endif } else { mapping_type = MAPPING_TYPE_NONE; @@ -672,62 +677,62 @@ static void surround_rate_allocation( int lfe_offset; int coupled_ratio; /* Q8 */ int lfe_ratio; /* Q8 */ + int nb_lfe; + int nb_uncoupled; + int nb_coupled; + int nb_normal; + opus_int32 channel_offset; + opus_int32 bitrate; + int total; + + nb_lfe = (st->lfe_stream!=-1); + nb_coupled = st->layout.nb_coupled_streams; + nb_uncoupled = st->layout.nb_streams-nb_coupled-nb_lfe; + nb_normal = 2*nb_coupled + nb_uncoupled; + + /* Give each non-LFE channel enough bits per channel for coding band energy. */ + channel_offset = 40*IMAX(50, Fs/frame_size); - if (st->bitrate_bps > st->layout.nb_channels*40000) - stream_offset = 20000; - else - stream_offset = st->bitrate_bps/st->layout.nb_channels/2; - stream_offset += 60*(Fs/frame_size-50); - /* We start by giving each stream (coupled or uncoupled) the same bitrate. - This models the main saving of coupled channels over uncoupled. */ - /* The LFE stream is an exception to the above and gets fewer bits. */ - lfe_offset = 3500 + 60*(Fs/frame_size-50); - /* Coupled streams get twice the mono rate after the first 20 kb/s. */ - coupled_ratio = 512; - /* Should depend on the bitrate, for now we assume LFE gets 1/8 the bits of mono */ - lfe_ratio = 32; - - /* Compute bitrate allocation between streams */ if (st->bitrate_bps==OPUS_AUTO) { - channel_rate = Fs+60*Fs/frame_size; + bitrate = nb_normal*(channel_offset + Fs + 10000) + 8000*nb_lfe; } else if (st->bitrate_bps==OPUS_BITRATE_MAX) { - channel_rate = 300000; + bitrate = nb_normal*300000 + nb_lfe*128000; } else { - int nb_lfe; - int nb_uncoupled; - int nb_coupled; - int total; - nb_lfe = (st->lfe_stream!=-1); - nb_coupled = st->layout.nb_coupled_streams; - nb_uncoupled = st->layout.nb_streams-nb_coupled-nb_lfe; - total = (nb_uncoupled<<8) /* mono */ - + coupled_ratio*nb_coupled /* stereo */ - + nb_lfe*lfe_ratio; - channel_rate = 256*(st->bitrate_bps-lfe_offset*nb_lfe-stream_offset*(nb_coupled+nb_uncoupled))/total; + bitrate = st->bitrate_bps; } -#ifndef FIXED_POINT - if (st->variable_duration==OPUS_FRAMESIZE_VARIABLE && frame_size != Fs/50) - { - opus_int32 bonus; - bonus = 60*(Fs/frame_size-50); - channel_rate += bonus; - } -#endif + + /* Give LFE some basic stream_channel allocation but never exceed 1/20 of the + total rate for the non-energy part to avoid problems at really low rate. */ + lfe_offset = IMIN(bitrate/20, 3000) + 15*IMAX(50, Fs/frame_size); + + /* We give each stream (coupled or uncoupled) a starting bitrate. + This models the main saving of coupled channels over uncoupled. */ + stream_offset = (bitrate - channel_offset*nb_normal - lfe_offset*nb_lfe)/nb_normal/2; + stream_offset = IMAX(0, IMIN(20000, stream_offset)); + + /* Coupled streams get twice the mono rate after the offset is allocated. */ + coupled_ratio = 512; + /* Should depend on the bitrate, for now we assume LFE gets 1/8 the bits of mono */ + lfe_ratio = 32; + + total = (nb_uncoupled<<8) /* mono */ + + coupled_ratio*nb_coupled /* stereo */ + + nb_lfe*lfe_ratio; + channel_rate = 256*(opus_int64)(bitrate - lfe_offset*nb_lfe - stream_offset*(nb_coupled+nb_uncoupled) - channel_offset*nb_normal)/total; for (i=0;i<st->layout.nb_streams;i++) { if (i<st->layout.nb_coupled_streams) - rate[i] = stream_offset+(channel_rate*coupled_ratio>>8); + rate[i] = 2*channel_offset + IMAX(0, stream_offset+(channel_rate*coupled_ratio>>8)); else if (i!=st->lfe_stream) - rate[i] = stream_offset+channel_rate; + rate[i] = channel_offset + IMAX(0, stream_offset + channel_rate); else - rate[i] = lfe_offset+(channel_rate*lfe_ratio>>8); + rate[i] = IMAX(0, lfe_offset+(channel_rate*lfe_ratio>>8)); } } -#ifdef ENABLE_EXPERIMENTAL_AMBISONICS static void ambisonics_rate_allocation( OpusMSEncoder *st, opus_int32 *rate, @@ -736,50 +741,31 @@ static void ambisonics_rate_allocation( ) { int i; - int non_mono_rate; - int total_rate; + opus_int32 total_rate; + opus_int32 per_stream_rate; - /* The mono channel gets (rate_ratio_num / rate_ratio_den) times as many bits - * as all other channels */ - const int rate_ratio_num = 4; - const int rate_ratio_den = 3; - const int num_channels = st->layout.nb_streams; + const int nb_channels = st->layout.nb_streams + st->layout.nb_coupled_streams; if (st->bitrate_bps==OPUS_AUTO) { - total_rate = num_channels * (20000 + st->layout.nb_streams*(Fs+60*Fs/frame_size)); + total_rate = (st->layout.nb_coupled_streams + st->layout.nb_streams) * + (Fs+60*Fs/frame_size) + st->layout.nb_streams * (opus_int32)15000; } else if (st->bitrate_bps==OPUS_BITRATE_MAX) { - total_rate = num_channels * 320000; - } else { - total_rate = st->bitrate_bps; - } - - /* Let y be the non-mono rate and let p, q be integers such that the mono - * channel rate is (p/q) * y. - * Also let T be the total bitrate to allocate. Then - * (n - 1) y + (p/q) y = T - * y = (T q) / (qn - q + p) - */ - non_mono_rate = - total_rate * rate_ratio_den - / (rate_ratio_den*num_channels + rate_ratio_num - rate_ratio_den); - -#ifndef FIXED_POINT - if (st->variable_duration==OPUS_FRAMESIZE_VARIABLE && frame_size != Fs/50) + total_rate = nb_channels * 320000; + } else { - opus_int32 bonus = 60*(Fs/frame_size-50); - non_mono_rate += bonus; + total_rate = st->bitrate_bps; } -#endif - rate[0] = total_rate - (num_channels - 1) * non_mono_rate; - for (i=1;i<st->layout.nb_streams;i++) + /* Allocate equal number of bits to Ambisonic (uncoupled) and non-diegetic + * (coupled) streams */ + per_stream_rate = total_rate / st->layout.nb_streams; + for (i = 0; i < st->layout.nb_streams; i++) { - rate[i] = non_mono_rate; + rate[i] = per_stream_rate; } } -#endif /* ENABLE_EXPERIMENTAL_AMBISONICS */ static opus_int32 rate_allocation( OpusMSEncoder *st, @@ -795,11 +781,9 @@ static opus_int32 rate_allocation( ptr = (char*)st + align(sizeof(OpusMSEncoder)); opus_encoder_ctl((OpusEncoder*)ptr, OPUS_GET_SAMPLE_RATE(&Fs)); -#ifdef ENABLE_EXPERIMENTAL_AMBISONICS if (st->mapping_type == MAPPING_TYPE_AMBISONICS) { ambisonics_rate_allocation(st, rate, frame_size, Fs); } else -#endif { surround_rate_allocation(st, rate, frame_size, Fs); } @@ -812,9 +796,9 @@ static opus_int32 rate_allocation( return rate_sum; } -/* Max size in case the encoder decides to return three frames */ -#define MS_FRAME_TMP (3*1275+7) -static int opus_multistream_encode_native +/* Max size in case the encoder decides to return six frames (6 x 20 ms = 120 ms) */ +#define MS_FRAME_TMP (6*1275+12) +int opus_multistream_encode_native ( OpusMSEncoder *st, opus_copy_channel_in_func copy_channel_in, @@ -824,7 +808,8 @@ static int opus_multistream_encode_native opus_int32 max_data_bytes, int lsb_depth, downmix_func downmix, - int float_api + int float_api, + void *user_data ) { opus_int32 Fs; @@ -859,32 +844,8 @@ static int opus_multistream_encode_native opus_encoder_ctl((OpusEncoder*)ptr, OPUS_GET_VBR(&vbr)); opus_encoder_ctl((OpusEncoder*)ptr, CELT_GET_MODE(&celt_mode)); - { - opus_int32 delay_compensation; - int channels; - - channels = st->layout.nb_streams + st->layout.nb_coupled_streams; - opus_encoder_ctl((OpusEncoder*)ptr, OPUS_GET_LOOKAHEAD(&delay_compensation)); - delay_compensation -= Fs/400; - frame_size = compute_frame_size(pcm, analysis_frame_size, - st->variable_duration, channels, Fs, st->bitrate_bps, - delay_compensation, downmix -#ifndef DISABLE_FLOAT_API - , st->subframe_mem -#endif - ); - } - - if (400*frame_size < Fs) - { - RESTORE_STACK; - return OPUS_BAD_ARG; - } - /* Validate frame_size before using it to allocate stack space. - This mirrors the checks in opus_encode[_float](). */ - if (400*frame_size != Fs && 200*frame_size != Fs && - 100*frame_size != Fs && 50*frame_size != Fs && - 25*frame_size != Fs && 50*frame_size != 3*Fs) + frame_size = frame_size_select(analysis_frame_size, st->variable_duration, Fs); + if (frame_size <= 0) { RESTORE_STACK; return OPUS_BAD_ARG; @@ -892,6 +853,9 @@ static int opus_multistream_encode_native /* Smallest packet the encoder can produce. */ smallest_packet = st->layout.nb_streams*2-1; + /* 100 ms needs an extra byte per stream for the ToC. */ + if (Fs/frame_size == 10) + smallest_packet += st->layout.nb_streams; if (max_data_bytes < smallest_packet) { RESTORE_STACK; @@ -952,11 +916,9 @@ static int opus_multistream_encode_native opus_encoder_ctl(enc, OPUS_SET_FORCE_CHANNELS(2)); } } -#ifdef ENABLE_EXPERIMENTAL_AMBISONICS else if (st->mapping_type == MAPPING_TYPE_AMBISONICS) { opus_encoder_ctl(enc, OPUS_SET_FORCE_MODE(MODE_CELT_ONLY)); } -#endif } ptr = (char*)st + align(sizeof(OpusMSEncoder)); @@ -979,9 +941,9 @@ static int opus_multistream_encode_native left = get_left_channel(&st->layout, s, -1); right = get_right_channel(&st->layout, s, -1); (*copy_channel_in)(buf, 2, - pcm, st->layout.nb_channels, left, frame_size); + pcm, st->layout.nb_channels, left, frame_size, user_data); (*copy_channel_in)(buf+1, 2, - pcm, st->layout.nb_channels, right, frame_size); + pcm, st->layout.nb_channels, right, frame_size, user_data); ptr += align(coupled_size); if (st->mapping_type == MAPPING_TYPE_SURROUND) { @@ -997,7 +959,7 @@ static int opus_multistream_encode_native int i; int chan = get_mono_channel(&st->layout, s, -1); (*copy_channel_in)(buf, 1, - pcm, st->layout.nb_channels, chan, frame_size); + pcm, st->layout.nb_channels, chan, frame_size, user_data); ptr += align(mono_size); if (st->mapping_type == MAPPING_TYPE_SURROUND) { @@ -1013,6 +975,9 @@ static int opus_multistream_encode_native curr_max = max_data_bytes - tot_size; /* Reserve one byte for the last stream and two for the others */ curr_max -= IMAX(0,2*(st->layout.nb_streams-s-1)-1); + /* For 100 ms, reserve an extra byte per stream for the ToC */ + if (Fs/frame_size == 10) + curr_max -= st->layout.nb_streams-s-1; curr_max = IMIN(curr_max,MS_FRAME_TMP); /* Repacketizer will add one or two bytes for self-delimited frames */ if (s != st->layout.nb_streams-1) curr_max -= curr_max>253 ? 2 : 1; @@ -1053,11 +1018,13 @@ static void opus_copy_channel_in_float( const void *src, int src_stride, int src_channel, - int frame_size + int frame_size, + void *user_data ) { const float *float_src; opus_int32 i; + (void)user_data; float_src = (const float *)src; for (i=0;i<frame_size;i++) #if defined(FIXED_POINT) @@ -1074,11 +1041,13 @@ static void opus_copy_channel_in_short( const void *src, int src_stride, int src_channel, - int frame_size + int frame_size, + void *user_data ) { const opus_int16 *short_src; opus_int32 i; + (void)user_data; short_src = (const opus_int16 *)src; for (i=0;i<frame_size;i++) #if defined(FIXED_POINT) @@ -1099,7 +1068,7 @@ int opus_multistream_encode( ) { return opus_multistream_encode_native(st, opus_copy_channel_in_short, - pcm, frame_size, data, max_data_bytes, 16, downmix_int, 0); + pcm, frame_size, data, max_data_bytes, 16, downmix_int, 0, NULL); } #ifndef DISABLE_FLOAT_API @@ -1112,7 +1081,7 @@ int opus_multistream_encode_float( ) { return opus_multistream_encode_native(st, opus_copy_channel_in_float, - pcm, frame_size, data, max_data_bytes, 16, downmix_float, 1); + pcm, frame_size, data, max_data_bytes, 16, downmix_float, 1, NULL); } #endif @@ -1128,7 +1097,7 @@ int opus_multistream_encode_float ) { return opus_multistream_encode_native(st, opus_copy_channel_in_float, - pcm, frame_size, data, max_data_bytes, 24, downmix_float, 1); + pcm, frame_size, data, max_data_bytes, 24, downmix_float, 1, NULL); } int opus_multistream_encode( @@ -1140,19 +1109,17 @@ int opus_multistream_encode( ) { return opus_multistream_encode_native(st, opus_copy_channel_in_short, - pcm, frame_size, data, max_data_bytes, 16, downmix_int, 0); + pcm, frame_size, data, max_data_bytes, 16, downmix_int, 0, NULL); } #endif -int opus_multistream_encoder_ctl(OpusMSEncoder *st, int request, ...) +int opus_multistream_encoder_ctl_va_list(OpusMSEncoder *st, int request, + va_list ap) { - va_list ap; int coupled_size, mono_size; char *ptr; int ret = OPUS_OK; - va_start(ap, request); - coupled_size = opus_encoder_get_size(2); mono_size = opus_encoder_get_size(1); ptr = (char*)st + align(sizeof(OpusMSEncoder)); @@ -1161,9 +1128,11 @@ int opus_multistream_encoder_ctl(OpusMSEncoder *st, int request, ...) case OPUS_SET_BITRATE_REQUEST: { opus_int32 value = va_arg(ap, opus_int32); - if (value<0 && value!=OPUS_AUTO && value!=OPUS_BITRATE_MAX) + if (value != OPUS_AUTO && value != OPUS_BITRATE_MAX) { - goto bad_arg; + if (value <= 0) + goto bad_arg; + value = IMIN(300000*st->layout.nb_channels, IMAX(500*st->layout.nb_channels, value)); } st->bitrate_bps = value; } @@ -1206,6 +1175,7 @@ int opus_multistream_encoder_ctl(OpusMSEncoder *st, int request, ...) case OPUS_GET_INBAND_FEC_REQUEST: case OPUS_GET_FORCE_CHANNELS_REQUEST: case OPUS_GET_PREDICTION_DISABLED_REQUEST: + case OPUS_GET_PHASE_INVERSION_DISABLED_REQUEST: { OpusEncoder *enc; /* For int32* GET params, just query the first stream */ @@ -1252,6 +1222,7 @@ int opus_multistream_encoder_ctl(OpusMSEncoder *st, int request, ...) case OPUS_SET_FORCE_MODE_REQUEST: case OPUS_SET_FORCE_CHANNELS_REQUEST: case OPUS_SET_PREDICTION_DISABLED_REQUEST: + case OPUS_SET_PHASE_INVERSION_DISABLED_REQUEST: { int s; /* This works for int32 params */ @@ -1278,7 +1249,7 @@ int opus_multistream_encoder_ctl(OpusMSEncoder *st, int request, ...) OpusEncoder **value; stream_id = va_arg(ap, opus_int32); if (stream_id<0 || stream_id >= st->layout.nb_streams) - ret = OPUS_BAD_ARG; + goto bad_arg; value = va_arg(ap, OpusEncoder**); if (!value) { @@ -1313,7 +1284,6 @@ int opus_multistream_encoder_ctl(OpusMSEncoder *st, int request, ...) case OPUS_RESET_STATE: { int s; - st->subframe_mem[0] = st->subframe_mem[1] = st->subframe_mem[2] = 0; if (st->mapping_type == MAPPING_TYPE_SURROUND) { OPUS_CLEAR(ms_get_preemph_mem(st), st->layout.nb_channels); @@ -1337,14 +1307,21 @@ int opus_multistream_encoder_ctl(OpusMSEncoder *st, int request, ...) ret = OPUS_UNIMPLEMENTED; break; } - - va_end(ap); return ret; bad_arg: - va_end(ap); return OPUS_BAD_ARG; } +int opus_multistream_encoder_ctl(OpusMSEncoder *st, int request, ...) +{ + int ret; + va_list ap; + va_start(ap, request); + ret = opus_multistream_encoder_ctl_va_list(st, request, ap); + va_end(ap); + return ret; +} + void opus_multistream_encoder_destroy(OpusMSEncoder *st) { opus_free(st); diff --git a/thirdparty/opus/opus_private.h b/thirdparty/opus/opus_private.h index 3b62eed096..5e2463f546 100644 --- a/thirdparty/opus/opus_private.h +++ b/thirdparty/opus/opus_private.h @@ -33,6 +33,7 @@ #include "opus.h" #include "celt.h" +#include <stdarg.h> /* va_list */ #include <stddef.h> /* offsetof */ struct OpusRepacketizer { @@ -50,12 +51,59 @@ typedef struct ChannelLayout { unsigned char mapping[256]; } ChannelLayout; +typedef enum { + MAPPING_TYPE_NONE, + MAPPING_TYPE_SURROUND, + MAPPING_TYPE_AMBISONICS +} MappingType; + +struct OpusMSEncoder { + ChannelLayout layout; + int arch; + int lfe_stream; + int application; + int variable_duration; + MappingType mapping_type; + opus_int32 bitrate_bps; + /* Encoder states go here */ + /* then opus_val32 window_mem[channels*120]; */ + /* then opus_val32 preemph_mem[channels]; */ +}; + +struct OpusMSDecoder { + ChannelLayout layout; + /* Decoder states go here */ +}; + +int opus_multistream_encoder_ctl_va_list(struct OpusMSEncoder *st, int request, + va_list ap); +int opus_multistream_decoder_ctl_va_list(struct OpusMSDecoder *st, int request, + va_list ap); + int validate_layout(const ChannelLayout *layout); int get_left_channel(const ChannelLayout *layout, int stream_id, int prev); int get_right_channel(const ChannelLayout *layout, int stream_id, int prev); int get_mono_channel(const ChannelLayout *layout, int stream_id, int prev); - +typedef void (*opus_copy_channel_in_func)( + opus_val16 *dst, + int dst_stride, + const void *src, + int src_stride, + int src_channel, + int frame_size, + void *user_data +); + +typedef void (*opus_copy_channel_out_func)( + void *dst, + int dst_stride, + int dst_channel, + const opus_val16 *src, + int src_stride, + int frame_size, + void *user_data +); #define MODE_SILK_ONLY 1000 #define MODE_HYBRID 1001 @@ -87,19 +135,12 @@ int get_mono_channel(const ChannelLayout *layout, int stream_id, int prev); typedef void (*downmix_func)(const void *, opus_val32 *, int, int, int, int, int); void downmix_float(const void *_x, opus_val32 *sub, int subframe, int offset, int c1, int c2, int C); void downmix_int(const void *_x, opus_val32 *sub, int subframe, int offset, int c1, int c2, int C); +int is_digital_silence(const opus_val16* pcm, int frame_size, int channels, int lsb_depth); int encode_size(int size, unsigned char *data); opus_int32 frame_size_select(opus_int32 frame_size, int variable_duration, opus_int32 Fs); -opus_int32 compute_frame_size(const void *analysis_pcm, int frame_size, - int variable_duration, int C, opus_int32 Fs, int bitrate_bps, - int delay_compensation, downmix_func downmix -#ifndef DISABLE_FLOAT_API - , float *subframe_mem -#endif - ); - opus_int32 opus_encode_native(OpusEncoder *st, const opus_val16 *pcm, int frame_size, unsigned char *data, opus_int32 out_data_bytes, int lsb_depth, const void *analysis_pcm, opus_int32 analysis_size, int c1, int c2, @@ -131,4 +172,30 @@ opus_int32 opus_repacketizer_out_range_impl(OpusRepacketizer *rp, int begin, int int pad_frame(unsigned char *data, opus_int32 len, opus_int32 new_len); +int opus_multistream_encode_native +( + struct OpusMSEncoder *st, + opus_copy_channel_in_func copy_channel_in, + const void *pcm, + int analysis_frame_size, + unsigned char *data, + opus_int32 max_data_bytes, + int lsb_depth, + downmix_func downmix, + int float_api, + void *user_data +); + +int opus_multistream_decode_native( + struct OpusMSDecoder *st, + const unsigned char *data, + opus_int32 len, + void *pcm, + opus_copy_channel_out_func copy_channel_out, + int frame_size, + int decode_fec, + int soft_clip, + void *user_data +); + #endif /* OPUS_PRIVATE_H */ diff --git a/thirdparty/opus/opus_projection_decoder.c b/thirdparty/opus/opus_projection_decoder.c new file mode 100644 index 0000000000..c2e07d5bcf --- /dev/null +++ b/thirdparty/opus/opus_projection_decoder.c @@ -0,0 +1,258 @@ +/* Copyright (c) 2017 Google Inc. + Written by Andrew Allen */ +/* + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + - Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + - Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER + OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, + EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, + PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR + PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifdef HAVE_CONFIG_H +#include "config.h" +#endif + +#include "mathops.h" +#include "os_support.h" +#include "opus_private.h" +#include "opus_defines.h" +#include "opus_projection.h" +#include "opus_multistream.h" +#include "mapping_matrix.h" +#include "stack_alloc.h" + +struct OpusProjectionDecoder +{ + opus_int32 demixing_matrix_size_in_bytes; + /* Encoder states go here */ +}; + +#if !defined(DISABLE_FLOAT_API) +static void opus_projection_copy_channel_out_float( + void *dst, + int dst_stride, + int dst_channel, + const opus_val16 *src, + int src_stride, + int frame_size, + void *user_data) +{ + float *float_dst; + const MappingMatrix *matrix; + float_dst = (float *)dst; + matrix = (const MappingMatrix *)user_data; + + if (dst_channel == 0) + OPUS_CLEAR(float_dst, frame_size * dst_stride); + + if (src != NULL) + mapping_matrix_multiply_channel_out_float(matrix, src, dst_channel, + src_stride, float_dst, dst_stride, frame_size); +} +#endif + +static void opus_projection_copy_channel_out_short( + void *dst, + int dst_stride, + int dst_channel, + const opus_val16 *src, + int src_stride, + int frame_size, + void *user_data) +{ + opus_int16 *short_dst; + const MappingMatrix *matrix; + short_dst = (opus_int16 *)dst; + matrix = (const MappingMatrix *)user_data; + if (dst_channel == 0) + OPUS_CLEAR(short_dst, frame_size * dst_stride); + + if (src != NULL) + mapping_matrix_multiply_channel_out_short(matrix, src, dst_channel, + src_stride, short_dst, dst_stride, frame_size); +} + +static MappingMatrix *get_dec_demixing_matrix(OpusProjectionDecoder *st) +{ + /* void* cast avoids clang -Wcast-align warning */ + return (MappingMatrix*)(void*)((char*)st + + align(sizeof(OpusProjectionDecoder))); +} + +static OpusMSDecoder *get_multistream_decoder(OpusProjectionDecoder *st) +{ + /* void* cast avoids clang -Wcast-align warning */ + return (OpusMSDecoder*)(void*)((char*)st + + align(sizeof(OpusProjectionDecoder) + + st->demixing_matrix_size_in_bytes)); +} + +opus_int32 opus_projection_decoder_get_size(int channels, int streams, + int coupled_streams) +{ + opus_int32 matrix_size; + opus_int32 decoder_size; + + matrix_size = + mapping_matrix_get_size(streams + coupled_streams, channels); + if (!matrix_size) + return 0; + + decoder_size = opus_multistream_decoder_get_size(streams, coupled_streams); + if (!decoder_size) + return 0; + + return align(sizeof(OpusProjectionDecoder)) + matrix_size + decoder_size; +} + +int opus_projection_decoder_init(OpusProjectionDecoder *st, opus_int32 Fs, + int channels, int streams, int coupled_streams, + unsigned char *demixing_matrix, opus_int32 demixing_matrix_size) +{ + int nb_input_streams; + opus_int32 expected_matrix_size; + int i, ret; + unsigned char mapping[255]; + VARDECL(opus_int16, buf); + ALLOC_STACK; + + /* Verify supplied matrix size. */ + nb_input_streams = streams + coupled_streams; + expected_matrix_size = nb_input_streams * channels * sizeof(opus_int16); + if (expected_matrix_size != demixing_matrix_size) + { + RESTORE_STACK; + return OPUS_BAD_ARG; + } + + /* Convert demixing matrix input into internal format. */ + ALLOC(buf, nb_input_streams * channels, opus_int16); + for (i = 0; i < nb_input_streams * channels; i++) + { + int s = demixing_matrix[2*i + 1] << 8 | demixing_matrix[2*i]; + s = ((s & 0xFFFF) ^ 0x8000) - 0x8000; + buf[i] = (opus_int16)s; + } + + /* Assign demixing matrix. */ + st->demixing_matrix_size_in_bytes = + mapping_matrix_get_size(channels, nb_input_streams); + if (!st->demixing_matrix_size_in_bytes) + { + RESTORE_STACK; + return OPUS_BAD_ARG; + } + + mapping_matrix_init(get_dec_demixing_matrix(st), channels, nb_input_streams, 0, + buf, demixing_matrix_size); + + /* Set trivial mapping so each input channel pairs with a matrix column. */ + for (i = 0; i < channels; i++) + mapping[i] = i; + + ret = opus_multistream_decoder_init( + get_multistream_decoder(st), Fs, channels, streams, coupled_streams, mapping); + RESTORE_STACK; + return ret; +} + +OpusProjectionDecoder *opus_projection_decoder_create( + opus_int32 Fs, int channels, int streams, int coupled_streams, + unsigned char *demixing_matrix, opus_int32 demixing_matrix_size, int *error) +{ + int size; + int ret; + OpusProjectionDecoder *st; + + /* Allocate space for the projection decoder. */ + size = opus_projection_decoder_get_size(channels, streams, coupled_streams); + if (!size) { + if (error) + *error = OPUS_ALLOC_FAIL; + return NULL; + } + st = (OpusProjectionDecoder *)opus_alloc(size); + if (!st) + { + if (error) + *error = OPUS_ALLOC_FAIL; + return NULL; + } + + /* Initialize projection decoder with provided settings. */ + ret = opus_projection_decoder_init(st, Fs, channels, streams, coupled_streams, + demixing_matrix, demixing_matrix_size); + if (ret != OPUS_OK) + { + opus_free(st); + st = NULL; + } + if (error) + *error = ret; + return st; +} + +#ifdef FIXED_POINT +int opus_projection_decode(OpusProjectionDecoder *st, const unsigned char *data, + opus_int32 len, opus_int16 *pcm, int frame_size, + int decode_fec) +{ + return opus_multistream_decode_native(get_multistream_decoder(st), data, len, + pcm, opus_projection_copy_channel_out_short, frame_size, decode_fec, 0, + get_dec_demixing_matrix(st)); +} +#else +int opus_projection_decode(OpusProjectionDecoder *st, const unsigned char *data, + opus_int32 len, opus_int16 *pcm, int frame_size, + int decode_fec) +{ + return opus_multistream_decode_native(get_multistream_decoder(st), data, len, + pcm, opus_projection_copy_channel_out_short, frame_size, decode_fec, 1, + get_dec_demixing_matrix(st)); +} +#endif + +#ifndef DISABLE_FLOAT_API +int opus_projection_decode_float(OpusProjectionDecoder *st, const unsigned char *data, + opus_int32 len, float *pcm, int frame_size, int decode_fec) +{ + return opus_multistream_decode_native(get_multistream_decoder(st), data, len, + pcm, opus_projection_copy_channel_out_float, frame_size, decode_fec, 0, + get_dec_demixing_matrix(st)); +} +#endif + +int opus_projection_decoder_ctl(OpusProjectionDecoder *st, int request, ...) +{ + va_list ap; + int ret = OPUS_OK; + + va_start(ap, request); + ret = opus_multistream_decoder_ctl_va_list(get_multistream_decoder(st), + request, ap); + va_end(ap); + return ret; +} + +void opus_projection_decoder_destroy(OpusProjectionDecoder *st) +{ + opus_free(st); +} + diff --git a/thirdparty/opus/opus_projection_encoder.c b/thirdparty/opus/opus_projection_encoder.c new file mode 100644 index 0000000000..06fb2d2526 --- /dev/null +++ b/thirdparty/opus/opus_projection_encoder.c @@ -0,0 +1,468 @@ +/* Copyright (c) 2017 Google Inc. + Written by Andrew Allen */ +/* + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + - Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + - Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER + OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, + EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, + PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR + PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING + NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifdef HAVE_CONFIG_H +#include "config.h" +#endif + +#include "mathops.h" +#include "os_support.h" +#include "opus_private.h" +#include "opus_defines.h" +#include "opus_projection.h" +#include "opus_multistream.h" +#include "stack_alloc.h" +#include "mapping_matrix.h" + +struct OpusProjectionEncoder +{ + opus_int32 mixing_matrix_size_in_bytes; + opus_int32 demixing_matrix_size_in_bytes; + /* Encoder states go here */ +}; + +#if !defined(DISABLE_FLOAT_API) +static void opus_projection_copy_channel_in_float( + opus_val16 *dst, + int dst_stride, + const void *src, + int src_stride, + int src_channel, + int frame_size, + void *user_data +) +{ + mapping_matrix_multiply_channel_in_float((const MappingMatrix*)user_data, + (const float*)src, src_stride, dst, src_channel, dst_stride, frame_size); +} +#endif + +static void opus_projection_copy_channel_in_short( + opus_val16 *dst, + int dst_stride, + const void *src, + int src_stride, + int src_channel, + int frame_size, + void *user_data +) +{ + mapping_matrix_multiply_channel_in_short((const MappingMatrix*)user_data, + (const opus_int16*)src, src_stride, dst, src_channel, dst_stride, frame_size); +} + +static int get_order_plus_one_from_channels(int channels, int *order_plus_one) +{ + int order_plus_one_; + int acn_channels; + int nondiegetic_channels; + + /* Allowed numbers of channels: + * (1 + n)^2 + 2j, for n = 0...14 and j = 0 or 1. + */ + if (channels < 1 || channels > 227) + return OPUS_BAD_ARG; + + order_plus_one_ = isqrt32(channels); + acn_channels = order_plus_one_ * order_plus_one_; + nondiegetic_channels = channels - acn_channels; + if (nondiegetic_channels != 0 && nondiegetic_channels != 2) + return OPUS_BAD_ARG; + + if (order_plus_one) + *order_plus_one = order_plus_one_; + return OPUS_OK; +} + +static int get_streams_from_channels(int channels, int mapping_family, + int *streams, int *coupled_streams, + int *order_plus_one) +{ + if (mapping_family == 3) + { + if (get_order_plus_one_from_channels(channels, order_plus_one) != OPUS_OK) + return OPUS_BAD_ARG; + if (streams) + *streams = (channels + 1) / 2; + if (coupled_streams) + *coupled_streams = channels / 2; + return OPUS_OK; + } + return OPUS_BAD_ARG; +} + +static MappingMatrix *get_mixing_matrix(OpusProjectionEncoder *st) +{ + /* void* cast avoids clang -Wcast-align warning */ + return (MappingMatrix *)(void*)((char*)st + + align(sizeof(OpusProjectionEncoder))); +} + +static MappingMatrix *get_enc_demixing_matrix(OpusProjectionEncoder *st) +{ + /* void* cast avoids clang -Wcast-align warning */ + return (MappingMatrix *)(void*)((char*)st + + align(sizeof(OpusProjectionEncoder) + + st->mixing_matrix_size_in_bytes)); +} + +static OpusMSEncoder *get_multistream_encoder(OpusProjectionEncoder *st) +{ + /* void* cast avoids clang -Wcast-align warning */ + return (OpusMSEncoder *)(void*)((char*)st + + align(sizeof(OpusProjectionEncoder) + + st->mixing_matrix_size_in_bytes + + st->demixing_matrix_size_in_bytes)); +} + +opus_int32 opus_projection_ambisonics_encoder_get_size(int channels, + int mapping_family) +{ + int nb_streams; + int nb_coupled_streams; + int order_plus_one; + int mixing_matrix_rows, mixing_matrix_cols; + int demixing_matrix_rows, demixing_matrix_cols; + opus_int32 mixing_matrix_size, demixing_matrix_size; + opus_int32 encoder_size; + int ret; + + ret = get_streams_from_channels(channels, mapping_family, &nb_streams, + &nb_coupled_streams, &order_plus_one); + if (ret != OPUS_OK) + return 0; + + if (order_plus_one == 2) + { + mixing_matrix_rows = mapping_matrix_foa_mixing.rows; + mixing_matrix_cols = mapping_matrix_foa_mixing.cols; + demixing_matrix_rows = mapping_matrix_foa_demixing.rows; + demixing_matrix_cols = mapping_matrix_foa_demixing.cols; + } + else if (order_plus_one == 3) + { + mixing_matrix_rows = mapping_matrix_soa_mixing.rows; + mixing_matrix_cols = mapping_matrix_soa_mixing.cols; + demixing_matrix_rows = mapping_matrix_soa_demixing.rows; + demixing_matrix_cols = mapping_matrix_soa_demixing.cols; + } + else if (order_plus_one == 4) + { + mixing_matrix_rows = mapping_matrix_toa_mixing.rows; + mixing_matrix_cols = mapping_matrix_toa_mixing.cols; + demixing_matrix_rows = mapping_matrix_toa_demixing.rows; + demixing_matrix_cols = mapping_matrix_toa_demixing.cols; + } + else + return 0; + + mixing_matrix_size = + mapping_matrix_get_size(mixing_matrix_rows, mixing_matrix_cols); + if (!mixing_matrix_size) + return 0; + + demixing_matrix_size = + mapping_matrix_get_size(demixing_matrix_rows, demixing_matrix_cols); + if (!demixing_matrix_size) + return 0; + + encoder_size = + opus_multistream_encoder_get_size(nb_streams, nb_coupled_streams); + if (!encoder_size) + return 0; + + return align(sizeof(OpusProjectionEncoder)) + + mixing_matrix_size + demixing_matrix_size + encoder_size; +} + +int opus_projection_ambisonics_encoder_init(OpusProjectionEncoder *st, opus_int32 Fs, + int channels, int mapping_family, + int *streams, int *coupled_streams, + int application) +{ + MappingMatrix *mixing_matrix; + MappingMatrix *demixing_matrix; + OpusMSEncoder *ms_encoder; + int i; + int ret; + int order_plus_one; + unsigned char mapping[255]; + + if (streams == NULL || coupled_streams == NULL) { + return OPUS_BAD_ARG; + } + + if (get_streams_from_channels(channels, mapping_family, streams, + coupled_streams, &order_plus_one) != OPUS_OK) + return OPUS_BAD_ARG; + + if (mapping_family == 3) + { + /* Assign mixing matrix based on available pre-computed matrices. */ + mixing_matrix = get_mixing_matrix(st); + if (order_plus_one == 2) + { + mapping_matrix_init(mixing_matrix, mapping_matrix_foa_mixing.rows, + mapping_matrix_foa_mixing.cols, mapping_matrix_foa_mixing.gain, + mapping_matrix_foa_mixing_data, + sizeof(mapping_matrix_foa_mixing_data)); + } + else if (order_plus_one == 3) + { + mapping_matrix_init(mixing_matrix, mapping_matrix_soa_mixing.rows, + mapping_matrix_soa_mixing.cols, mapping_matrix_soa_mixing.gain, + mapping_matrix_soa_mixing_data, + sizeof(mapping_matrix_soa_mixing_data)); + } + else if (order_plus_one == 4) + { + mapping_matrix_init(mixing_matrix, mapping_matrix_toa_mixing.rows, + mapping_matrix_toa_mixing.cols, mapping_matrix_toa_mixing.gain, + mapping_matrix_toa_mixing_data, + sizeof(mapping_matrix_toa_mixing_data)); + } + else + return OPUS_BAD_ARG; + + st->mixing_matrix_size_in_bytes = mapping_matrix_get_size( + mixing_matrix->rows, mixing_matrix->cols); + if (!st->mixing_matrix_size_in_bytes) + return OPUS_BAD_ARG; + + /* Assign demixing matrix based on available pre-computed matrices. */ + demixing_matrix = get_enc_demixing_matrix(st); + if (order_plus_one == 2) + { + mapping_matrix_init(demixing_matrix, mapping_matrix_foa_demixing.rows, + mapping_matrix_foa_demixing.cols, mapping_matrix_foa_demixing.gain, + mapping_matrix_foa_demixing_data, + sizeof(mapping_matrix_foa_demixing_data)); + } + else if (order_plus_one == 3) + { + mapping_matrix_init(demixing_matrix, mapping_matrix_soa_demixing.rows, + mapping_matrix_soa_demixing.cols, mapping_matrix_soa_demixing.gain, + mapping_matrix_soa_demixing_data, + sizeof(mapping_matrix_soa_demixing_data)); + } + else if (order_plus_one == 4) + { + mapping_matrix_init(demixing_matrix, mapping_matrix_toa_demixing.rows, + mapping_matrix_toa_demixing.cols, mapping_matrix_toa_demixing.gain, + mapping_matrix_toa_demixing_data, + sizeof(mapping_matrix_toa_demixing_data)); + } + else + return OPUS_BAD_ARG; + + st->demixing_matrix_size_in_bytes = mapping_matrix_get_size( + demixing_matrix->rows, demixing_matrix->cols); + if (!st->demixing_matrix_size_in_bytes) + return OPUS_BAD_ARG; + } + else + return OPUS_UNIMPLEMENTED; + + /* Ensure matrices are large enough for desired coding scheme. */ + if (*streams + *coupled_streams > mixing_matrix->rows || + channels > mixing_matrix->cols || + channels > demixing_matrix->rows || + *streams + *coupled_streams > demixing_matrix->cols) + return OPUS_BAD_ARG; + + /* Set trivial mapping so each input channel pairs with a matrix column. */ + for (i = 0; i < channels; i++) + mapping[i] = i; + + /* Initialize multistream encoder with provided settings. */ + ms_encoder = get_multistream_encoder(st); + ret = opus_multistream_encoder_init(ms_encoder, Fs, channels, *streams, + *coupled_streams, mapping, application); + return ret; +} + +OpusProjectionEncoder *opus_projection_ambisonics_encoder_create( + opus_int32 Fs, int channels, int mapping_family, int *streams, + int *coupled_streams, int application, int *error) +{ + int size; + int ret; + OpusProjectionEncoder *st; + + /* Allocate space for the projection encoder. */ + size = opus_projection_ambisonics_encoder_get_size(channels, mapping_family); + if (!size) { + if (error) + *error = OPUS_ALLOC_FAIL; + return NULL; + } + st = (OpusProjectionEncoder *)opus_alloc(size); + if (!st) + { + if (error) + *error = OPUS_ALLOC_FAIL; + return NULL; + } + + /* Initialize projection encoder with provided settings. */ + ret = opus_projection_ambisonics_encoder_init(st, Fs, channels, + mapping_family, streams, coupled_streams, application); + if (ret != OPUS_OK) + { + opus_free(st); + st = NULL; + } + if (error) + *error = ret; + return st; +} + +int opus_projection_encode(OpusProjectionEncoder *st, const opus_int16 *pcm, + int frame_size, unsigned char *data, + opus_int32 max_data_bytes) +{ + return opus_multistream_encode_native(get_multistream_encoder(st), + opus_projection_copy_channel_in_short, pcm, frame_size, data, + max_data_bytes, 16, downmix_int, 0, get_mixing_matrix(st)); +} + +#ifndef DISABLE_FLOAT_API +#ifdef FIXED_POINT +int opus_projection_encode_float(OpusProjectionEncoder *st, const float *pcm, + int frame_size, unsigned char *data, + opus_int32 max_data_bytes) +{ + return opus_multistream_encode_native(get_multistream_encoder(st), + opus_projection_copy_channel_in_float, pcm, frame_size, data, + max_data_bytes, 16, downmix_float, 1, get_mixing_matrix(st)); +} +#else +int opus_projection_encode_float(OpusProjectionEncoder *st, const float *pcm, + int frame_size, unsigned char *data, + opus_int32 max_data_bytes) +{ + return opus_multistream_encode_native(get_multistream_encoder(st), + opus_projection_copy_channel_in_float, pcm, frame_size, data, + max_data_bytes, 24, downmix_float, 1, get_mixing_matrix(st)); +} +#endif +#endif + +void opus_projection_encoder_destroy(OpusProjectionEncoder *st) +{ + opus_free(st); +} + +int opus_projection_encoder_ctl(OpusProjectionEncoder *st, int request, ...) +{ + va_list ap; + MappingMatrix *demixing_matrix; + OpusMSEncoder *ms_encoder; + int ret = OPUS_OK; + + ms_encoder = get_multistream_encoder(st); + demixing_matrix = get_enc_demixing_matrix(st); + + va_start(ap, request); + switch(request) + { + case OPUS_PROJECTION_GET_DEMIXING_MATRIX_SIZE_REQUEST: + { + opus_int32 *value = va_arg(ap, opus_int32*); + if (!value) + { + goto bad_arg; + } + *value = + ms_encoder->layout.nb_channels * (ms_encoder->layout.nb_streams + + ms_encoder->layout.nb_coupled_streams) * sizeof(opus_int16); + } + break; + case OPUS_PROJECTION_GET_DEMIXING_MATRIX_GAIN_REQUEST: + { + opus_int32 *value = va_arg(ap, opus_int32*); + if (!value) + { + goto bad_arg; + } + *value = demixing_matrix->gain; + } + break; + case OPUS_PROJECTION_GET_DEMIXING_MATRIX_REQUEST: + { + int i, j, k, l; + int nb_input_streams; + int nb_output_streams; + unsigned char *external_char; + opus_int16 *internal_short; + opus_int32 external_size; + opus_int32 internal_size; + + /* (I/O is in relation to the decoder's perspective). */ + nb_input_streams = ms_encoder->layout.nb_streams + + ms_encoder->layout.nb_coupled_streams; + nb_output_streams = ms_encoder->layout.nb_channels; + + external_char = va_arg(ap, unsigned char *); + external_size = va_arg(ap, opus_int32); + if (!external_char) + { + goto bad_arg; + } + internal_short = mapping_matrix_get_data(demixing_matrix); + internal_size = nb_input_streams * nb_output_streams * sizeof(opus_int16); + if (external_size != internal_size) + { + goto bad_arg; + } + + /* Copy demixing matrix subset to output destination. */ + l = 0; + for (i = 0; i < nb_input_streams; i++) { + for (j = 0; j < nb_output_streams; j++) { + k = demixing_matrix->rows * i + j; + external_char[2*l] = (unsigned char)internal_short[k]; + external_char[2*l+1] = (unsigned char)(internal_short[k] >> 8); + l++; + } + } + } + break; + default: + { + ret = opus_multistream_encoder_ctl_va_list(ms_encoder, request, ap); + } + break; + } + va_end(ap); + return ret; + +bad_arg: + va_end(ap); + return OPUS_BAD_ARG; +} + diff --git a/thirdparty/opus/opusfile.c b/thirdparty/opus/opusfile.c index b8b3a354cf..8b000a2c58 100644 --- a/thirdparty/opus/opusfile.c +++ b/thirdparty/opus/opusfile.c @@ -86,14 +86,15 @@ int op_test(OpusHead *_head, This is to prevent us spending a lot of time allocating memory and looking for Ogg pages in non-Ogg files.*/ if(memcmp(_initial_data,"OggS",4)!=0)return OP_ENOTFORMAT; + if(OP_UNLIKELY(_initial_bytes>(size_t)LONG_MAX))return OP_EFAULT; ogg_sync_init(&oy); - data=ogg_sync_buffer(&oy,_initial_bytes); + data=ogg_sync_buffer(&oy,(long)_initial_bytes); if(data!=NULL){ ogg_stream_state os; ogg_page og; int ret; memcpy(data,_initial_data,_initial_bytes); - ogg_sync_wrote(&oy,_initial_bytes); + ogg_sync_wrote(&oy,(long)_initial_bytes); ogg_stream_init(&os,-1); err=OP_FALSE; do{ @@ -147,7 +148,7 @@ static int op_get_data(OggOpusFile *_of,int _nbytes){ int nbytes; OP_ASSERT(_nbytes>0); buffer=(unsigned char *)ogg_sync_buffer(&_of->oy,_nbytes); - nbytes=(int)(*_of->callbacks.read)(_of->source,buffer,_nbytes); + nbytes=(int)(*_of->callbacks.read)(_of->stream,buffer,_nbytes); OP_ASSERT(nbytes<=_nbytes); if(OP_LIKELY(nbytes>0))ogg_sync_wrote(&_of->oy,nbytes); return nbytes; @@ -157,7 +158,7 @@ static int op_get_data(OggOpusFile *_of,int _nbytes){ static int op_seek_helper(OggOpusFile *_of,opus_int64 _offset){ if(_offset==_of->offset)return 0; if(_of->callbacks.seek==NULL - ||(*_of->callbacks.seek)(_of->source,_offset,SEEK_SET)){ + ||(*_of->callbacks.seek)(_of->stream,_offset,SEEK_SET)){ return OP_EREAD; } _of->offset=_offset; @@ -165,7 +166,7 @@ static int op_seek_helper(OggOpusFile *_of,opus_int64 _offset){ return 0; } -/*Get the current position indicator of the underlying source. +/*Get the current position indicator of the underlying stream. This should be the same as the value reported by tell().*/ static opus_int64 op_position(const OggOpusFile *_of){ /*The current position indicator is _not_ simply offset. @@ -369,7 +370,7 @@ static int op_get_prev_page_serial(OggOpusFile *_of,OpusSeekRecord *_sr, search_start=llret+1; } /*We started from the beginning of the stream and found nothing. - This should be impossible unless the contents of the source changed out + This should be impossible unless the contents of the stream changed out from under us after we read from it.*/ if(OP_UNLIKELY(!begin)&&OP_UNLIKELY(_offset<0))return OP_EBADLINK; /*Bump up the chunk size. @@ -455,7 +456,7 @@ static opus_int64 op_get_last_page(OggOpusFile *_of,ogg_int64_t *_gp, } } /*We started from at or before the beginning of the link and found nothing. - This should be impossible unless the contents of the source changed out + This should be impossible unless the contents of the stream changed out from under us after we read from it.*/ if((OP_UNLIKELY(left_link)||OP_UNLIKELY(!begin))&&OP_UNLIKELY(_offset<0)){ return OP_EBADLINK; @@ -855,6 +856,7 @@ static int op_find_initial_pcm_offset(OggOpusFile *_of, /*Fail if the pre-skip is non-zero, since it's asking us to skip more samples than exist.*/ if(_link->head.pre_skip>0)return OP_EBADTIMESTAMP; + _link->pcm_file_offset=0; /*Set pcm_end and end_offset so we can skip the call to op_find_final_pcm_offset().*/ _link->pcm_start=_link->pcm_end=0; @@ -866,7 +868,8 @@ static int op_find_initial_pcm_offset(OggOpusFile *_of, if(_link->head.pre_skip>0)return OP_EBADTIMESTAMP; /*Set pcm_end and end_offset so we can skip the call to op_find_final_pcm_offset().*/ - _link->pcm_end=_link->pcm_start=0; + _link->pcm_file_offset=0; + _link->pcm_start=_link->pcm_end=0; _link->end_offset=_link->data_offset; /*Tell the caller we've got a buffered page for them.*/ return 1; @@ -951,6 +954,7 @@ static int op_find_initial_pcm_offset(OggOpusFile *_of, /*Update the packet count after end-trimming.*/ _of->op_count=pi; _of->cur_discard_count=_link->head.pre_skip; + _link->pcm_file_offset=0; _of->prev_packet_gp=_link->pcm_start=pcm_start; _of->prev_page_offset=page_offset; return 0; @@ -1271,6 +1275,7 @@ static int op_bisect_forward_serialno(OggOpusFile *_of, always starts with a seek.*/ ret=op_find_initial_pcm_offset(_of,links+nlinks,NULL); if(OP_UNLIKELY(ret<0))return ret; + links[nlinks].pcm_file_offset=total_duration; _searched=_of->offset; /*Mark the current link count so it can be cleaned up on error.*/ _of->nlinks=++nlinks; @@ -1390,8 +1395,8 @@ static int op_open_seekable2_impl(OggOpusFile *_of){ opus_int64 data_offset; int ret; /*We can seek, so set out learning all about this file.*/ - (*_of->callbacks.seek)(_of->source,0,SEEK_END); - _of->offset=_of->end=(*_of->callbacks.tell)(_of->source); + (*_of->callbacks.seek)(_of->stream,0,SEEK_END); + _of->offset=_of->end=(*_of->callbacks.tell)(_of->stream); if(OP_UNLIKELY(_of->end<0))return OP_EREAD; data_offset=_of->links[0].data_offset; if(OP_UNLIKELY(_of->end<data_offset))return OP_EBADLINK; @@ -1436,7 +1441,7 @@ static int op_open_seekable2(OggOpusFile *_of){ prev_page_offset=_of->prev_page_offset; start_offset=_of->offset; memcpy(op_start,_of->op,sizeof(*op_start)*start_op_count); - OP_ASSERT((*_of->callbacks.tell)(_of->source)==op_position(_of)); + OP_ASSERT((*_of->callbacks.tell)(_of->stream)==op_position(_of)); ogg_sync_init(&_of->oy); ogg_stream_init(&_of->os,-1); ret=op_open_seekable2_impl(_of); @@ -1454,7 +1459,7 @@ static int op_open_seekable2(OggOpusFile *_of){ _of->cur_discard_count=_of->links[0].head.pre_skip; if(OP_UNLIKELY(ret<0))return ret; /*And restore the position indicator.*/ - ret=(*_of->callbacks.seek)(_of->source,op_position(_of),SEEK_SET); + ret=(*_of->callbacks.seek)(_of->stream,op_position(_of),SEEK_SET); return OP_UNLIKELY(ret<0)?OP_EREAD:0; } @@ -1493,19 +1498,20 @@ static void op_clear(OggOpusFile *_of){ _ogg_free(_of->serialnos); ogg_stream_clear(&_of->os); ogg_sync_clear(&_of->oy); - if(_of->callbacks.close!=NULL)(*_of->callbacks.close)(_of->source); + if(_of->callbacks.close!=NULL)(*_of->callbacks.close)(_of->stream); } static int op_open1(OggOpusFile *_of, - void *_source,const OpusFileCallbacks *_cb, + void *_stream,const OpusFileCallbacks *_cb, const unsigned char *_initial_data,size_t _initial_bytes){ ogg_page og; ogg_page *pog; int seekable; int ret; memset(_of,0,sizeof(*_of)); + if(OP_UNLIKELY(_initial_bytes>(size_t)LONG_MAX))return OP_EFAULT; _of->end=-1; - _of->source=_source; + _of->stream=_stream; *&_of->callbacks=*_cb; /*At a minimum, we need to be able to read data.*/ if(OP_UNLIKELY(_of->callbacks.read==NULL))return OP_EREAD; @@ -1520,18 +1526,18 @@ static int op_open1(OggOpusFile *_of, decoding entire files from RAM.*/ if(_initial_bytes>0){ char *buffer; - buffer=ogg_sync_buffer(&_of->oy,_initial_bytes); + buffer=ogg_sync_buffer(&_of->oy,(long)_initial_bytes); memcpy(buffer,_initial_data,_initial_bytes*sizeof(*buffer)); - ogg_sync_wrote(&_of->oy,_initial_bytes); + ogg_sync_wrote(&_of->oy,(long)_initial_bytes); } /*Can we seek? Stevens suggests the seek test is portable.*/ - seekable=_cb->seek!=NULL&&(*_cb->seek)(_source,0,SEEK_CUR)!=-1; + seekable=_cb->seek!=NULL&&(*_cb->seek)(_stream,0,SEEK_CUR)!=-1; /*If seek is implemented, tell must also be implemented.*/ if(seekable){ opus_int64 pos; if(OP_UNLIKELY(_of->callbacks.tell==NULL))return OP_EINVAL; - pos=(*_of->callbacks.tell)(_of->source); + pos=(*_of->callbacks.tell)(_of->stream); /*If the current position is not equal to the initial bytes consumed, absolute seeking will not work.*/ if(OP_UNLIKELY(pos!=(opus_int64)_initial_bytes))return OP_EINVAL; @@ -1590,14 +1596,14 @@ static int op_open2(OggOpusFile *_of){ return ret; } -OggOpusFile *op_test_callbacks(void *_source,const OpusFileCallbacks *_cb, +OggOpusFile *op_test_callbacks(void *_stream,const OpusFileCallbacks *_cb, const unsigned char *_initial_data,size_t _initial_bytes,int *_error){ OggOpusFile *of; int ret; of=(OggOpusFile *)_ogg_malloc(sizeof(*of)); ret=OP_EFAULT; if(OP_LIKELY(of!=NULL)){ - ret=op_open1(of,_source,_cb,_initial_data,_initial_bytes); + ret=op_open1(of,_stream,_cb,_initial_data,_initial_bytes); if(OP_LIKELY(ret>=0)){ if(_error!=NULL)*_error=0; return of; @@ -1611,10 +1617,10 @@ OggOpusFile *op_test_callbacks(void *_source,const OpusFileCallbacks *_cb, return NULL; } -OggOpusFile *op_open_callbacks(void *_source,const OpusFileCallbacks *_cb, +OggOpusFile *op_open_callbacks(void *_stream,const OpusFileCallbacks *_cb, const unsigned char *_initial_data,size_t _initial_bytes,int *_error){ OggOpusFile *of; - of=op_test_callbacks(_source,_cb,_initial_data,_initial_bytes,_error); + of=op_test_callbacks(_stream,_cb,_initial_data,_initial_bytes,_error); if(OP_LIKELY(of!=NULL)){ int ret; ret=op_open2(of); @@ -1627,15 +1633,15 @@ OggOpusFile *op_open_callbacks(void *_source,const OpusFileCallbacks *_cb, /*Convenience routine to clean up from failure for the open functions that create their own streams.*/ -static OggOpusFile *op_open_close_on_failure(void *_source, +static OggOpusFile *op_open_close_on_failure(void *_stream, const OpusFileCallbacks *_cb,int *_error){ OggOpusFile *of; - if(OP_UNLIKELY(_source==NULL)){ + if(OP_UNLIKELY(_stream==NULL)){ if(_error!=NULL)*_error=OP_EFAULT; return NULL; } - of=op_open_callbacks(_source,_cb,NULL,0,_error); - if(OP_UNLIKELY(of==NULL))(*_cb->close)(_source); + of=op_open_callbacks(_stream,_cb,NULL,0,_error); + if(OP_UNLIKELY(of==NULL))(*_cb->close)(_stream); return of; } @@ -1653,15 +1659,15 @@ OggOpusFile *op_open_memory(const unsigned char *_data,size_t _size, /*Convenience routine to clean up from failure for the open functions that create their own streams.*/ -static OggOpusFile *op_test_close_on_failure(void *_source, +static OggOpusFile *op_test_close_on_failure(void *_stream, const OpusFileCallbacks *_cb,int *_error){ OggOpusFile *of; - if(OP_UNLIKELY(_source==NULL)){ + if(OP_UNLIKELY(_stream==NULL)){ if(_error!=NULL)*_error=OP_EFAULT; return NULL; } - of=op_test_callbacks(_source,_cb,NULL,0,_error); - if(OP_UNLIKELY(of==NULL))(*_cb->close)(_source); + of=op_test_callbacks(_stream,_cb,NULL,0,_error); + if(OP_UNLIKELY(of==NULL))(*_cb->close)(_stream); return of; } @@ -1702,7 +1708,7 @@ int op_link_count(const OggOpusFile *_of){ return _of->nlinks; } -ogg_uint32_t op_serialno(const OggOpusFile *_of,int _li){ +opus_uint32 op_serialno(const OggOpusFile *_of,int _li){ if(OP_UNLIKELY(_li>=_of->nlinks))_li=_of->nlinks-1; if(!_of->seekable)_li=0; return _of->links[_li<0?_of->cur_link:_li].serialno; @@ -1718,13 +1724,14 @@ opus_int64 op_raw_total(const OggOpusFile *_of,int _li){ ||OP_UNLIKELY(_li>=_of->nlinks)){ return OP_EINVAL; } - if(_li<0)return _of->end-_of->links[0].offset; + if(_li<0)return _of->end; return (_li+1>=_of->nlinks?_of->end:_of->links[_li+1].offset) - -_of->links[_li].offset; + -(_li>0?_of->links[_li].offset:0); } ogg_int64_t op_pcm_total(const OggOpusFile *_of,int _li){ OggOpusLink *links; + ogg_int64_t pcm_total; ogg_int64_t diff; int nlinks; nlinks=_of->nlinks; @@ -1737,20 +1744,14 @@ ogg_int64_t op_pcm_total(const OggOpusFile *_of,int _li){ /*We verify that the granule position differences are larger than the pre-skip and that the total duration does not overflow during link enumeration, so we don't have to check here.*/ + pcm_total=0; if(_li<0){ - ogg_int64_t pcm_total; - int li; - pcm_total=0; - for(li=0;li<nlinks;li++){ - OP_ALWAYS_TRUE(!op_granpos_diff(&diff, - links[li].pcm_end,links[li].pcm_start)); - pcm_total+=diff-links[li].head.pre_skip; - } - return pcm_total; + pcm_total=links[nlinks-1].pcm_file_offset; + _li=nlinks-1; } OP_ALWAYS_TRUE(!op_granpos_diff(&diff, links[_li].pcm_end,links[_li].pcm_start)); - return diff-links[_li].head.pre_skip; + return pcm_total+diff-links[_li].head.pre_skip; } const OpusHead *op_head(const OggOpusFile *_of,int _li){ @@ -1820,6 +1821,34 @@ opus_int32 op_bitrate_instant(OggOpusFile *_of){ return ret; } +/*Given a serialno, find a link with a corresponding Opus stream, if it exists. + Return: The index of the link to which the page belongs, or a negative number + if it was not a desired Opus bitstream section.*/ +static int op_get_link_from_serialno(const OggOpusFile *_of,int _cur_link, + opus_int64 _page_offset,ogg_uint32_t _serialno){ + const OggOpusLink *links; + int nlinks; + int li_lo; + int li_hi; + OP_ASSERT(_of->seekable); + links=_of->links; + nlinks=_of->nlinks; + li_lo=0; + /*Start off by guessing we're just a multiplexed page in the current link.*/ + li_hi=_cur_link+1<nlinks&&_page_offset<links[_cur_link+1].offset? + _cur_link+1:nlinks; + do{ + if(_page_offset>=links[_cur_link].offset)li_lo=_cur_link; + else li_hi=_cur_link; + _cur_link=li_lo+(li_hi-li_lo>>1); + } + while(li_hi-li_lo>1); + /*We've identified the link that should contain this page. + Make sure it's a page we care about.*/ + if(links[_cur_link].serialno!=_serialno)return OP_FALSE; + return _cur_link; +} + /*Fetch and process a page. This handles the case where we're at a bitstream boundary and dumps the decoding machine. @@ -1876,19 +1905,28 @@ static int op_fetch_and_process_page(OggOpusFile *_of, if(OP_UNLIKELY(_of->ready_state<OP_STREAMSET)){ if(seekable){ ogg_uint32_t serialno; - int nlinks; - int li; serialno=ogg_page_serialno(&og); - /*Match the serialno to bitstream section. - We use this rather than offset positions to avoid problems near - logical bitstream boundaries.*/ - nlinks=_of->nlinks; - for(li=0;li<nlinks&&links[li].serialno!=serialno;li++); - /*Not a desired Opus bitstream section. - Keep trying.*/ - if(li>=nlinks)continue; + /*Match the serialno to bitstream section.*/ + OP_ASSERT(cur_link>=0&&cur_link<_of->nlinks); + if(links[cur_link].serialno!=serialno){ + /*It wasn't a page from the current link. + Is it from the next one?*/ + if(OP_LIKELY(cur_link+1<_of->nlinks&&links[cur_link+1].serialno== + serialno)){ + cur_link++; + } + else{ + int new_link; + new_link= + op_get_link_from_serialno(_of,cur_link,_page_offset,serialno); + /*Not a desired Opus bitstream section. + Keep trying.*/ + if(new_link<0)continue; + cur_link=new_link; + } + } cur_serialno=serialno; - _of->cur_link=cur_link=li; + _of->cur_link=cur_link; ogg_stream_reset_serialno(&_of->os,serialno); _of->ready_state=OP_STREAMSET; /*If we're at the start of this link, initialize the granule position @@ -1942,13 +1980,32 @@ static int op_fetch_and_process_page(OggOpusFile *_of, opus_int32 total_duration; int durations[255]; int op_count; + int report_hole; + report_hole=0; total_duration=op_collect_audio_packets(_of,durations); if(OP_UNLIKELY(total_duration<0)){ - /*Drain the packets from the page anyway.*/ + /*libogg reported a hole (a gap in the page sequence numbers). + Drain the packets from the page anyway. + If we don't, they'll still be there when we fetch the next page. + Then, when we go to pull out packets, we might get more than 255, + which would overrun our packet buffer.*/ total_duration=op_collect_audio_packets(_of,durations); OP_ASSERT(total_duration>=0); - /*Report holes to the caller.*/ - if(!_ignore_holes)return OP_HOLE; + if(!_ignore_holes){ + /*Report the hole to the caller after we finish timestamping the + packets.*/ + report_hole=1; + /*We had lost or damaged pages, so reset our granule position + tracking. + This makes holes behave the same as a small raw seek. + If the next page is the EOS page, we'll discard it (because we + can't perform end trimming properly), and we'll always discard at + least 80 ms of audio (to allow decoder state to re-converge). + We could try to fill in the gap with PLC by looking at timestamps + in the non-EOS case, but that's complicated and error prone and we + can't rely on the timestamps being valid.*/ + _of->prev_packet_gp=-1; + } } op_count=_of->op_count; /*If we found at least one audio data packet, compute per-packet granule @@ -1975,6 +2032,7 @@ static int op_fetch_and_process_page(OggOpusFile *_of, Proceed to the next link, rather than risk playing back some samples that shouldn't have been played.*/ _of->op_count=0; + if(report_hole)return OP_HOLE; continue; } /*By default discard 80 ms of data after a seek, unless we seek @@ -2020,7 +2078,11 @@ static int op_fetch_and_process_page(OggOpusFile *_of, &&OP_LIKELY(diff<total_duration)){ cur_packet_gp=prev_packet_gp; for(pi=0;pi<op_count;pi++){ - diff=durations[pi]-diff; + /*Check for overflow.*/ + if(diff<0&&OP_UNLIKELY(OP_INT64_MAX+diff<durations[pi])){ + diff=durations[pi]+1; + } + else diff=durations[pi]-diff; /*If we have samples to trim...*/ if(diff>0){ /*If we trimmed the entire packet, stop (the spec says encoders @@ -2076,10 +2138,11 @@ static int op_fetch_and_process_page(OggOpusFile *_of, } _of->prev_packet_gp=prev_packet_gp; _of->prev_page_offset=_page_offset; - _of->op_count=pi; - /*If end-trimming didn't trim all the packets, we're done.*/ - if(OP_LIKELY(pi>0))return 0; + _of->op_count=op_count=pi; } + if(report_hole)return OP_HOLE; + /*If end-trimming didn't trim all the packets, we're done.*/ + if(op_count>0)return 0; } } } @@ -2117,35 +2180,41 @@ static ogg_int64_t op_get_granulepos(const OggOpusFile *_of, ogg_int64_t _pcm_offset,int *_li){ const OggOpusLink *links; ogg_int64_t duration; + ogg_int64_t pcm_start; + opus_int32 pre_skip; int nlinks; - int li; + int li_lo; + int li_hi; OP_ASSERT(_pcm_offset>=0); nlinks=_of->nlinks; links=_of->links; - for(li=0;OP_LIKELY(li<nlinks);li++){ - ogg_int64_t pcm_start; - opus_int32 pre_skip; - pcm_start=links[li].pcm_start; - pre_skip=links[li].head.pre_skip; - OP_ALWAYS_TRUE(!op_granpos_diff(&duration,links[li].pcm_end,pcm_start)); - duration-=pre_skip; - if(_pcm_offset<duration){ - _pcm_offset+=pre_skip; - if(OP_UNLIKELY(pcm_start>OP_INT64_MAX-_pcm_offset)){ - /*Adding this amount to the granule position would overflow the positive - half of its 64-bit range. - Since signed overflow is undefined in C, do it in a way the compiler - isn't allowed to screw up.*/ - _pcm_offset-=OP_INT64_MAX-pcm_start+1; - pcm_start=OP_INT64_MIN; - } - pcm_start+=_pcm_offset; - *_li=li; - return pcm_start; - } - _pcm_offset-=duration; - } - return -1; + li_lo=0; + li_hi=nlinks; + do{ + int li; + li=li_lo+(li_hi-li_lo>>1); + if(links[li].pcm_file_offset<=_pcm_offset)li_lo=li; + else li_hi=li; + } + while(li_hi-li_lo>1); + _pcm_offset-=links[li_lo].pcm_file_offset; + pcm_start=links[li_lo].pcm_start; + pre_skip=links[li_lo].head.pre_skip; + OP_ALWAYS_TRUE(!op_granpos_diff(&duration,links[li_lo].pcm_end,pcm_start)); + duration-=pre_skip; + if(_pcm_offset>=duration)return -1; + _pcm_offset+=pre_skip; + if(OP_UNLIKELY(pcm_start>OP_INT64_MAX-_pcm_offset)){ + /*Adding this amount to the granule position would overflow the positive + half of its 64-bit range. + Since signed overflow is undefined in C, do it in a way the compiler + isn't allowed to screw up.*/ + _pcm_offset-=OP_INT64_MAX-pcm_start+1; + pcm_start=OP_INT64_MIN; + } + pcm_start+=_pcm_offset; + *_li=li_lo; + return pcm_start; } /*A small helper to determine if an Ogg page contains data that continues onto @@ -2532,15 +2601,14 @@ int op_pcm_seek(OggOpusFile *_of,ogg_int64_t _pcm_offset){ ogg_int64_t gp; gp=_of->prev_packet_gp; if(OP_LIKELY(gp!=-1)){ - int nbuffered; + ogg_int64_t discard_count; + int nbuffered; nbuffered=OP_MAX(_of->od_buffer_size-_of->od_buffer_pos,0); OP_ALWAYS_TRUE(!op_granpos_add(&gp,gp,-nbuffered)); /*We do _not_ add cur_discard_count to gp. Otherwise the total amount to discard could grow without bound, and it would be better just to do a full seek.*/ - if(OP_LIKELY(!op_granpos_diff(&diff,gp,pcm_start))){ - ogg_int64_t discard_count; - discard_count=_pcm_offset-diff; + if(OP_LIKELY(!op_granpos_diff(&discard_count,target_gp,gp))){ /*We use a threshold of 90 ms instead of 80, since 80 ms is the _minimum_ we would have discarded after a full seek. Assuming 20 ms frames (the default), we'd discard 90 ms on average.*/ @@ -2606,22 +2674,14 @@ static ogg_int64_t op_get_pcm_offset(const OggOpusFile *_of, ogg_int64_t _gp,int _li){ const OggOpusLink *links; ogg_int64_t pcm_offset; - ogg_int64_t delta; - int li; links=_of->links; - pcm_offset=0; - OP_ASSERT(_li<_of->nlinks); - for(li=0;li<_li;li++){ - OP_ALWAYS_TRUE(!op_granpos_diff(&delta, - links[li].pcm_end,links[li].pcm_start)); - delta-=links[li].head.pre_skip; - pcm_offset+=delta; - } - OP_ASSERT(_li>=0); + OP_ASSERT(_li>=0&&_li<_of->nlinks); + pcm_offset=links[_li].pcm_file_offset; if(_of->seekable&&OP_UNLIKELY(op_granpos_cmp(_gp,links[_li].pcm_end)>0)){ _gp=links[_li].pcm_end; } if(OP_LIKELY(op_granpos_cmp(_gp,links[_li].pcm_start)>0)){ + ogg_int64_t delta; if(OP_UNLIKELY(op_granpos_diff(&delta,_gp,links[_li].pcm_start)<0)){ /*This means an unseekable stream claimed to have a page from more than 2 billion days after we joined.*/ diff --git a/thirdparty/opus/repacketizer.c b/thirdparty/opus/repacketizer.c index c80ee7f001..bda44a148a 100644 --- a/thirdparty/opus/repacketizer.c +++ b/thirdparty/opus/repacketizer.c @@ -213,7 +213,8 @@ opus_int32 opus_repacketizer_out_range_impl(OpusRepacketizer *rp, int begin, int { /* Using OPUS_MOVE() instead of OPUS_COPY() in case we're doing in-place padding from opus_packet_pad or opus_packet_unpad(). */ - celt_assert(frames[i] + len[i] <= data || ptr <= frames[i]); + /* assert disabled because it's not valid in C. */ + /* celt_assert(frames[i] + len[i] <= data || ptr <= frames[i]); */ OPUS_MOVE(ptr, frames[i], len[i]); ptr += len[i]; } diff --git a/thirdparty/opus/silk/A2NLSF.c b/thirdparty/opus/silk/A2NLSF.c index b6e9e5ffcc..b487686ff9 100644 --- a/thirdparty/opus/silk/A2NLSF.c +++ b/thirdparty/opus/silk/A2NLSF.c @@ -40,7 +40,7 @@ POSSIBILITY OF SUCH DAMAGE. /* Number of binary divisions, when not in low complexity mode */ #define BIN_DIV_STEPS_A2NLSF_FIX 3 /* must be no higher than 16 - log2( LSF_COS_TAB_SZ_FIX ) */ -#define MAX_ITERATIONS_A2NLSF_FIX 30 +#define MAX_ITERATIONS_A2NLSF_FIX 16 /* Helper function for A2NLSF(..) */ /* Transforms polynomials from cos(n*f) to cos(f)^n */ @@ -130,7 +130,7 @@ void silk_A2NLSF( const opus_int d /* I Filter order (must be even) */ ) { - opus_int i, k, m, dd, root_ix, ffrac; + opus_int i, k, m, dd, root_ix, ffrac; opus_int32 xlo, xhi, xmid; opus_int32 ylo, yhi, ymid, thr; opus_int32 nom, den; @@ -239,13 +239,13 @@ void silk_A2NLSF( /* Set NLSFs to white spectrum and exit */ NLSF[ 0 ] = (opus_int16)silk_DIV32_16( 1 << 15, d + 1 ); for( k = 1; k < d; k++ ) { - NLSF[ k ] = (opus_int16)silk_SMULBB( k + 1, NLSF[ 0 ] ); + NLSF[ k ] = (opus_int16)silk_ADD16( NLSF[ k-1 ], NLSF[ 0 ] ); } return; } /* Error: Apply progressively more bandwidth expansion and run again */ - silk_bwexpander_32( a_Q16, d, 65536 - silk_SMULBB( 10 + i, i ) ); /* 10_Q16 = 0.00015*/ + silk_bwexpander_32( a_Q16, d, 65536 - silk_LSHIFT( 1, i ) ); silk_A2NLSF_init( a_Q16, P, Q, dd ); p = P; /* Pointer to polynomial */ diff --git a/thirdparty/opus/silk/API.h b/thirdparty/opus/silk/API.h index 0131acbb08..4d90ff9aa3 100644 --- a/thirdparty/opus/silk/API.h +++ b/thirdparty/opus/silk/API.h @@ -80,7 +80,8 @@ opus_int silk_Encode( /* O Returns error co opus_int nSamplesIn, /* I Number of samples in input vector */ ec_enc *psRangeEnc, /* I/O Compressor data structure */ opus_int32 *nBytesOut, /* I/O Number of bytes in payload (input: Max bytes) */ - const opus_int prefillFlag /* I Flag to indicate prefilling buffers no coding */ + const opus_int prefillFlag, /* I Flag to indicate prefilling buffers no coding */ + int activity /* I Decision of Opus voice activity detector */ ); /****************************************/ diff --git a/thirdparty/opus/silk/CNG.c b/thirdparty/opus/silk/CNG.c index 8443ad63bb..ef8e38df9f 100644 --- a/thirdparty/opus/silk/CNG.c +++ b/thirdparty/opus/silk/CNG.c @@ -138,16 +138,16 @@ void silk_CNG( gain_Q16 = silk_LSHIFT32( silk_SQRT_APPROX( gain_Q16 ), 8 ); } gain_Q10 = silk_RSHIFT( gain_Q16, 6 ); - + silk_CNG_exc( CNG_sig_Q14 + MAX_LPC_ORDER, psCNG->CNG_exc_buf_Q14, length, &psCNG->rand_seed ); /* Convert CNG NLSF to filter representation */ - silk_NLSF2A( A_Q12, psCNG->CNG_smth_NLSF_Q15, psDec->LPC_order ); + silk_NLSF2A( A_Q12, psCNG->CNG_smth_NLSF_Q15, psDec->LPC_order, psDec->arch ); /* Generate CNG signal, by synthesis filtering */ silk_memcpy( CNG_sig_Q14, psCNG->CNG_synth_state, MAX_LPC_ORDER * sizeof( opus_int32 ) ); + celt_assert( psDec->LPC_order == 10 || psDec->LPC_order == 16 ); for( i = 0; i < length; i++ ) { - silk_assert( psDec->LPC_order == 10 || psDec->LPC_order == 16 ); /* Avoids introducing a bias because silk_SMLAWB() always rounds to -inf */ LPC_pred_Q10 = silk_RSHIFT( psDec->LPC_order, 1 ); LPC_pred_Q10 = silk_SMLAWB( LPC_pred_Q10, CNG_sig_Q14[ MAX_LPC_ORDER + i - 1 ], A_Q12[ 0 ] ); @@ -170,11 +170,11 @@ void silk_CNG( } /* Update states */ - CNG_sig_Q14[ MAX_LPC_ORDER + i ] = silk_ADD_LSHIFT( CNG_sig_Q14[ MAX_LPC_ORDER + i ], LPC_pred_Q10, 4 ); - + CNG_sig_Q14[ MAX_LPC_ORDER + i ] = silk_ADD_SAT32( CNG_sig_Q14[ MAX_LPC_ORDER + i ], silk_LSHIFT_SAT32( LPC_pred_Q10, 4 ) ); + /* Scale with Gain and add to input signal */ frame[ i ] = (opus_int16)silk_ADD_SAT16( frame[ i ], silk_SAT16( silk_RSHIFT_ROUND( silk_SMULWW( CNG_sig_Q14[ MAX_LPC_ORDER + i ], gain_Q10 ), 8 ) ) ); - + } silk_memcpy( psCNG->CNG_synth_state, &CNG_sig_Q14[ length ], MAX_LPC_ORDER * sizeof( opus_int32 ) ); } else { diff --git a/thirdparty/opus/silk/LPC_analysis_filter.c b/thirdparty/opus/silk/LPC_analysis_filter.c index 20906673ff..d34b5eb709 100644 --- a/thirdparty/opus/silk/LPC_analysis_filter.c +++ b/thirdparty/opus/silk/LPC_analysis_filter.c @@ -39,6 +39,13 @@ POSSIBILITY OF SUCH DAMAGE. /* first d output samples are set to zero */ /*******************************************/ +/* OPT: Using celt_fir() for this function should be faster, but it may cause + integer overflows in intermediate values (not final results), which the + current implementation silences by casting to unsigned. Enabling + this should be safe in pretty much all cases, even though it is not technically + C89-compliant. */ +#define USE_CELT_FIR 0 + void silk_LPC_analysis_filter( opus_int16 *out, /* O Output signal */ const opus_int16 *in, /* I Input signal */ @@ -49,8 +56,7 @@ void silk_LPC_analysis_filter( ) { opus_int j; -#ifdef FIXED_POINT - opus_int16 mem[SILK_MAX_ORDER_LPC]; +#if defined(FIXED_POINT) && USE_CELT_FIR opus_int16 num[SILK_MAX_ORDER_LPC]; #else int ix; @@ -58,19 +64,16 @@ void silk_LPC_analysis_filter( const opus_int16 *in_ptr; #endif - silk_assert( d >= 6 ); - silk_assert( (d & 1) == 0 ); - silk_assert( d <= len ); + celt_assert( d >= 6 ); + celt_assert( (d & 1) == 0 ); + celt_assert( d <= len ); -#ifdef FIXED_POINT - silk_assert( d <= SILK_MAX_ORDER_LPC ); +#if defined(FIXED_POINT) && USE_CELT_FIR + celt_assert( d <= SILK_MAX_ORDER_LPC ); for ( j = 0; j < d; j++ ) { num[ j ] = -B[ j ]; } - for (j=0;j<d;j++) { - mem[ j ] = in[ d - j - 1 ]; - } - celt_fir( in + d, num, out + d, len - d, d, mem, arch ); + celt_fir( in + d, num, out + d, len - d, d, arch ); for ( j = 0; j < d; j++ ) { out[ j ] = 0; } diff --git a/thirdparty/opus/silk/LPC_fit.c b/thirdparty/opus/silk/LPC_fit.c new file mode 100644 index 0000000000..cdea4f3abc --- /dev/null +++ b/thirdparty/opus/silk/LPC_fit.c @@ -0,0 +1,81 @@ +/*********************************************************************** +Copyright (c) 2013, Koen Vos. All rights reserved. +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +- Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. +- Neither the name of Internet Society, IETF or IETF Trust, nor the +names of specific contributors, may be used to endorse or promote +products derived from this software without specific prior written +permission. +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. +***********************************************************************/ + +#ifdef HAVE_CONFIG_H +#include "config.h" +#endif + +#include "SigProc_FIX.h" + +/* Convert int32 coefficients to int16 coefs and make sure there's no wrap-around */ +void silk_LPC_fit( + opus_int16 *a_QOUT, /* O Output signal */ + opus_int32 *a_QIN, /* I/O Input signal */ + const opus_int QOUT, /* I Input Q domain */ + const opus_int QIN, /* I Input Q domain */ + const opus_int d /* I Filter order */ +) +{ + opus_int i, k, idx = 0; + opus_int32 maxabs, absval, chirp_Q16; + + /* Limit the maximum absolute value of the prediction coefficients, so that they'll fit in int16 */ + for( i = 0; i < 10; i++ ) { + /* Find maximum absolute value and its index */ + maxabs = 0; + for( k = 0; k < d; k++ ) { + absval = silk_abs( a_QIN[k] ); + if( absval > maxabs ) { + maxabs = absval; + idx = k; + } + } + maxabs = silk_RSHIFT_ROUND( maxabs, QIN - QOUT ); + + if( maxabs > silk_int16_MAX ) { + /* Reduce magnitude of prediction coefficients */ + maxabs = silk_min( maxabs, 163838 ); /* ( silk_int32_MAX >> 14 ) + silk_int16_MAX = 163838 */ + chirp_Q16 = SILK_FIX_CONST( 0.999, 16 ) - silk_DIV32( silk_LSHIFT( maxabs - silk_int16_MAX, 14 ), + silk_RSHIFT32( silk_MUL( maxabs, idx + 1), 2 ) ); + silk_bwexpander_32( a_QIN, d, chirp_Q16 ); + } else { + break; + } + } + + if( i == 10 ) { + /* Reached the last iteration, clip the coefficients */ + for( k = 0; k < d; k++ ) { + a_QOUT[ k ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( a_QIN[ k ], QIN - QOUT ) ); + a_QIN[ k ] = silk_LSHIFT( (opus_int32)a_QOUT[ k ], QIN - QOUT ); + } + } else { + for( k = 0; k < d; k++ ) { + a_QOUT[ k ] = (opus_int16)silk_RSHIFT_ROUND( a_QIN[ k ], QIN - QOUT ); + } + } +} diff --git a/thirdparty/opus/silk/LPC_inv_pred_gain.c b/thirdparty/opus/silk/LPC_inv_pred_gain.c index 4af89aa5fa..a3746a6ef9 100644 --- a/thirdparty/opus/silk/LPC_inv_pred_gain.c +++ b/thirdparty/opus/silk/LPC_inv_pred_gain.c @@ -30,6 +30,7 @@ POSSIBILITY OF SUCH DAMAGE. #endif #include "SigProc_FIX.h" +#include "define.h" #define QA 24 #define A_LIMIT SILK_FIX_CONST( 0.99975, QA ) @@ -38,117 +39,103 @@ POSSIBILITY OF SUCH DAMAGE. /* Compute inverse of LPC prediction gain, and */ /* test if LPC coefficients are stable (all poles within unit circle) */ -static opus_int32 LPC_inverse_pred_gain_QA( /* O Returns inverse prediction gain in energy domain, Q30 */ - opus_int32 A_QA[ 2 ][ SILK_MAX_ORDER_LPC ], /* I Prediction coefficients */ +static opus_int32 LPC_inverse_pred_gain_QA_c( /* O Returns inverse prediction gain in energy domain, Q30 */ + opus_int32 A_QA[ SILK_MAX_ORDER_LPC ], /* I Prediction coefficients */ const opus_int order /* I Prediction order */ ) { opus_int k, n, mult2Q; - opus_int32 invGain_Q30, rc_Q31, rc_mult1_Q30, rc_mult2, tmp_QA; - opus_int32 *Aold_QA, *Anew_QA; + opus_int32 invGain_Q30, rc_Q31, rc_mult1_Q30, rc_mult2, tmp1, tmp2; - Anew_QA = A_QA[ order & 1 ]; - - invGain_Q30 = (opus_int32)1 << 30; + invGain_Q30 = SILK_FIX_CONST( 1, 30 ); for( k = order - 1; k > 0; k-- ) { /* Check for stability */ - if( ( Anew_QA[ k ] > A_LIMIT ) || ( Anew_QA[ k ] < -A_LIMIT ) ) { + if( ( A_QA[ k ] > A_LIMIT ) || ( A_QA[ k ] < -A_LIMIT ) ) { return 0; } /* Set RC equal to negated AR coef */ - rc_Q31 = -silk_LSHIFT( Anew_QA[ k ], 31 - QA ); + rc_Q31 = -silk_LSHIFT( A_QA[ k ], 31 - QA ); /* rc_mult1_Q30 range: [ 1 : 2^30 ] */ - rc_mult1_Q30 = ( (opus_int32)1 << 30 ) - silk_SMMUL( rc_Q31, rc_Q31 ); + rc_mult1_Q30 = silk_SUB32( SILK_FIX_CONST( 1, 30 ), silk_SMMUL( rc_Q31, rc_Q31 ) ); silk_assert( rc_mult1_Q30 > ( 1 << 15 ) ); /* reduce A_LIMIT if fails */ silk_assert( rc_mult1_Q30 <= ( 1 << 30 ) ); - /* rc_mult2 range: [ 2^30 : silk_int32_MAX ] */ - mult2Q = 32 - silk_CLZ32( silk_abs( rc_mult1_Q30 ) ); - rc_mult2 = silk_INVERSE32_varQ( rc_mult1_Q30, mult2Q + 30 ); - /* Update inverse gain */ /* invGain_Q30 range: [ 0 : 2^30 ] */ invGain_Q30 = silk_LSHIFT( silk_SMMUL( invGain_Q30, rc_mult1_Q30 ), 2 ); silk_assert( invGain_Q30 >= 0 ); silk_assert( invGain_Q30 <= ( 1 << 30 ) ); + if( invGain_Q30 < SILK_FIX_CONST( 1.0f / MAX_PREDICTION_POWER_GAIN, 30 ) ) { + return 0; + } - /* Swap pointers */ - Aold_QA = Anew_QA; - Anew_QA = A_QA[ k & 1 ]; + /* rc_mult2 range: [ 2^30 : silk_int32_MAX ] */ + mult2Q = 32 - silk_CLZ32( silk_abs( rc_mult1_Q30 ) ); + rc_mult2 = silk_INVERSE32_varQ( rc_mult1_Q30, mult2Q + 30 ); /* Update AR coefficient */ - for( n = 0; n < k; n++ ) { - tmp_QA = Aold_QA[ n ] - MUL32_FRAC_Q( Aold_QA[ k - n - 1 ], rc_Q31, 31 ); - Anew_QA[ n ] = MUL32_FRAC_Q( tmp_QA, rc_mult2 , mult2Q ); + for( n = 0; n < (k + 1) >> 1; n++ ) { + opus_int64 tmp64; + tmp1 = A_QA[ n ]; + tmp2 = A_QA[ k - n - 1 ]; + tmp64 = silk_RSHIFT_ROUND64( silk_SMULL( silk_SUB_SAT32(tmp1, + MUL32_FRAC_Q( tmp2, rc_Q31, 31 ) ), rc_mult2 ), mult2Q); + if( tmp64 > silk_int32_MAX || tmp64 < silk_int32_MIN ) { + return 0; + } + A_QA[ n ] = ( opus_int32 )tmp64; + tmp64 = silk_RSHIFT_ROUND64( silk_SMULL( silk_SUB_SAT32(tmp2, + MUL32_FRAC_Q( tmp1, rc_Q31, 31 ) ), rc_mult2), mult2Q); + if( tmp64 > silk_int32_MAX || tmp64 < silk_int32_MIN ) { + return 0; + } + A_QA[ k - n - 1 ] = ( opus_int32 )tmp64; } } /* Check for stability */ - if( ( Anew_QA[ 0 ] > A_LIMIT ) || ( Anew_QA[ 0 ] < -A_LIMIT ) ) { + if( ( A_QA[ k ] > A_LIMIT ) || ( A_QA[ k ] < -A_LIMIT ) ) { return 0; } /* Set RC equal to negated AR coef */ - rc_Q31 = -silk_LSHIFT( Anew_QA[ 0 ], 31 - QA ); + rc_Q31 = -silk_LSHIFT( A_QA[ 0 ], 31 - QA ); /* Range: [ 1 : 2^30 ] */ - rc_mult1_Q30 = ( (opus_int32)1 << 30 ) - silk_SMMUL( rc_Q31, rc_Q31 ); + rc_mult1_Q30 = silk_SUB32( SILK_FIX_CONST( 1, 30 ), silk_SMMUL( rc_Q31, rc_Q31 ) ); /* Update inverse gain */ /* Range: [ 0 : 2^30 ] */ invGain_Q30 = silk_LSHIFT( silk_SMMUL( invGain_Q30, rc_mult1_Q30 ), 2 ); - silk_assert( invGain_Q30 >= 0 ); - silk_assert( invGain_Q30 <= 1<<30 ); + silk_assert( invGain_Q30 >= 0 ); + silk_assert( invGain_Q30 <= ( 1 << 30 ) ); + if( invGain_Q30 < SILK_FIX_CONST( 1.0f / MAX_PREDICTION_POWER_GAIN, 30 ) ) { + return 0; + } return invGain_Q30; } /* For input in Q12 domain */ -opus_int32 silk_LPC_inverse_pred_gain( /* O Returns inverse prediction gain in energy domain, Q30 */ +opus_int32 silk_LPC_inverse_pred_gain_c( /* O Returns inverse prediction gain in energy domain, Q30 */ const opus_int16 *A_Q12, /* I Prediction coefficients, Q12 [order] */ const opus_int order /* I Prediction order */ ) { opus_int k; - opus_int32 Atmp_QA[ 2 ][ SILK_MAX_ORDER_LPC ]; - opus_int32 *Anew_QA; + opus_int32 Atmp_QA[ SILK_MAX_ORDER_LPC ]; opus_int32 DC_resp = 0; - Anew_QA = Atmp_QA[ order & 1 ]; - /* Increase Q domain of the AR coefficients */ for( k = 0; k < order; k++ ) { DC_resp += (opus_int32)A_Q12[ k ]; - Anew_QA[ k ] = silk_LSHIFT32( (opus_int32)A_Q12[ k ], QA - 12 ); + Atmp_QA[ k ] = silk_LSHIFT32( (opus_int32)A_Q12[ k ], QA - 12 ); } /* If the DC is unstable, we don't even need to do the full calculations */ if( DC_resp >= 4096 ) { return 0; } - return LPC_inverse_pred_gain_QA( Atmp_QA, order ); + return LPC_inverse_pred_gain_QA_c( Atmp_QA, order ); } - -#ifdef FIXED_POINT - -/* For input in Q24 domain */ -opus_int32 silk_LPC_inverse_pred_gain_Q24( /* O Returns inverse prediction gain in energy domain, Q30 */ - const opus_int32 *A_Q24, /* I Prediction coefficients [order] */ - const opus_int order /* I Prediction order */ -) -{ - opus_int k; - opus_int32 Atmp_QA[ 2 ][ SILK_MAX_ORDER_LPC ]; - opus_int32 *Anew_QA; - - Anew_QA = Atmp_QA[ order & 1 ]; - - /* Increase Q domain of the AR coefficients */ - for( k = 0; k < order; k++ ) { - Anew_QA[ k ] = silk_RSHIFT32( A_Q24[ k ], 24 - QA ); - } - - return LPC_inverse_pred_gain_QA( Atmp_QA, order ); -} -#endif diff --git a/thirdparty/opus/silk/LP_variable_cutoff.c b/thirdparty/opus/silk/LP_variable_cutoff.c index f639e1f899..79112ad354 100644 --- a/thirdparty/opus/silk/LP_variable_cutoff.c +++ b/thirdparty/opus/silk/LP_variable_cutoff.c @@ -130,6 +130,6 @@ void silk_LP_variable_cutoff( /* ARMA low-pass filtering */ silk_assert( TRANSITION_NB == 3 && TRANSITION_NA == 2 ); - silk_biquad_alt( frame, B_Q28, A_Q28, psLP->In_LP_State, frame, frame_length, 1); + silk_biquad_alt_stride1( frame, B_Q28, A_Q28, psLP->In_LP_State, frame, frame_length); } } diff --git a/thirdparty/opus/silk/MacroCount.h b/thirdparty/opus/silk/MacroCount.h index 834817d058..78100ffede 100644 --- a/thirdparty/opus/silk/MacroCount.h +++ b/thirdparty/opus/silk/MacroCount.h @@ -319,14 +319,6 @@ static OPUS_INLINE opus_int32 silk_ADD_POS_SAT32(opus_int64 a, opus_int64 b){ return(tmp); } -#undef silk_ADD_POS_SAT64 -static OPUS_INLINE opus_int64 silk_ADD_POS_SAT64(opus_int64 a, opus_int64 b){ - opus_int64 tmp; - ops_count += 1; - tmp = ((((a)+(b)) & 0x8000000000000000LL) ? silk_int64_MAX : ((a)+(b))); - return(tmp); -} - #undef silk_LSHIFT8 static OPUS_INLINE opus_int8 silk_LSHIFT8(opus_int8 a, opus_int32 shift){ opus_int8 ret; @@ -699,7 +691,7 @@ return(ret); #undef silk_LIMIT_32 -static OPUS_INLINE opus_int silk_LIMIT_32(opus_int32 a, opus_int32 limit1, opus_int32 limit2) +static OPUS_INLINE opus_int32 silk_LIMIT_32(opus_int32 a, opus_int32 limit1, opus_int32 limit2) { opus_int32 ret; ops_count += 6; diff --git a/thirdparty/opus/silk/MacroDebug.h b/thirdparty/opus/silk/MacroDebug.h index 35aedc5c5f..8dd4ce2ee2 100644 --- a/thirdparty/opus/silk/MacroDebug.h +++ b/thirdparty/opus/silk/MacroDebug.h @@ -539,8 +539,7 @@ static OPUS_INLINE opus_int32 silk_DIV32_16_(opus_int32 a32, opus_int32 b32, cha no checking needed for silk_POS_SAT32 no checking needed for silk_ADD_POS_SAT8 no checking needed for silk_ADD_POS_SAT16 - no checking needed for silk_ADD_POS_SAT32 - no checking needed for silk_ADD_POS_SAT64 */ + no checking needed for silk_ADD_POS_SAT32 */ #undef silk_LSHIFT8 #define silk_LSHIFT8(a,b) silk_LSHIFT8_((a), (b), __FILE__, __LINE__) diff --git a/thirdparty/opus/silk/NLSF2A.c b/thirdparty/opus/silk/NLSF2A.c index b1c559ea68..d5b7730638 100644 --- a/thirdparty/opus/silk/NLSF2A.c +++ b/thirdparty/opus/silk/NLSF2A.c @@ -66,7 +66,8 @@ static OPUS_INLINE void silk_NLSF2A_find_poly( void silk_NLSF2A( opus_int16 *a_Q12, /* O monic whitening filter coefficients in Q12, [ d ] */ const opus_int16 *NLSF, /* I normalized line spectral frequencies in Q15, [ d ] */ - const opus_int d /* I filter order (should be even) */ + const opus_int d, /* I filter order (should be even) */ + int arch /* I Run-time architecture */ ) { /* This ordering was found to maximize quality. It improves numerical accuracy of @@ -83,15 +84,14 @@ void silk_NLSF2A( opus_int32 P[ SILK_MAX_ORDER_LPC / 2 + 1 ], Q[ SILK_MAX_ORDER_LPC / 2 + 1 ]; opus_int32 Ptmp, Qtmp, f_int, f_frac, cos_val, delta; opus_int32 a32_QA1[ SILK_MAX_ORDER_LPC ]; - opus_int32 maxabs, absval, idx=0, sc_Q16; silk_assert( LSF_COS_TAB_SZ_FIX == 128 ); - silk_assert( d==10||d==16 ); + celt_assert( d==10 || d==16 ); /* convert LSFs to 2*cos(LSF), using piecewise linear curve from table */ ordering = d == 16 ? ordering16 : ordering10; for( k = 0; k < d; k++ ) { - silk_assert(NLSF[k] >= 0 ); + silk_assert( NLSF[k] >= 0 ); /* f_int on a scale 0-127 (rounded down) */ f_int = silk_RSHIFT( NLSF[k], 15 - 7 ); @@ -126,52 +126,15 @@ void silk_NLSF2A( a32_QA1[ d-k-1 ] = Qtmp - Ptmp; /* QA+1 */ } - /* Limit the maximum absolute value of the prediction coefficients, so that they'll fit in int16 */ - for( i = 0; i < 10; i++ ) { - /* Find maximum absolute value and its index */ - maxabs = 0; - for( k = 0; k < d; k++ ) { - absval = silk_abs( a32_QA1[k] ); - if( absval > maxabs ) { - maxabs = absval; - idx = k; - } - } - maxabs = silk_RSHIFT_ROUND( maxabs, QA + 1 - 12 ); /* QA+1 -> Q12 */ - - if( maxabs > silk_int16_MAX ) { - /* Reduce magnitude of prediction coefficients */ - maxabs = silk_min( maxabs, 163838 ); /* ( silk_int32_MAX >> 14 ) + silk_int16_MAX = 163838 */ - sc_Q16 = SILK_FIX_CONST( 0.999, 16 ) - silk_DIV32( silk_LSHIFT( maxabs - silk_int16_MAX, 14 ), - silk_RSHIFT32( silk_MUL( maxabs, idx + 1), 2 ) ); - silk_bwexpander_32( a32_QA1, d, sc_Q16 ); - } else { - break; - } - } + /* Convert int32 coefficients to Q12 int16 coefs */ + silk_LPC_fit( a_Q12, a32_QA1, 12, QA + 1, d ); - if( i == 10 ) { - /* Reached the last iteration, clip the coefficients */ + for( i = 0; silk_LPC_inverse_pred_gain( a_Q12, d, arch ) == 0 && i < MAX_LPC_STABILIZE_ITERATIONS; i++ ) { + /* Prediction coefficients are (too close to) unstable; apply bandwidth expansion */ + /* on the unscaled coefficients, convert to Q12 and measure again */ + silk_bwexpander_32( a32_QA1, d, 65536 - silk_LSHIFT( 2, i ) ); for( k = 0; k < d; k++ ) { - a_Q12[ k ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( a32_QA1[ k ], QA + 1 - 12 ) ); /* QA+1 -> Q12 */ - a32_QA1[ k ] = silk_LSHIFT( (opus_int32)a_Q12[ k ], QA + 1 - 12 ); - } - } else { - for( k = 0; k < d; k++ ) { - a_Q12[ k ] = (opus_int16)silk_RSHIFT_ROUND( a32_QA1[ k ], QA + 1 - 12 ); /* QA+1 -> Q12 */ - } - } - - for( i = 0; i < MAX_LPC_STABILIZE_ITERATIONS; i++ ) { - if( silk_LPC_inverse_pred_gain( a_Q12, d ) < SILK_FIX_CONST( 1.0 / MAX_PREDICTION_POWER_GAIN, 30 ) ) { - /* Prediction coefficients are (too close to) unstable; apply bandwidth expansion */ - /* on the unscaled coefficients, convert to Q12 and measure again */ - silk_bwexpander_32( a32_QA1, d, 65536 - silk_LSHIFT( 2, i ) ); - for( k = 0; k < d; k++ ) { - a_Q12[ k ] = (opus_int16)silk_RSHIFT_ROUND( a32_QA1[ k ], QA + 1 - 12 ); /* QA+1 -> Q12 */ - } - } else { - break; + a_Q12[ k ] = (opus_int16)silk_RSHIFT_ROUND( a32_QA1[ k ], QA + 1 - 12 ); /* QA+1 -> Q12 */ } } } diff --git a/thirdparty/opus/silk/NLSF_VQ.c b/thirdparty/opus/silk/NLSF_VQ.c index 69b6e22e18..b83182a79c 100644 --- a/thirdparty/opus/silk/NLSF_VQ.c +++ b/thirdparty/opus/silk/NLSF_VQ.c @@ -33,36 +33,44 @@ POSSIBILITY OF SUCH DAMAGE. /* Compute quantization errors for an LPC_order element input vector for a VQ codebook */ void silk_NLSF_VQ( - opus_int32 err_Q26[], /* O Quantization errors [K] */ + opus_int32 err_Q24[], /* O Quantization errors [K] */ const opus_int16 in_Q15[], /* I Input vectors to be quantized [LPC_order] */ const opus_uint8 pCB_Q8[], /* I Codebook vectors [K*LPC_order] */ + const opus_int16 pWght_Q9[], /* I Codebook weights [K*LPC_order] */ const opus_int K, /* I Number of codebook vectors */ const opus_int LPC_order /* I Number of LPCs */ ) { - opus_int i, m; - opus_int32 diff_Q15, sum_error_Q30, sum_error_Q26; + opus_int i, m; + opus_int32 diff_Q15, diffw_Q24, sum_error_Q24, pred_Q24; + const opus_int16 *w_Q9_ptr; + const opus_uint8 *cb_Q8_ptr; - silk_assert( LPC_order <= 16 ); - silk_assert( ( LPC_order & 1 ) == 0 ); + celt_assert( ( LPC_order & 1 ) == 0 ); /* Loop over codebook */ + cb_Q8_ptr = pCB_Q8; + w_Q9_ptr = pWght_Q9; for( i = 0; i < K; i++ ) { - sum_error_Q26 = 0; - for( m = 0; m < LPC_order; m += 2 ) { - /* Compute weighted squared quantization error for index m */ - diff_Q15 = silk_SUB_LSHIFT32( in_Q15[ m ], (opus_int32)*pCB_Q8++, 7 ); /* range: [ -32767 : 32767 ]*/ - sum_error_Q30 = silk_SMULBB( diff_Q15, diff_Q15 ); + sum_error_Q24 = 0; + pred_Q24 = 0; + for( m = LPC_order-2; m >= 0; m -= 2 ) { + /* Compute weighted absolute predictive quantization error for index m + 1 */ + diff_Q15 = silk_SUB_LSHIFT32( in_Q15[ m + 1 ], (opus_int32)cb_Q8_ptr[ m + 1 ], 7 ); /* range: [ -32767 : 32767 ]*/ + diffw_Q24 = silk_SMULBB( diff_Q15, w_Q9_ptr[ m + 1 ] ); + sum_error_Q24 = silk_ADD32( sum_error_Q24, silk_abs( silk_SUB_RSHIFT32( diffw_Q24, pred_Q24, 1 ) ) ); + pred_Q24 = diffw_Q24; - /* Compute weighted squared quantization error for index m + 1 */ - diff_Q15 = silk_SUB_LSHIFT32( in_Q15[m + 1], (opus_int32)*pCB_Q8++, 7 ); /* range: [ -32767 : 32767 ]*/ - sum_error_Q30 = silk_SMLABB( sum_error_Q30, diff_Q15, diff_Q15 ); + /* Compute weighted absolute predictive quantization error for index m */ + diff_Q15 = silk_SUB_LSHIFT32( in_Q15[ m ], (opus_int32)cb_Q8_ptr[ m ], 7 ); /* range: [ -32767 : 32767 ]*/ + diffw_Q24 = silk_SMULBB( diff_Q15, w_Q9_ptr[ m ] ); + sum_error_Q24 = silk_ADD32( sum_error_Q24, silk_abs( silk_SUB_RSHIFT32( diffw_Q24, pred_Q24, 1 ) ) ); + pred_Q24 = diffw_Q24; - sum_error_Q26 = silk_ADD_RSHIFT32( sum_error_Q26, sum_error_Q30, 4 ); - - silk_assert( sum_error_Q26 >= 0 ); - silk_assert( sum_error_Q30 >= 0 ); + silk_assert( sum_error_Q24 >= 0 ); } - err_Q26[ i ] = sum_error_Q26; + err_Q24[ i ] = sum_error_Q24; + cb_Q8_ptr += LPC_order; + w_Q9_ptr += LPC_order; } } diff --git a/thirdparty/opus/silk/NLSF_VQ_weights_laroia.c b/thirdparty/opus/silk/NLSF_VQ_weights_laroia.c index 04894c59ab..9873bcde10 100644 --- a/thirdparty/opus/silk/NLSF_VQ_weights_laroia.c +++ b/thirdparty/opus/silk/NLSF_VQ_weights_laroia.c @@ -48,8 +48,8 @@ void silk_NLSF_VQ_weights_laroia( opus_int k; opus_int32 tmp1_int, tmp2_int; - silk_assert( D > 0 ); - silk_assert( ( D & 1 ) == 0 ); + celt_assert( D > 0 ); + celt_assert( ( D & 1 ) == 0 ); /* First value */ tmp1_int = silk_max_int( pNLSF_Q15[ 0 ], 1 ); diff --git a/thirdparty/opus/silk/NLSF_decode.c b/thirdparty/opus/silk/NLSF_decode.c index 9f715060b8..eeb0ba8c92 100644 --- a/thirdparty/opus/silk/NLSF_decode.c +++ b/thirdparty/opus/silk/NLSF_decode.c @@ -32,7 +32,7 @@ POSSIBILITY OF SUCH DAMAGE. #include "main.h" /* Predictive dequantizer for NLSF residuals */ -static OPUS_INLINE void silk_NLSF_residual_dequant( /* O Returns RD value in Q30 */ +static OPUS_INLINE void silk_NLSF_residual_dequant( /* O Returns RD value in Q30 */ opus_int16 x_Q10[], /* O Output [ order ] */ const opus_int8 indices[], /* I Quantization indices [ order ] */ const opus_uint8 pred_coef_Q8[], /* I Backward predictor coefs [ order ] */ @@ -70,15 +70,9 @@ void silk_NLSF_decode( opus_uint8 pred_Q8[ MAX_LPC_ORDER ]; opus_int16 ec_ix[ MAX_LPC_ORDER ]; opus_int16 res_Q10[ MAX_LPC_ORDER ]; - opus_int16 W_tmp_QW[ MAX_LPC_ORDER ]; - opus_int32 W_tmp_Q9, NLSF_Q15_tmp; + opus_int32 NLSF_Q15_tmp; const opus_uint8 *pCB_element; - - /* Decode first stage */ - pCB_element = &psNLSF_CB->CB1_NLSF_Q8[ NLSFIndices[ 0 ] * psNLSF_CB->order ]; - for( i = 0; i < psNLSF_CB->order; i++ ) { - pNLSF_Q15[ i ] = silk_LSHIFT( (opus_int16)pCB_element[ i ], 7 ); - } + const opus_int16 *pCB_Wght_Q9; /* Unpack entropy table indices and predictor for current CB1 index */ silk_NLSF_unpack( ec_ix, pred_Q8, psNLSF_CB, NLSFIndices[ 0 ] ); @@ -86,13 +80,11 @@ void silk_NLSF_decode( /* Predictive residual dequantizer */ silk_NLSF_residual_dequant( res_Q10, &NLSFIndices[ 1 ], pred_Q8, psNLSF_CB->quantStepSize_Q16, psNLSF_CB->order ); - /* Weights from codebook vector */ - silk_NLSF_VQ_weights_laroia( W_tmp_QW, pNLSF_Q15, psNLSF_CB->order ); - - /* Apply inverse square-rooted weights and add to output */ + /* Apply inverse square-rooted weights to first stage and add to output */ + pCB_element = &psNLSF_CB->CB1_NLSF_Q8[ NLSFIndices[ 0 ] * psNLSF_CB->order ]; + pCB_Wght_Q9 = &psNLSF_CB->CB1_Wght_Q9[ NLSFIndices[ 0 ] * psNLSF_CB->order ]; for( i = 0; i < psNLSF_CB->order; i++ ) { - W_tmp_Q9 = silk_SQRT_APPROX( silk_LSHIFT( (opus_int32)W_tmp_QW[ i ], 18 - NLSF_W_Q ) ); - NLSF_Q15_tmp = silk_ADD32( pNLSF_Q15[ i ], silk_DIV32_16( silk_LSHIFT( (opus_int32)res_Q10[ i ], 14 ), W_tmp_Q9 ) ); + NLSF_Q15_tmp = silk_ADD_LSHIFT32( silk_DIV32_16( silk_LSHIFT( (opus_int32)res_Q10[ i ], 14 ), pCB_Wght_Q9[ i ] ), (opus_int16)pCB_element[ i ], 7 ); pNLSF_Q15[ i ] = (opus_int16)silk_LIMIT( NLSF_Q15_tmp, 0, 32767 ); } diff --git a/thirdparty/opus/silk/NLSF_del_dec_quant.c b/thirdparty/opus/silk/NLSF_del_dec_quant.c index de88fee060..44a16acd0b 100644 --- a/thirdparty/opus/silk/NLSF_del_dec_quant.c +++ b/thirdparty/opus/silk/NLSF_del_dec_quant.c @@ -84,7 +84,7 @@ opus_int32 silk_NLSF_del_dec_quant( /* O Returns nStates = 1; RD_Q25[ 0 ] = 0; prev_out_Q10[ 0 ] = 0; - for( i = order - 1; ; i-- ) { + for( i = order - 1; i >= 0; i-- ) { rates_Q5 = &ec_rates_Q5[ ec_ix[ i ] ]; in_Q10 = x_Q10[ i ]; for( j = 0; j < nStates; j++ ) { @@ -131,7 +131,7 @@ opus_int32 silk_NLSF_del_dec_quant( /* O Returns RD_Q25[ j + nStates ] = silk_SMLABB( silk_MLA( RD_tmp_Q25, silk_SMULBB( diff_Q10, diff_Q10 ), w_Q5[ i ] ), mu_Q20, rate1_Q5 ); } - if( nStates <= ( NLSF_QUANT_DEL_DEC_STATES >> 1 ) ) { + if( nStates <= NLSF_QUANT_DEL_DEC_STATES/2 ) { /* double number of states and copy */ for( j = 0; j < nStates; j++ ) { ind[ j + nStates ][ i ] = ind[ j ][ i ] + 1; @@ -140,7 +140,7 @@ opus_int32 silk_NLSF_del_dec_quant( /* O Returns for( j = nStates; j < NLSF_QUANT_DEL_DEC_STATES; j++ ) { ind[ j ][ i ] = ind[ j - nStates ][ i ]; } - } else if( i > 0 ) { + } else { /* sort lower and upper half of RD_Q25, pairwise */ for( j = 0; j < NLSF_QUANT_DEL_DEC_STATES; j++ ) { if( RD_Q25[ j ] > RD_Q25[ j + NLSF_QUANT_DEL_DEC_STATES ] ) { @@ -191,8 +191,6 @@ opus_int32 silk_NLSF_del_dec_quant( /* O Returns for( j = 0; j < NLSF_QUANT_DEL_DEC_STATES; j++ ) { ind[ j ][ i ] += silk_RSHIFT( ind_sort[ j ], NLSF_QUANT_DEL_DEC_STATES_LOG2 ); } - } else { /* i == 0 */ - break; } } diff --git a/thirdparty/opus/silk/NLSF_encode.c b/thirdparty/opus/silk/NLSF_encode.c index f03c3f1c35..01ac7db78c 100644 --- a/thirdparty/opus/silk/NLSF_encode.c +++ b/thirdparty/opus/silk/NLSF_encode.c @@ -37,9 +37,9 @@ POSSIBILITY OF SUCH DAMAGE. /***********************/ opus_int32 silk_NLSF_encode( /* O Returns RD value in Q25 */ opus_int8 *NLSFIndices, /* I Codebook path vector [ LPC_ORDER + 1 ] */ - opus_int16 *pNLSF_Q15, /* I/O Quantized NLSF vector [ LPC_ORDER ] */ + opus_int16 *pNLSF_Q15, /* I/O (Un)quantized NLSF vector [ LPC_ORDER ] */ const silk_NLSF_CB_struct *psNLSF_CB, /* I Codebook object */ - const opus_int16 *pW_QW, /* I NLSF weight vector [ LPC_ORDER ] */ + const opus_int16 *pW_Q2, /* I NLSF weight vector [ LPC_ORDER ] */ const opus_int NLSF_mu_Q20, /* I Rate weight for the RD optimization */ const opus_int nSurvivors, /* I Max survivors after first stage */ const opus_int signalType /* I Signal type: 0/1/2 */ @@ -47,34 +47,32 @@ opus_int32 silk_NLSF_encode( /* O Returns { opus_int i, s, ind1, bestIndex, prob_Q8, bits_q7; opus_int32 W_tmp_Q9, ret; - VARDECL( opus_int32, err_Q26 ); + VARDECL( opus_int32, err_Q24 ); VARDECL( opus_int32, RD_Q25 ); VARDECL( opus_int, tempIndices1 ); VARDECL( opus_int8, tempIndices2 ); - opus_int16 res_Q15[ MAX_LPC_ORDER ]; opus_int16 res_Q10[ MAX_LPC_ORDER ]; opus_int16 NLSF_tmp_Q15[ MAX_LPC_ORDER ]; - opus_int16 W_tmp_QW[ MAX_LPC_ORDER ]; opus_int16 W_adj_Q5[ MAX_LPC_ORDER ]; opus_uint8 pred_Q8[ MAX_LPC_ORDER ]; opus_int16 ec_ix[ MAX_LPC_ORDER ]; const opus_uint8 *pCB_element, *iCDF_ptr; + const opus_int16 *pCB_Wght_Q9; SAVE_STACK; - silk_assert( nSurvivors <= NLSF_VQ_MAX_SURVIVORS ); - silk_assert( signalType >= 0 && signalType <= 2 ); + celt_assert( signalType >= 0 && signalType <= 2 ); silk_assert( NLSF_mu_Q20 <= 32767 && NLSF_mu_Q20 >= 0 ); /* NLSF stabilization */ silk_NLSF_stabilize( pNLSF_Q15, psNLSF_CB->deltaMin_Q15, psNLSF_CB->order ); /* First stage: VQ */ - ALLOC( err_Q26, psNLSF_CB->nVectors, opus_int32 ); - silk_NLSF_VQ( err_Q26, pNLSF_Q15, psNLSF_CB->CB1_NLSF_Q8, psNLSF_CB->nVectors, psNLSF_CB->order ); + ALLOC( err_Q24, psNLSF_CB->nVectors, opus_int32 ); + silk_NLSF_VQ( err_Q24, pNLSF_Q15, psNLSF_CB->CB1_NLSF_Q8, psNLSF_CB->CB1_Wght_Q9, psNLSF_CB->nVectors, psNLSF_CB->order ); /* Sort the quantization errors */ ALLOC( tempIndices1, nSurvivors, opus_int ); - silk_insertion_sort_increasing( err_Q26, tempIndices1, psNLSF_CB->nVectors, nSurvivors ); + silk_insertion_sort_increasing( err_Q24, tempIndices1, psNLSF_CB->nVectors, nSurvivors ); ALLOC( RD_Q25, nSurvivors, opus_int32 ); ALLOC( tempIndices2, nSurvivors * MAX_LPC_ORDER, opus_int8 ); @@ -85,23 +83,12 @@ opus_int32 silk_NLSF_encode( /* O Returns /* Residual after first stage */ pCB_element = &psNLSF_CB->CB1_NLSF_Q8[ ind1 * psNLSF_CB->order ]; + pCB_Wght_Q9 = &psNLSF_CB->CB1_Wght_Q9[ ind1 * psNLSF_CB->order ]; for( i = 0; i < psNLSF_CB->order; i++ ) { NLSF_tmp_Q15[ i ] = silk_LSHIFT16( (opus_int16)pCB_element[ i ], 7 ); - res_Q15[ i ] = pNLSF_Q15[ i ] - NLSF_tmp_Q15[ i ]; - } - - /* Weights from codebook vector */ - silk_NLSF_VQ_weights_laroia( W_tmp_QW, NLSF_tmp_Q15, psNLSF_CB->order ); - - /* Apply square-rooted weights */ - for( i = 0; i < psNLSF_CB->order; i++ ) { - W_tmp_Q9 = silk_SQRT_APPROX( silk_LSHIFT( (opus_int32)W_tmp_QW[ i ], 18 - NLSF_W_Q ) ); - res_Q10[ i ] = (opus_int16)silk_RSHIFT( silk_SMULBB( res_Q15[ i ], W_tmp_Q9 ), 14 ); - } - - /* Modify input weights accordingly */ - for( i = 0; i < psNLSF_CB->order; i++ ) { - W_adj_Q5[ i ] = silk_DIV32_16( silk_LSHIFT( (opus_int32)pW_QW[ i ], 5 ), W_tmp_QW[ i ] ); + W_tmp_Q9 = pCB_Wght_Q9[ i ]; + res_Q10[ i ] = (opus_int16)silk_RSHIFT( silk_SMULBB( pNLSF_Q15[ i ] - NLSF_tmp_Q15[ i ], W_tmp_Q9 ), 14 ); + W_adj_Q5[ i ] = silk_DIV32_varQ( (opus_int32)pW_Q2[ i ], silk_SMULBB( W_tmp_Q9, W_tmp_Q9 ), 21 ); } /* Unpack entropy table indices and predictor for current CB1 index */ diff --git a/thirdparty/opus/silk/NSQ.c b/thirdparty/opus/silk/NSQ.c index 43e3fee7e0..1d64d8e257 100644 --- a/thirdparty/opus/silk/NSQ.c +++ b/thirdparty/opus/silk/NSQ.c @@ -37,7 +37,7 @@ POSSIBILITY OF SUCH DAMAGE. static OPUS_INLINE void silk_nsq_scale_states( const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ - const opus_int32 x_Q3[], /* I input in Q3 */ + const opus_int16 x16[], /* I input */ opus_int32 x_sc_Q10[], /* O input scaled with 1/Gain */ const opus_int16 sLTP[], /* I re-whitened LTP state in Q0 */ opus_int32 sLTP_Q15[], /* O LTP state matching scaled input */ @@ -75,14 +75,14 @@ static OPUS_INLINE void silk_noise_shape_quantizer( void silk_NSQ_c ( - const silk_encoder_state *psEncC, /* I/O Encoder State */ + const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ SideInfoIndices *psIndices, /* I/O Quantization Indices */ - const opus_int32 x_Q3[], /* I Prefiltered input signal */ + const opus_int16 x16[], /* I Input */ opus_int8 pulses[], /* O Quantized pulse signal */ const opus_int16 PredCoef_Q12[ 2 * MAX_LPC_ORDER ], /* I Short term prediction coefs */ const opus_int16 LTPCoef_Q14[ LTP_ORDER * MAX_NB_SUBFR ], /* I Long term prediction coefs */ - const opus_int16 AR2_Q13[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ], /* I Noise shaping coefs */ + const opus_int16 AR_Q13[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ], /* I Noise shaping coefs */ const opus_int HarmShapeGain_Q14[ MAX_NB_SUBFR ], /* I Long term shaping coefs */ const opus_int Tilt_Q14[ MAX_NB_SUBFR ], /* I Spectral tilt */ const opus_int32 LF_shp_Q14[ MAX_NB_SUBFR ], /* I Low frequency shaping coefs */ @@ -117,8 +117,7 @@ void silk_NSQ_c LSF_interpolation_flag = 1; } - ALLOC( sLTP_Q15, - psEncC->ltp_mem_length + psEncC->frame_length, opus_int32 ); + ALLOC( sLTP_Q15, psEncC->ltp_mem_length + psEncC->frame_length, opus_int32 ); ALLOC( sLTP, psEncC->ltp_mem_length + psEncC->frame_length, opus_int16 ); ALLOC( x_sc_Q10, psEncC->subfr_length, opus_int32 ); /* Set up pointers to start of sub frame */ @@ -128,7 +127,7 @@ void silk_NSQ_c for( k = 0; k < psEncC->nb_subfr; k++ ) { A_Q12 = &PredCoef_Q12[ (( k >> 1 ) | ( 1 - LSF_interpolation_flag )) * MAX_LPC_ORDER ]; B_Q14 = <PCoef_Q14[ k * LTP_ORDER ]; - AR_shp_Q13 = &AR2_Q13[ k * MAX_SHAPE_LPC_ORDER ]; + AR_shp_Q13 = &AR_Q13[ k * MAX_SHAPE_LPC_ORDER ]; /* Noise shape parameters */ silk_assert( HarmShapeGain_Q14[ k ] >= 0 ); @@ -144,7 +143,7 @@ void silk_NSQ_c if( ( k & ( 3 - silk_LSHIFT( LSF_interpolation_flag, 1 ) ) ) == 0 ) { /* Rewhiten with new A coefs */ start_idx = psEncC->ltp_mem_length - lag - psEncC->predictLPCOrder - LTP_ORDER / 2; - silk_assert( start_idx > 0 ); + celt_assert( start_idx > 0 ); silk_LPC_analysis_filter( &sLTP[ start_idx ], &NSQ->xq[ start_idx + k * psEncC->subfr_length ], A_Q12, psEncC->ltp_mem_length - start_idx, psEncC->predictLPCOrder, psEncC->arch ); @@ -154,13 +153,13 @@ void silk_NSQ_c } } - silk_nsq_scale_states( psEncC, NSQ, x_Q3, x_sc_Q10, sLTP, sLTP_Q15, k, LTP_scale_Q14, Gains_Q16, pitchL, psIndices->signalType ); + silk_nsq_scale_states( psEncC, NSQ, x16, x_sc_Q10, sLTP, sLTP_Q15, k, LTP_scale_Q14, Gains_Q16, pitchL, psIndices->signalType ); silk_noise_shape_quantizer( NSQ, psIndices->signalType, x_sc_Q10, pulses, pxq, sLTP_Q15, A_Q12, B_Q14, AR_shp_Q13, lag, HarmShapeFIRPacked_Q14, Tilt_Q14[ k ], LF_shp_Q14[ k ], Gains_Q16[ k ], Lambda_Q10, offset_Q10, psEncC->subfr_length, psEncC->shapingLPCOrder, psEncC->predictLPCOrder, psEncC->arch ); - x_Q3 += psEncC->subfr_length; + x16 += psEncC->subfr_length; pulses += psEncC->subfr_length; pxq += psEncC->subfr_length; } @@ -169,7 +168,6 @@ void silk_NSQ_c NSQ->lagPrev = pitchL[ psEncC->nb_subfr - 1 ]; /* Save quantized speech and noise shaping signals */ - /* DEBUG_STORE_DATA( enc.pcm, &NSQ->xq[ psEncC->ltp_mem_length ], psEncC->frame_length * sizeof( opus_int16 ) ) */ silk_memmove( NSQ->xq, &NSQ->xq[ psEncC->frame_length ], psEncC->ltp_mem_length * sizeof( opus_int16 ) ); silk_memmove( NSQ->sLTP_shp_Q14, &NSQ->sLTP_shp_Q14[ psEncC->frame_length ], psEncC->ltp_mem_length * sizeof( opus_int32 ) ); RESTORE_STACK; @@ -249,15 +247,15 @@ void silk_noise_shape_quantizer( } /* Noise shape feedback */ - silk_assert( ( shapingLPCOrder & 1 ) == 0 ); /* check that order is even */ - n_AR_Q12 = silk_NSQ_noise_shape_feedback_loop(psLPC_Q14, NSQ->sAR2_Q14, AR_shp_Q13, shapingLPCOrder, arch); + celt_assert( ( shapingLPCOrder & 1 ) == 0 ); /* check that order is even */ + n_AR_Q12 = silk_NSQ_noise_shape_feedback_loop(&NSQ->sDiff_shp_Q14, NSQ->sAR2_Q14, AR_shp_Q13, shapingLPCOrder, arch); n_AR_Q12 = silk_SMLAWB( n_AR_Q12, NSQ->sLF_AR_shp_Q14, Tilt_Q14 ); n_LF_Q12 = silk_SMULWB( NSQ->sLTP_shp_Q14[ NSQ->sLTP_shp_buf_idx - 1 ], LF_shp_Q14 ); n_LF_Q12 = silk_SMLAWT( n_LF_Q12, NSQ->sLF_AR_shp_Q14, LF_shp_Q14 ); - silk_assert( lag > 0 || signalType != TYPE_VOICED ); + celt_assert( lag > 0 || signalType != TYPE_VOICED ); /* Combine prediction and noise shaping signals */ tmp1 = silk_SUB32( silk_LSHIFT32( LPC_pred_Q10, 2 ), n_AR_Q12 ); /* Q12 */ @@ -279,14 +277,27 @@ void silk_noise_shape_quantizer( r_Q10 = silk_SUB32( x_sc_Q10[ i ], tmp1 ); /* residual error Q10 */ /* Flip sign depending on dither */ - if ( NSQ->rand_seed < 0 ) { - r_Q10 = -r_Q10; + if( NSQ->rand_seed < 0 ) { + r_Q10 = -r_Q10; } r_Q10 = silk_LIMIT_32( r_Q10, -(31 << 10), 30 << 10 ); /* Find two quantization level candidates and measure their rate-distortion */ q1_Q10 = silk_SUB32( r_Q10, offset_Q10 ); q1_Q0 = silk_RSHIFT( q1_Q10, 10 ); + if (Lambda_Q10 > 2048) { + /* For aggressive RDO, the bias becomes more than one pulse. */ + int rdo_offset = Lambda_Q10/2 - 512; + if (q1_Q10 > rdo_offset) { + q1_Q0 = silk_RSHIFT( q1_Q10 - rdo_offset, 10 ); + } else if (q1_Q10 < -rdo_offset) { + q1_Q0 = silk_RSHIFT( q1_Q10 + rdo_offset, 10 ); + } else if (q1_Q10 < 0) { + q1_Q0 = -1; + } else { + q1_Q0 = 0; + } + } if( q1_Q0 > 0 ) { q1_Q10 = silk_SUB32( silk_LSHIFT( q1_Q0, 10 ), QUANT_LEVEL_ADJUST_Q10 ); q1_Q10 = silk_ADD32( q1_Q10, offset_Q10 ); @@ -337,7 +348,8 @@ void silk_noise_shape_quantizer( /* Update states */ psLPC_Q14++; *psLPC_Q14 = xq_Q14; - sLF_AR_shp_Q14 = silk_SUB_LSHIFT32( xq_Q14, n_AR_Q12, 2 ); + NSQ->sDiff_shp_Q14 = silk_SUB_LSHIFT32( xq_Q14, x_sc_Q10[ i ], 4 ); + sLF_AR_shp_Q14 = silk_SUB_LSHIFT32( NSQ->sDiff_shp_Q14, n_AR_Q12, 2 ); NSQ->sLF_AR_shp_Q14 = sLF_AR_shp_Q14; NSQ->sLTP_shp_Q14[ NSQ->sLTP_shp_buf_idx ] = silk_SUB_LSHIFT32( sLF_AR_shp_Q14, n_LF_Q12, 2 ); @@ -356,7 +368,7 @@ void silk_noise_shape_quantizer( static OPUS_INLINE void silk_nsq_scale_states( const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ - const opus_int32 x_Q3[], /* I input in Q3 */ + const opus_int16 x16[], /* I input */ opus_int32 x_sc_Q10[], /* O input scaled with 1/Gain */ const opus_int16 sLTP[], /* I re-whitened LTP state in Q0 */ opus_int32 sLTP_Q15[], /* O LTP state matching scaled input */ @@ -368,28 +380,18 @@ static OPUS_INLINE void silk_nsq_scale_states( ) { opus_int i, lag; - opus_int32 gain_adj_Q16, inv_gain_Q31, inv_gain_Q23; + opus_int32 gain_adj_Q16, inv_gain_Q31, inv_gain_Q26; lag = pitchL[ subfr ]; inv_gain_Q31 = silk_INVERSE32_varQ( silk_max( Gains_Q16[ subfr ], 1 ), 47 ); silk_assert( inv_gain_Q31 != 0 ); - /* Calculate gain adjustment factor */ - if( Gains_Q16[ subfr ] != NSQ->prev_gain_Q16 ) { - gain_adj_Q16 = silk_DIV32_varQ( NSQ->prev_gain_Q16, Gains_Q16[ subfr ], 16 ); - } else { - gain_adj_Q16 = (opus_int32)1 << 16; - } - /* Scale input */ - inv_gain_Q23 = silk_RSHIFT_ROUND( inv_gain_Q31, 8 ); + inv_gain_Q26 = silk_RSHIFT_ROUND( inv_gain_Q31, 5 ); for( i = 0; i < psEncC->subfr_length; i++ ) { - x_sc_Q10[ i ] = silk_SMULWW( x_Q3[ i ], inv_gain_Q23 ); + x_sc_Q10[ i ] = silk_SMULWW( x16[ i ], inv_gain_Q26 ); } - /* Save inverse gain */ - NSQ->prev_gain_Q16 = Gains_Q16[ subfr ]; - /* After rewhitening the LTP state is un-scaled, so scale with inv_gain_Q16 */ if( NSQ->rewhite_flag ) { if( subfr == 0 ) { @@ -403,7 +405,9 @@ static OPUS_INLINE void silk_nsq_scale_states( } /* Adjust for changing gain */ - if( gain_adj_Q16 != (opus_int32)1 << 16 ) { + if( Gains_Q16[ subfr ] != NSQ->prev_gain_Q16 ) { + gain_adj_Q16 = silk_DIV32_varQ( NSQ->prev_gain_Q16, Gains_Q16[ subfr ], 16 ); + /* Scale long-term shaping state */ for( i = NSQ->sLTP_shp_buf_idx - psEncC->ltp_mem_length; i < NSQ->sLTP_shp_buf_idx; i++ ) { NSQ->sLTP_shp_Q14[ i ] = silk_SMULWW( gain_adj_Q16, NSQ->sLTP_shp_Q14[ i ] ); @@ -417,6 +421,7 @@ static OPUS_INLINE void silk_nsq_scale_states( } NSQ->sLF_AR_shp_Q14 = silk_SMULWW( gain_adj_Q16, NSQ->sLF_AR_shp_Q14 ); + NSQ->sDiff_shp_Q14 = silk_SMULWW( gain_adj_Q16, NSQ->sDiff_shp_Q14 ); /* Scale short-term prediction and shaping states */ for( i = 0; i < NSQ_LPC_BUF_LENGTH; i++ ) { @@ -425,5 +430,8 @@ static OPUS_INLINE void silk_nsq_scale_states( for( i = 0; i < MAX_SHAPE_LPC_ORDER; i++ ) { NSQ->sAR2_Q14[ i ] = silk_SMULWW( gain_adj_Q16, NSQ->sAR2_Q14[ i ] ); } + + /* Save inverse gain */ + NSQ->prev_gain_Q16 = Gains_Q16[ subfr ]; } } diff --git a/thirdparty/opus/silk/NSQ_del_dec.c b/thirdparty/opus/silk/NSQ_del_dec.c index ab6feeac98..3fd9fa0d5b 100644 --- a/thirdparty/opus/silk/NSQ_del_dec.c +++ b/thirdparty/opus/silk/NSQ_del_dec.c @@ -43,6 +43,7 @@ typedef struct { opus_int32 Shape_Q14[ DECISION_DELAY ]; opus_int32 sAR2_Q14[ MAX_SHAPE_LPC_ORDER ]; opus_int32 LF_AR_Q14; + opus_int32 Diff_Q14; opus_int32 Seed; opus_int32 SeedInit; opus_int32 RD_Q10; @@ -53,6 +54,7 @@ typedef struct { opus_int32 RD_Q10; opus_int32 xq_Q14; opus_int32 LF_AR_Q14; + opus_int32 Diff_Q14; opus_int32 sLTP_shp_Q14; opus_int32 LPC_exc_Q14; } NSQ_sample_struct; @@ -66,7 +68,7 @@ static OPUS_INLINE void silk_nsq_del_dec_scale_states( const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ NSQ_del_dec_struct psDelDec[], /* I/O Delayed decision states */ - const opus_int32 x_Q3[], /* I Input in Q3 */ + const opus_int16 x16[], /* I Input */ opus_int32 x_sc_Q10[], /* O Input scaled with 1/Gain in Q10 */ const opus_int16 sLTP[], /* I Re-whitened LTP state in Q0 */ opus_int32 sLTP_Q15[], /* O LTP state matching scaled input */ @@ -107,20 +109,20 @@ static OPUS_INLINE void silk_noise_shape_quantizer_del_dec( opus_int predictLPCOrder, /* I Prediction filter order */ opus_int warping_Q16, /* I */ opus_int nStatesDelayedDecision, /* I Number of states in decision tree */ - opus_int *smpl_buf_idx, /* I Index to newest samples in buffers */ + opus_int *smpl_buf_idx, /* I/O Index to newest samples in buffers */ opus_int decisionDelay, /* I */ int arch /* I */ ); void silk_NSQ_del_dec_c( - const silk_encoder_state *psEncC, /* I/O Encoder State */ + const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ SideInfoIndices *psIndices, /* I/O Quantization Indices */ - const opus_int32 x_Q3[], /* I Prefiltered input signal */ + const opus_int16 x16[], /* I Input */ opus_int8 pulses[], /* O Quantized pulse signal */ const opus_int16 PredCoef_Q12[ 2 * MAX_LPC_ORDER ], /* I Short term prediction coefs */ const opus_int16 LTPCoef_Q14[ LTP_ORDER * MAX_NB_SUBFR ], /* I Long term prediction coefs */ - const opus_int16 AR2_Q13[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ], /* I Noise shaping coefs */ + const opus_int16 AR_Q13[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ], /* I Noise shaping coefs */ const opus_int HarmShapeGain_Q14[ MAX_NB_SUBFR ], /* I Long term shaping coefs */ const opus_int Tilt_Q14[ MAX_NB_SUBFR ], /* I Spectral tilt */ const opus_int32 LF_shp_Q14[ MAX_NB_SUBFR ], /* I Low frequency shaping coefs */ @@ -159,6 +161,7 @@ void silk_NSQ_del_dec_c( psDD->SeedInit = psDD->Seed; psDD->RD_Q10 = 0; psDD->LF_AR_Q14 = NSQ->sLF_AR_shp_Q14; + psDD->Diff_Q14 = NSQ->sDiff_shp_Q14; psDD->Shape_Q14[ 0 ] = NSQ->sLTP_shp_Q14[ psEncC->ltp_mem_length - 1 ]; silk_memcpy( psDD->sLPC_Q14, NSQ->sLPC_Q14, NSQ_LPC_BUF_LENGTH * sizeof( opus_int32 ) ); silk_memcpy( psDD->sAR2_Q14, NSQ->sAR2_Q14, sizeof( NSQ->sAR2_Q14 ) ); @@ -186,8 +189,7 @@ void silk_NSQ_del_dec_c( LSF_interpolation_flag = 1; } - ALLOC( sLTP_Q15, - psEncC->ltp_mem_length + psEncC->frame_length, opus_int32 ); + ALLOC( sLTP_Q15, psEncC->ltp_mem_length + psEncC->frame_length, opus_int32 ); ALLOC( sLTP, psEncC->ltp_mem_length + psEncC->frame_length, opus_int16 ); ALLOC( x_sc_Q10, psEncC->subfr_length, opus_int32 ); ALLOC( delayedGain_Q10, DECISION_DELAY, opus_int32 ); @@ -199,7 +201,7 @@ void silk_NSQ_del_dec_c( for( k = 0; k < psEncC->nb_subfr; k++ ) { A_Q12 = &PredCoef_Q12[ ( ( k >> 1 ) | ( 1 - LSF_interpolation_flag ) ) * MAX_LPC_ORDER ]; B_Q14 = <PCoef_Q14[ k * LTP_ORDER ]; - AR_shp_Q13 = &AR2_Q13[ k * MAX_SHAPE_LPC_ORDER ]; + AR_shp_Q13 = &AR_Q13[ k * MAX_SHAPE_LPC_ORDER ]; /* Noise shape parameters */ silk_assert( HarmShapeGain_Q14[ k ] >= 0 ); @@ -235,7 +237,8 @@ void silk_NSQ_del_dec_c( psDD = &psDelDec[ Winner_ind ]; last_smple_idx = smpl_buf_idx + decisionDelay; for( i = 0; i < decisionDelay; i++ ) { - last_smple_idx = ( last_smple_idx - 1 ) & DECISION_DELAY_MASK; + last_smple_idx = ( last_smple_idx - 1 ) % DECISION_DELAY; + if( last_smple_idx < 0 ) last_smple_idx += DECISION_DELAY; pulses[ i - decisionDelay ] = (opus_int8)silk_RSHIFT_ROUND( psDD->Q_Q10[ last_smple_idx ], 10 ); pxq[ i - decisionDelay ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( silk_SMULWW( psDD->Xq_Q14[ last_smple_idx ], Gains_Q16[ 1 ] ), 14 ) ); @@ -247,7 +250,7 @@ void silk_NSQ_del_dec_c( /* Rewhiten with new A coefs */ start_idx = psEncC->ltp_mem_length - lag - psEncC->predictLPCOrder - LTP_ORDER / 2; - silk_assert( start_idx > 0 ); + celt_assert( start_idx > 0 ); silk_LPC_analysis_filter( &sLTP[ start_idx ], &NSQ->xq[ start_idx + k * psEncC->subfr_length ], A_Q12, psEncC->ltp_mem_length - start_idx, psEncC->predictLPCOrder, psEncC->arch ); @@ -257,7 +260,7 @@ void silk_NSQ_del_dec_c( } } - silk_nsq_del_dec_scale_states( psEncC, NSQ, psDelDec, x_Q3, x_sc_Q10, sLTP, sLTP_Q15, k, + silk_nsq_del_dec_scale_states( psEncC, NSQ, psDelDec, x16, x_sc_Q10, sLTP, sLTP_Q15, k, psEncC->nStatesDelayedDecision, LTP_scale_Q14, Gains_Q16, pitchL, psIndices->signalType, decisionDelay ); silk_noise_shape_quantizer_del_dec( NSQ, psDelDec, psIndices->signalType, x_sc_Q10, pulses, pxq, sLTP_Q15, @@ -265,7 +268,7 @@ void silk_NSQ_del_dec_c( Gains_Q16[ k ], Lambda_Q10, offset_Q10, psEncC->subfr_length, subfr++, psEncC->shapingLPCOrder, psEncC->predictLPCOrder, psEncC->warping_Q16, psEncC->nStatesDelayedDecision, &smpl_buf_idx, decisionDelay, psEncC->arch ); - x_Q3 += psEncC->subfr_length; + x16 += psEncC->subfr_length; pulses += psEncC->subfr_length; pxq += psEncC->subfr_length; } @@ -286,7 +289,9 @@ void silk_NSQ_del_dec_c( last_smple_idx = smpl_buf_idx + decisionDelay; Gain_Q10 = silk_RSHIFT32( Gains_Q16[ psEncC->nb_subfr - 1 ], 6 ); for( i = 0; i < decisionDelay; i++ ) { - last_smple_idx = ( last_smple_idx - 1 ) & DECISION_DELAY_MASK; + last_smple_idx = ( last_smple_idx - 1 ) % DECISION_DELAY; + if( last_smple_idx < 0 ) last_smple_idx += DECISION_DELAY; + pulses[ i - decisionDelay ] = (opus_int8)silk_RSHIFT_ROUND( psDD->Q_Q10[ last_smple_idx ], 10 ); pxq[ i - decisionDelay ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( silk_SMULWW( psDD->Xq_Q14[ last_smple_idx ], Gain_Q10 ), 8 ) ); @@ -297,10 +302,10 @@ void silk_NSQ_del_dec_c( /* Update states */ NSQ->sLF_AR_shp_Q14 = psDD->LF_AR_Q14; + NSQ->sDiff_shp_Q14 = psDD->Diff_Q14; NSQ->lagPrev = pitchL[ psEncC->nb_subfr - 1 ]; /* Save quantized speech signal */ - /* DEBUG_STORE_DATA( enc.pcm, &NSQ->xq[psEncC->ltp_mem_length], psEncC->frame_length * sizeof( opus_int16 ) ) */ silk_memmove( NSQ->xq, &NSQ->xq[ psEncC->frame_length ], psEncC->ltp_mem_length * sizeof( opus_int16 ) ); silk_memmove( NSQ->sLTP_shp_Q14, &NSQ->sLTP_shp_Q14[ psEncC->frame_length ], psEncC->ltp_mem_length * sizeof( opus_int32 ) ); RESTORE_STACK; @@ -335,7 +340,7 @@ static OPUS_INLINE void silk_noise_shape_quantizer_del_dec( opus_int predictLPCOrder, /* I Prediction filter order */ opus_int warping_Q16, /* I */ opus_int nStatesDelayedDecision, /* I Number of states in decision tree */ - opus_int *smpl_buf_idx, /* I Index to newest samples in buffers */ + opus_int *smpl_buf_idx, /* I/O Index to newest samples in buffers */ opus_int decisionDelay, /* I */ int arch /* I */ ) @@ -356,7 +361,7 @@ static OPUS_INLINE void silk_noise_shape_quantizer_del_dec( NSQ_sample_struct *psSS; SAVE_STACK; - silk_assert( nStatesDelayedDecision > 0 ); + celt_assert( nStatesDelayedDecision > 0 ); ALLOC( psSampleState, nStatesDelayedDecision, NSQ_sample_pair ); shp_lag_ptr = &NSQ->sLTP_shp_Q14[ NSQ->sLTP_shp_buf_idx - lag + HARM_SHAPE_FIR_TAPS / 2 ]; @@ -414,9 +419,9 @@ static OPUS_INLINE void silk_noise_shape_quantizer_del_dec( LPC_pred_Q14 = silk_LSHIFT( LPC_pred_Q14, 4 ); /* Q10 -> Q14 */ /* Noise shape feedback */ - silk_assert( ( shapingLPCOrder & 1 ) == 0 ); /* check that order is even */ + celt_assert( ( shapingLPCOrder & 1 ) == 0 ); /* check that order is even */ /* Output of lowpass section */ - tmp2 = silk_SMLAWB( psLPC_Q14[ 0 ], psDD->sAR2_Q14[ 0 ], warping_Q16 ); + tmp2 = silk_SMLAWB( psDD->Diff_Q14, psDD->sAR2_Q14[ 0 ], warping_Q16 ); /* Output of allpass section */ tmp1 = silk_SMLAWB( psDD->sAR2_Q14[ 0 ], psDD->sAR2_Q14[ 1 ] - tmp2, warping_Q16 ); psDD->sAR2_Q14[ 0 ] = tmp2; @@ -462,6 +467,19 @@ static OPUS_INLINE void silk_noise_shape_quantizer_del_dec( /* Find two quantization level candidates and measure their rate-distortion */ q1_Q10 = silk_SUB32( r_Q10, offset_Q10 ); q1_Q0 = silk_RSHIFT( q1_Q10, 10 ); + if (Lambda_Q10 > 2048) { + /* For aggressive RDO, the bias becomes more than one pulse. */ + int rdo_offset = Lambda_Q10/2 - 512; + if (q1_Q10 > rdo_offset) { + q1_Q0 = silk_RSHIFT( q1_Q10 - rdo_offset, 10 ); + } else if (q1_Q10 < -rdo_offset) { + q1_Q0 = silk_RSHIFT( q1_Q10 + rdo_offset, 10 ); + } else if (q1_Q10 < 0) { + q1_Q0 = -1; + } else { + q1_Q0 = 0; + } + } if( q1_Q0 > 0 ) { q1_Q10 = silk_SUB32( silk_LSHIFT( q1_Q0, 10 ), QUANT_LEVEL_ADJUST_Q10 ); q1_Q10 = silk_ADD32( q1_Q10, offset_Q10 ); @@ -515,7 +533,8 @@ static OPUS_INLINE void silk_noise_shape_quantizer_del_dec( xq_Q14 = silk_ADD32( LPC_exc_Q14, LPC_pred_Q14 ); /* Update states */ - sLF_AR_shp_Q14 = silk_SUB32( xq_Q14, n_AR_Q14 ); + psSS[ 0 ].Diff_Q14 = silk_SUB_LSHIFT32( xq_Q14, x_Q10[ i ], 4 ); + sLF_AR_shp_Q14 = silk_SUB32( psSS[ 0 ].Diff_Q14, n_AR_Q14 ); psSS[ 0 ].sLTP_shp_Q14 = silk_SUB32( sLF_AR_shp_Q14, n_LF_Q14 ); psSS[ 0 ].LF_AR_Q14 = sLF_AR_shp_Q14; psSS[ 0 ].LPC_exc_Q14 = LPC_exc_Q14; @@ -529,21 +548,22 @@ static OPUS_INLINE void silk_noise_shape_quantizer_del_dec( exc_Q14 = -exc_Q14; } - /* Add predictions */ LPC_exc_Q14 = silk_ADD32( exc_Q14, LTP_pred_Q14 ); xq_Q14 = silk_ADD32( LPC_exc_Q14, LPC_pred_Q14 ); /* Update states */ - sLF_AR_shp_Q14 = silk_SUB32( xq_Q14, n_AR_Q14 ); + psSS[ 1 ].Diff_Q14 = silk_SUB_LSHIFT32( xq_Q14, x_Q10[ i ], 4 ); + sLF_AR_shp_Q14 = silk_SUB32( psSS[ 1 ].Diff_Q14, n_AR_Q14 ); psSS[ 1 ].sLTP_shp_Q14 = silk_SUB32( sLF_AR_shp_Q14, n_LF_Q14 ); psSS[ 1 ].LF_AR_Q14 = sLF_AR_shp_Q14; psSS[ 1 ].LPC_exc_Q14 = LPC_exc_Q14; psSS[ 1 ].xq_Q14 = xq_Q14; } - *smpl_buf_idx = ( *smpl_buf_idx - 1 ) & DECISION_DELAY_MASK; /* Index to newest samples */ - last_smple_idx = ( *smpl_buf_idx + decisionDelay ) & DECISION_DELAY_MASK; /* Index to decisionDelay old samples */ + *smpl_buf_idx = ( *smpl_buf_idx - 1 ) % DECISION_DELAY; + if( *smpl_buf_idx < 0 ) *smpl_buf_idx += DECISION_DELAY; + last_smple_idx = ( *smpl_buf_idx + decisionDelay ) % DECISION_DELAY; /* Find winner */ RDmin_Q10 = psSampleState[ 0 ][ 0 ].RD_Q10; @@ -607,6 +627,7 @@ static OPUS_INLINE void silk_noise_shape_quantizer_del_dec( psDD = &psDelDec[ k ]; psSS = &psSampleState[ k ][ 0 ]; psDD->LF_AR_Q14 = psSS->LF_AR_Q14; + psDD->Diff_Q14 = psSS->Diff_Q14; psDD->sLPC_Q14[ NSQ_LPC_BUF_LENGTH + i ] = psSS->xq_Q14; psDD->Xq_Q14[ *smpl_buf_idx ] = psSS->xq_Q14; psDD->Q_Q10[ *smpl_buf_idx ] = psSS->Q_Q10; @@ -631,7 +652,7 @@ static OPUS_INLINE void silk_nsq_del_dec_scale_states( const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ NSQ_del_dec_struct psDelDec[], /* I/O Delayed decision states */ - const opus_int32 x_Q3[], /* I Input in Q3 */ + const opus_int16 x16[], /* I Input */ opus_int32 x_sc_Q10[], /* O Input scaled with 1/Gain in Q10 */ const opus_int16 sLTP[], /* I Re-whitened LTP state in Q0 */ opus_int32 sLTP_Q15[], /* O LTP state matching scaled input */ @@ -645,29 +666,19 @@ static OPUS_INLINE void silk_nsq_del_dec_scale_states( ) { opus_int i, k, lag; - opus_int32 gain_adj_Q16, inv_gain_Q31, inv_gain_Q23; + opus_int32 gain_adj_Q16, inv_gain_Q31, inv_gain_Q26; NSQ_del_dec_struct *psDD; lag = pitchL[ subfr ]; inv_gain_Q31 = silk_INVERSE32_varQ( silk_max( Gains_Q16[ subfr ], 1 ), 47 ); silk_assert( inv_gain_Q31 != 0 ); - /* Calculate gain adjustment factor */ - if( Gains_Q16[ subfr ] != NSQ->prev_gain_Q16 ) { - gain_adj_Q16 = silk_DIV32_varQ( NSQ->prev_gain_Q16, Gains_Q16[ subfr ], 16 ); - } else { - gain_adj_Q16 = (opus_int32)1 << 16; - } - /* Scale input */ - inv_gain_Q23 = silk_RSHIFT_ROUND( inv_gain_Q31, 8 ); + inv_gain_Q26 = silk_RSHIFT_ROUND( inv_gain_Q31, 5 ); for( i = 0; i < psEncC->subfr_length; i++ ) { - x_sc_Q10[ i ] = silk_SMULWW( x_Q3[ i ], inv_gain_Q23 ); + x_sc_Q10[ i ] = silk_SMULWW( x16[ i ], inv_gain_Q26 ); } - /* Save inverse gain */ - NSQ->prev_gain_Q16 = Gains_Q16[ subfr ]; - /* After rewhitening the LTP state is un-scaled, so scale with inv_gain_Q16 */ if( NSQ->rewhite_flag ) { if( subfr == 0 ) { @@ -681,7 +692,9 @@ static OPUS_INLINE void silk_nsq_del_dec_scale_states( } /* Adjust for changing gain */ - if( gain_adj_Q16 != (opus_int32)1 << 16 ) { + if( Gains_Q16[ subfr ] != NSQ->prev_gain_Q16 ) { + gain_adj_Q16 = silk_DIV32_varQ( NSQ->prev_gain_Q16, Gains_Q16[ subfr ], 16 ); + /* Scale long-term shaping state */ for( i = NSQ->sLTP_shp_buf_idx - psEncC->ltp_mem_length; i < NSQ->sLTP_shp_buf_idx; i++ ) { NSQ->sLTP_shp_Q14[ i ] = silk_SMULWW( gain_adj_Q16, NSQ->sLTP_shp_Q14[ i ] ); @@ -699,6 +712,7 @@ static OPUS_INLINE void silk_nsq_del_dec_scale_states( /* Scale scalar states */ psDD->LF_AR_Q14 = silk_SMULWW( gain_adj_Q16, psDD->LF_AR_Q14 ); + psDD->Diff_Q14 = silk_SMULWW( gain_adj_Q16, psDD->Diff_Q14 ); /* Scale short-term prediction and shaping states */ for( i = 0; i < NSQ_LPC_BUF_LENGTH; i++ ) { @@ -712,5 +726,8 @@ static OPUS_INLINE void silk_nsq_del_dec_scale_states( psDD->Shape_Q14[ i ] = silk_SMULWW( gain_adj_Q16, psDD->Shape_Q14[ i ] ); } } + + /* Save inverse gain */ + NSQ->prev_gain_Q16 = Gains_Q16[ subfr ]; } } diff --git a/thirdparty/opus/silk/PLC.c b/thirdparty/opus/silk/PLC.c index fb6ea887b7..f89391651c 100644 --- a/thirdparty/opus/silk/PLC.c +++ b/thirdparty/opus/silk/PLC.c @@ -275,7 +275,7 @@ static OPUS_INLINE void silk_PLC_conceal( /* Reduce random noise for unvoiced frames with high LPC gain */ opus_int32 invGain_Q30, down_scale_Q30; - invGain_Q30 = silk_LPC_inverse_pred_gain( psPLC->prevLPC_Q12, psDec->LPC_order ); + invGain_Q30 = silk_LPC_inverse_pred_gain( psPLC->prevLPC_Q12, psDec->LPC_order, arch ); down_scale_Q30 = silk_min_32( silk_RSHIFT( (opus_int32)1 << 30, LOG2_INV_LPC_GAIN_HIGH_THRES ), invGain_Q30 ); down_scale_Q30 = silk_max_32( silk_RSHIFT( (opus_int32)1 << 30, LOG2_INV_LPC_GAIN_LOW_THRES ), down_scale_Q30 ); @@ -291,7 +291,7 @@ static OPUS_INLINE void silk_PLC_conceal( /* Rewhiten LTP state */ idx = psDec->ltp_mem_length - lag - psDec->LPC_order - LTP_ORDER / 2; - silk_assert( idx > 0 ); + celt_assert( idx > 0 ); silk_LPC_analysis_filter( &sLTP[ idx ], &psDec->outBuf[ idx ], A_Q12, psDec->ltp_mem_length - idx, psDec->LPC_order, arch ); /* Scale LTP state */ inv_gain_Q30 = silk_INVERSE32_varQ( psPLC->prevGain_Q16[ 1 ], 46 ); @@ -328,8 +328,10 @@ static OPUS_INLINE void silk_PLC_conceal( for( j = 0; j < LTP_ORDER; j++ ) { B_Q14[ j ] = silk_RSHIFT( silk_SMULBB( harm_Gain_Q15, B_Q14[ j ] ), 15 ); } - /* Gradually reduce excitation gain */ - rand_scale_Q14 = silk_RSHIFT( silk_SMULBB( rand_scale_Q14, rand_Gain_Q15 ), 15 ); + if ( psDec->indices.signalType != TYPE_NO_VOICE_ACTIVITY ) { + /* Gradually reduce excitation gain */ + rand_scale_Q14 = silk_RSHIFT( silk_SMULBB( rand_scale_Q14, rand_Gain_Q15 ), 15 ); + } /* Slowly increase pitch lag */ psPLC->pitchL_Q8 = silk_SMLAWB( psPLC->pitchL_Q8, psPLC->pitchL_Q8, PITCH_DRIFT_FAC_Q16 ); @@ -345,7 +347,7 @@ static OPUS_INLINE void silk_PLC_conceal( /* Copy LPC state */ silk_memcpy( sLPC_Q14_ptr, psDec->sLPC_Q14_buf, MAX_LPC_ORDER * sizeof( opus_int32 ) ); - silk_assert( psDec->LPC_order >= 10 ); /* check that unrolling works */ + celt_assert( psDec->LPC_order >= 10 ); /* check that unrolling works */ for( i = 0; i < psDec->frame_length; i++ ) { /* partly unrolled */ /* Avoids introducing a bias because silk_SMLAWB() always rounds to -inf */ diff --git a/thirdparty/opus/silk/SigProc_FIX.h b/thirdparty/opus/silk/SigProc_FIX.h index b63299441e..f9ae326326 100644 --- a/thirdparty/opus/silk/SigProc_FIX.h +++ b/thirdparty/opus/silk/SigProc_FIX.h @@ -35,7 +35,7 @@ extern "C" /*#define silk_MACRO_COUNT */ /* Used to enable WMOPS counting */ -#define SILK_MAX_ORDER_LPC 16 /* max order of the LPC analysis in schur() and k2a() */ +#define SILK_MAX_ORDER_LPC 24 /* max order of the LPC analysis in schur() and k2a() */ #include <string.h> /* for memset(), memcpy(), memmove() */ #include "typedef.h" @@ -47,6 +47,11 @@ extern "C" #include "x86/SigProc_FIX_sse.h" #endif +#if (defined(OPUS_ARM_ASM) || defined(OPUS_ARM_MAY_HAVE_NEON_INTR)) +#include "arm/biquad_alt_arm.h" +#include "arm/LPC_inv_pred_gain_arm.h" +#endif + /********************************************************************/ /* SIGNAL PROCESSING FUNCTIONS */ /********************************************************************/ @@ -96,14 +101,22 @@ void silk_resampler_down2_3( * slower than biquad() but uses more precise coefficients * can handle (slowly) varying coefficients */ -void silk_biquad_alt( +void silk_biquad_alt_stride1( const opus_int16 *in, /* I input signal */ const opus_int32 *B_Q28, /* I MA coefficients [3] */ const opus_int32 *A_Q28, /* I AR coefficients [2] */ opus_int32 *S, /* I/O State vector [2] */ opus_int16 *out, /* O output signal */ - const opus_int32 len, /* I signal length (must be even) */ - opus_int stride /* I Operate on interleaved signal if > 1 */ + const opus_int32 len /* I signal length (must be even) */ +); + +void silk_biquad_alt_stride2_c( + const opus_int16 *in, /* I input signal */ + const opus_int32 *B_Q28, /* I MA coefficients [3] */ + const opus_int32 *A_Q28, /* I AR coefficients [2] */ + opus_int32 *S, /* I/O State vector [4] */ + opus_int16 *out, /* O output signal */ + const opus_int32 len /* I signal length (must be even) */ ); /* Variable order MA prediction error filter. */ @@ -132,17 +145,11 @@ void silk_bwexpander_32( /* Compute inverse of LPC prediction gain, and */ /* test if LPC coefficients are stable (all poles within unit circle) */ -opus_int32 silk_LPC_inverse_pred_gain( /* O Returns inverse prediction gain in energy domain, Q30 */ +opus_int32 silk_LPC_inverse_pred_gain_c( /* O Returns inverse prediction gain in energy domain, Q30 */ const opus_int16 *A_Q12, /* I Prediction coefficients, Q12 [order] */ const opus_int order /* I Prediction order */ ); -/* For input in Q24 domain */ -opus_int32 silk_LPC_inverse_pred_gain_Q24( /* O Returns inverse prediction gain in energy domain, Q30 */ - const opus_int32 *A_Q24, /* I Prediction coefficients [order] */ - const opus_int order /* I Prediction order */ -); - /* Split signal in two decimated bands using first-order allpass filters */ void silk_ana_filt_bank_1( const opus_int16 *in, /* I Input signal [N] */ @@ -152,6 +159,14 @@ void silk_ana_filt_bank_1( const opus_int32 N /* I Number of input samples */ ); +#if !defined(OVERRIDE_silk_biquad_alt_stride2) +#define silk_biquad_alt_stride2(in, B_Q28, A_Q28, S, out, len, arch) ((void)(arch), silk_biquad_alt_stride2_c(in, B_Q28, A_Q28, S, out, len)) +#endif + +#if !defined(OVERRIDE_silk_LPC_inverse_pred_gain) +#define silk_LPC_inverse_pred_gain(A_Q12, order, arch) ((void)(arch), silk_LPC_inverse_pred_gain_c(A_Q12, order)) +#endif + /********************************************************************/ /* SCALAR FUNCTIONS */ /********************************************************************/ @@ -271,7 +286,17 @@ void silk_A2NLSF( void silk_NLSF2A( opus_int16 *a_Q12, /* O monic whitening filter coefficients in Q12, [ d ] */ const opus_int16 *NLSF, /* I normalized line spectral frequencies in Q15, [ d ] */ - const opus_int d /* I filter order (should be even) */ + const opus_int d, /* I filter order (should be even) */ + int arch /* I Run-time architecture */ +); + +/* Convert int32 coefficients to int16 coefs and make sure there's no wrap-around */ +void silk_LPC_fit( + opus_int16 *a_QOUT, /* O Output signal */ + opus_int32 *a_QIN, /* I/O Input signal */ + const opus_int QOUT, /* I Input Q domain */ + const opus_int QIN, /* I Input Q domain */ + const opus_int d /* I Filter order */ ); void silk_insertion_sort_increasing( @@ -471,8 +496,7 @@ static OPUS_INLINE opus_int32 silk_ROR32( opus_int32 a32, opus_int rot ) /* Add with saturation for positive input values */ #define silk_ADD_POS_SAT8(a, b) ((((a)+(b)) & 0x80) ? silk_int8_MAX : ((a)+(b))) #define silk_ADD_POS_SAT16(a, b) ((((a)+(b)) & 0x8000) ? silk_int16_MAX : ((a)+(b))) -#define silk_ADD_POS_SAT32(a, b) ((((a)+(b)) & 0x80000000) ? silk_int32_MAX : ((a)+(b))) -#define silk_ADD_POS_SAT64(a, b) ((((a)+(b)) & 0x8000000000000000LL) ? silk_int64_MAX : ((a)+(b))) +#define silk_ADD_POS_SAT32(a, b) ((((opus_uint32)(a)+(opus_uint32)(b)) & 0x80000000) ? silk_int32_MAX : ((a)+(b))) #define silk_LSHIFT8(a, shift) ((opus_int8)((opus_uint8)(a)<<(shift))) /* shift >= 0, shift < 8 */ #define silk_LSHIFT16(a, shift) ((opus_int16)((opus_uint16)(a)<<(shift))) /* shift >= 0, shift < 16 */ @@ -572,7 +596,9 @@ static OPUS_INLINE opus_int64 silk_max_64(opus_int64 a, opus_int64 b) /* Make sure to store the result as the seed for the next call (also in between */ /* frames), otherwise result won't be random at all. When only using some of the */ /* bits, take the most significant bits by right-shifting. */ -#define silk_RAND(seed) (silk_MLA_ovflw(907633515, (seed), 196314165)) +#define RAND_MULTIPLIER 196314165 +#define RAND_INCREMENT 907633515 +#define silk_RAND(seed) (silk_MLA_ovflw((RAND_INCREMENT), (seed), (RAND_MULTIPLIER))) /* Add some multiplication functions that can be easily mapped to ARM. */ diff --git a/thirdparty/opus/silk/VAD.c b/thirdparty/opus/silk/VAD.c index 0a782af2f1..d0cda52162 100644 --- a/thirdparty/opus/silk/VAD.c +++ b/thirdparty/opus/silk/VAD.c @@ -101,9 +101,9 @@ opus_int silk_VAD_GetSA_Q8_c( /* O Return v /* Safety checks */ silk_assert( VAD_N_BANDS == 4 ); - silk_assert( MAX_FRAME_LENGTH >= psEncC->frame_length ); - silk_assert( psEncC->frame_length <= 512 ); - silk_assert( psEncC->frame_length == 8 * silk_RSHIFT( psEncC->frame_length, 3 ) ); + celt_assert( MAX_FRAME_LENGTH >= psEncC->frame_length ); + celt_assert( psEncC->frame_length <= 512 ); + celt_assert( psEncC->frame_length == 8 * silk_RSHIFT( psEncC->frame_length, 3 ) ); /***********************/ /* Filter and Decimate */ @@ -252,15 +252,14 @@ opus_int silk_VAD_GetSA_Q8_c( /* O Return v speech_nrg += ( b + 1 ) * silk_RSHIFT( Xnrg[ b ] - psSilk_VAD->NL[ b ], 4 ); } + if( psEncC->frame_length == 20 * psEncC->fs_kHz ) { + speech_nrg = silk_RSHIFT32( speech_nrg, 1 ); + } /* Power scaling */ if( speech_nrg <= 0 ) { SA_Q15 = silk_RSHIFT( SA_Q15, 1 ); - } else if( speech_nrg < 32768 ) { - if( psEncC->frame_length == 10 * psEncC->fs_kHz ) { - speech_nrg = silk_LSHIFT_SAT32( speech_nrg, 16 ); - } else { - speech_nrg = silk_LSHIFT_SAT32( speech_nrg, 15 ); - } + } else if( speech_nrg < 16384 ) { + speech_nrg = silk_LSHIFT32( speech_nrg, 16 ); /* square-root */ speech_nrg = silk_SQRT_APPROX( speech_nrg ); @@ -313,6 +312,8 @@ void silk_VAD_GetNoiseLevels( /* Initially faster smoothing */ if( psSilk_VAD->counter < 1000 ) { /* 1000 = 20 sec */ min_coef = silk_DIV32_16( silk_int16_MAX, silk_RSHIFT( psSilk_VAD->counter, 4 ) + 1 ); + /* Increment frame counter */ + psSilk_VAD->counter++; } else { min_coef = 0; } @@ -356,7 +357,4 @@ void silk_VAD_GetNoiseLevels( /* Store as part of state */ psSilk_VAD->NL[ k ] = nl; } - - /* Increment frame counter */ - psSilk_VAD->counter++; } diff --git a/thirdparty/opus/silk/VQ_WMat_EC.c b/thirdparty/opus/silk/VQ_WMat_EC.c index 7983f1db80..0f3d545c4e 100644 --- a/thirdparty/opus/silk/VQ_WMat_EC.c +++ b/thirdparty/opus/silk/VQ_WMat_EC.c @@ -34,84 +34,95 @@ POSSIBILITY OF SUCH DAMAGE. /* Entropy constrained matrix-weighted VQ, hard-coded to 5-element vectors, for a single input data vector */ void silk_VQ_WMat_EC_c( opus_int8 *ind, /* O index of best codebook vector */ - opus_int32 *rate_dist_Q14, /* O best weighted quant error + mu * rate */ + opus_int32 *res_nrg_Q15, /* O best residual energy */ + opus_int32 *rate_dist_Q8, /* O best total bitrate */ opus_int *gain_Q7, /* O sum of absolute LTP coefficients */ - const opus_int16 *in_Q14, /* I input vector to be quantized */ - const opus_int32 *W_Q18, /* I weighting matrix */ + const opus_int32 *XX_Q17, /* I correlation matrix */ + const opus_int32 *xX_Q17, /* I correlation vector */ const opus_int8 *cb_Q7, /* I codebook */ const opus_uint8 *cb_gain_Q7, /* I codebook effective gain */ const opus_uint8 *cl_Q5, /* I code length for each codebook vector */ - const opus_int mu_Q9, /* I tradeoff betw. weighted error and rate */ + const opus_int subfr_len, /* I number of samples per subframe */ const opus_int32 max_gain_Q7, /* I maximum sum of absolute LTP coefficients */ - opus_int L /* I number of vectors in codebook */ + const opus_int L /* I number of vectors in codebook */ ) { opus_int k, gain_tmp_Q7; const opus_int8 *cb_row_Q7; - opus_int16 diff_Q14[ 5 ]; - opus_int32 sum1_Q14, sum2_Q16; + opus_int32 neg_xX_Q24[ 5 ]; + opus_int32 sum1_Q15, sum2_Q24; + opus_int32 bits_res_Q8, bits_tot_Q8; + + /* Negate and convert to new Q domain */ + neg_xX_Q24[ 0 ] = -silk_LSHIFT32( xX_Q17[ 0 ], 7 ); + neg_xX_Q24[ 1 ] = -silk_LSHIFT32( xX_Q17[ 1 ], 7 ); + neg_xX_Q24[ 2 ] = -silk_LSHIFT32( xX_Q17[ 2 ], 7 ); + neg_xX_Q24[ 3 ] = -silk_LSHIFT32( xX_Q17[ 3 ], 7 ); + neg_xX_Q24[ 4 ] = -silk_LSHIFT32( xX_Q17[ 4 ], 7 ); /* Loop over codebook */ - *rate_dist_Q14 = silk_int32_MAX; + *rate_dist_Q8 = silk_int32_MAX; + *res_nrg_Q15 = silk_int32_MAX; cb_row_Q7 = cb_Q7; + /* In things go really bad, at least *ind is set to something safe. */ + *ind = 0; for( k = 0; k < L; k++ ) { + opus_int32 penalty; gain_tmp_Q7 = cb_gain_Q7[k]; - - diff_Q14[ 0 ] = in_Q14[ 0 ] - silk_LSHIFT( cb_row_Q7[ 0 ], 7 ); - diff_Q14[ 1 ] = in_Q14[ 1 ] - silk_LSHIFT( cb_row_Q7[ 1 ], 7 ); - diff_Q14[ 2 ] = in_Q14[ 2 ] - silk_LSHIFT( cb_row_Q7[ 2 ], 7 ); - diff_Q14[ 3 ] = in_Q14[ 3 ] - silk_LSHIFT( cb_row_Q7[ 3 ], 7 ); - diff_Q14[ 4 ] = in_Q14[ 4 ] - silk_LSHIFT( cb_row_Q7[ 4 ], 7 ); - /* Weighted rate */ - sum1_Q14 = silk_SMULBB( mu_Q9, cl_Q5[ k ] ); + /* Quantization error: 1 - 2 * xX * cb + cb' * XX * cb */ + sum1_Q15 = SILK_FIX_CONST( 1.001, 15 ); /* Penalty for too large gain */ - sum1_Q14 = silk_ADD_LSHIFT32( sum1_Q14, silk_max( silk_SUB32( gain_tmp_Q7, max_gain_Q7 ), 0 ), 10 ); - - silk_assert( sum1_Q14 >= 0 ); - - /* first row of W_Q18 */ - sum2_Q16 = silk_SMULWB( W_Q18[ 1 ], diff_Q14[ 1 ] ); - sum2_Q16 = silk_SMLAWB( sum2_Q16, W_Q18[ 2 ], diff_Q14[ 2 ] ); - sum2_Q16 = silk_SMLAWB( sum2_Q16, W_Q18[ 3 ], diff_Q14[ 3 ] ); - sum2_Q16 = silk_SMLAWB( sum2_Q16, W_Q18[ 4 ], diff_Q14[ 4 ] ); - sum2_Q16 = silk_LSHIFT( sum2_Q16, 1 ); - sum2_Q16 = silk_SMLAWB( sum2_Q16, W_Q18[ 0 ], diff_Q14[ 0 ] ); - sum1_Q14 = silk_SMLAWB( sum1_Q14, sum2_Q16, diff_Q14[ 0 ] ); - - /* second row of W_Q18 */ - sum2_Q16 = silk_SMULWB( W_Q18[ 7 ], diff_Q14[ 2 ] ); - sum2_Q16 = silk_SMLAWB( sum2_Q16, W_Q18[ 8 ], diff_Q14[ 3 ] ); - sum2_Q16 = silk_SMLAWB( sum2_Q16, W_Q18[ 9 ], diff_Q14[ 4 ] ); - sum2_Q16 = silk_LSHIFT( sum2_Q16, 1 ); - sum2_Q16 = silk_SMLAWB( sum2_Q16, W_Q18[ 6 ], diff_Q14[ 1 ] ); - sum1_Q14 = silk_SMLAWB( sum1_Q14, sum2_Q16, diff_Q14[ 1 ] ); - - /* third row of W_Q18 */ - sum2_Q16 = silk_SMULWB( W_Q18[ 13 ], diff_Q14[ 3 ] ); - sum2_Q16 = silk_SMLAWB( sum2_Q16, W_Q18[ 14 ], diff_Q14[ 4 ] ); - sum2_Q16 = silk_LSHIFT( sum2_Q16, 1 ); - sum2_Q16 = silk_SMLAWB( sum2_Q16, W_Q18[ 12 ], diff_Q14[ 2 ] ); - sum1_Q14 = silk_SMLAWB( sum1_Q14, sum2_Q16, diff_Q14[ 2 ] ); - - /* fourth row of W_Q18 */ - sum2_Q16 = silk_SMULWB( W_Q18[ 19 ], diff_Q14[ 4 ] ); - sum2_Q16 = silk_LSHIFT( sum2_Q16, 1 ); - sum2_Q16 = silk_SMLAWB( sum2_Q16, W_Q18[ 18 ], diff_Q14[ 3 ] ); - sum1_Q14 = silk_SMLAWB( sum1_Q14, sum2_Q16, diff_Q14[ 3 ] ); - - /* last row of W_Q18 */ - sum2_Q16 = silk_SMULWB( W_Q18[ 24 ], diff_Q14[ 4 ] ); - sum1_Q14 = silk_SMLAWB( sum1_Q14, sum2_Q16, diff_Q14[ 4 ] ); - - silk_assert( sum1_Q14 >= 0 ); + penalty = silk_LSHIFT32( silk_max( silk_SUB32( gain_tmp_Q7, max_gain_Q7 ), 0 ), 11 ); + + /* first row of XX_Q17 */ + sum2_Q24 = silk_MLA( neg_xX_Q24[ 0 ], XX_Q17[ 1 ], cb_row_Q7[ 1 ] ); + sum2_Q24 = silk_MLA( sum2_Q24, XX_Q17[ 2 ], cb_row_Q7[ 2 ] ); + sum2_Q24 = silk_MLA( sum2_Q24, XX_Q17[ 3 ], cb_row_Q7[ 3 ] ); + sum2_Q24 = silk_MLA( sum2_Q24, XX_Q17[ 4 ], cb_row_Q7[ 4 ] ); + sum2_Q24 = silk_LSHIFT32( sum2_Q24, 1 ); + sum2_Q24 = silk_MLA( sum2_Q24, XX_Q17[ 0 ], cb_row_Q7[ 0 ] ); + sum1_Q15 = silk_SMLAWB( sum1_Q15, sum2_Q24, cb_row_Q7[ 0 ] ); + + /* second row of XX_Q17 */ + sum2_Q24 = silk_MLA( neg_xX_Q24[ 1 ], XX_Q17[ 7 ], cb_row_Q7[ 2 ] ); + sum2_Q24 = silk_MLA( sum2_Q24, XX_Q17[ 8 ], cb_row_Q7[ 3 ] ); + sum2_Q24 = silk_MLA( sum2_Q24, XX_Q17[ 9 ], cb_row_Q7[ 4 ] ); + sum2_Q24 = silk_LSHIFT32( sum2_Q24, 1 ); + sum2_Q24 = silk_MLA( sum2_Q24, XX_Q17[ 6 ], cb_row_Q7[ 1 ] ); + sum1_Q15 = silk_SMLAWB( sum1_Q15, sum2_Q24, cb_row_Q7[ 1 ] ); + + /* third row of XX_Q17 */ + sum2_Q24 = silk_MLA( neg_xX_Q24[ 2 ], XX_Q17[ 13 ], cb_row_Q7[ 3 ] ); + sum2_Q24 = silk_MLA( sum2_Q24, XX_Q17[ 14 ], cb_row_Q7[ 4 ] ); + sum2_Q24 = silk_LSHIFT32( sum2_Q24, 1 ); + sum2_Q24 = silk_MLA( sum2_Q24, XX_Q17[ 12 ], cb_row_Q7[ 2 ] ); + sum1_Q15 = silk_SMLAWB( sum1_Q15, sum2_Q24, cb_row_Q7[ 2 ] ); + + /* fourth row of XX_Q17 */ + sum2_Q24 = silk_MLA( neg_xX_Q24[ 3 ], XX_Q17[ 19 ], cb_row_Q7[ 4 ] ); + sum2_Q24 = silk_LSHIFT32( sum2_Q24, 1 ); + sum2_Q24 = silk_MLA( sum2_Q24, XX_Q17[ 18 ], cb_row_Q7[ 3 ] ); + sum1_Q15 = silk_SMLAWB( sum1_Q15, sum2_Q24, cb_row_Q7[ 3 ] ); + + /* last row of XX_Q17 */ + sum2_Q24 = silk_LSHIFT32( neg_xX_Q24[ 4 ], 1 ); + sum2_Q24 = silk_MLA( sum2_Q24, XX_Q17[ 24 ], cb_row_Q7[ 4 ] ); + sum1_Q15 = silk_SMLAWB( sum1_Q15, sum2_Q24, cb_row_Q7[ 4 ] ); /* find best */ - if( sum1_Q14 < *rate_dist_Q14 ) { - *rate_dist_Q14 = sum1_Q14; - *ind = (opus_int8)k; - *gain_Q7 = gain_tmp_Q7; + if( sum1_Q15 >= 0 ) { + /* Translate residual energy to bits using high-rate assumption (6 dB ==> 1 bit/sample) */ + bits_res_Q8 = silk_SMULBB( subfr_len, silk_lin2log( sum1_Q15 + penalty) - (15 << 7) ); + /* In the following line we reduce the codelength component by half ("-1"); seems to slghtly improve quality */ + bits_tot_Q8 = silk_ADD_LSHIFT32( bits_res_Q8, cl_Q5[ k ], 3-1 ); + if( bits_tot_Q8 <= *rate_dist_Q8 ) { + *rate_dist_Q8 = bits_tot_Q8; + *res_nrg_Q15 = sum1_Q15 + penalty; + *ind = (opus_int8)k; + *gain_Q7 = gain_tmp_Q7; + } } /* Go to next cbk vector */ diff --git a/thirdparty/opus/silk/arm/LPC_inv_pred_gain_arm.h b/thirdparty/opus/silk/arm/LPC_inv_pred_gain_arm.h new file mode 100644 index 0000000000..9895b555c8 --- /dev/null +++ b/thirdparty/opus/silk/arm/LPC_inv_pred_gain_arm.h @@ -0,0 +1,57 @@ +/*********************************************************************** +Copyright (c) 2017 Google Inc. +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +- Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. +- Neither the name of Internet Society, IETF or IETF Trust, nor the +names of specific contributors, may be used to endorse or promote +products derived from this software without specific prior written +permission. +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. +***********************************************************************/ + +#ifndef SILK_LPC_INV_PRED_GAIN_ARM_H +# define SILK_LPC_INV_PRED_GAIN_ARM_H + +# include "celt/arm/armcpu.h" + +# if defined(OPUS_ARM_MAY_HAVE_NEON_INTR) +opus_int32 silk_LPC_inverse_pred_gain_neon( /* O Returns inverse prediction gain in energy domain, Q30 */ + const opus_int16 *A_Q12, /* I Prediction coefficients, Q12 [order] */ + const opus_int order /* I Prediction order */ +); + +# if !defined(OPUS_HAVE_RTCD) && defined(OPUS_ARM_PRESUME_NEON) +# define OVERRIDE_silk_LPC_inverse_pred_gain (1) +# define silk_LPC_inverse_pred_gain(A_Q12, order, arch) ((void)(arch), PRESUME_NEON(silk_LPC_inverse_pred_gain)(A_Q12, order)) +# endif +# endif + +# if !defined(OVERRIDE_silk_LPC_inverse_pred_gain) +/*Is run-time CPU detection enabled on this platform?*/ +# if defined(OPUS_HAVE_RTCD) && (defined(OPUS_ARM_MAY_HAVE_NEON_INTR) && !defined(OPUS_ARM_PRESUME_NEON_INTR)) +extern opus_int32 (*const SILK_LPC_INVERSE_PRED_GAIN_IMPL[OPUS_ARCHMASK+1])(const opus_int16 *A_Q12, const opus_int order); +# define OVERRIDE_silk_LPC_inverse_pred_gain (1) +# define silk_LPC_inverse_pred_gain(A_Q12, order, arch) ((*SILK_LPC_INVERSE_PRED_GAIN_IMPL[(arch)&OPUS_ARCHMASK])(A_Q12, order)) +# elif defined(OPUS_ARM_PRESUME_NEON_INTR) +# define OVERRIDE_silk_LPC_inverse_pred_gain (1) +# define silk_LPC_inverse_pred_gain(A_Q12, order, arch) ((void)(arch), silk_LPC_inverse_pred_gain_neon(A_Q12, order)) +# endif +# endif + +#endif /* end SILK_LPC_INV_PRED_GAIN_ARM_H */ diff --git a/thirdparty/opus/silk/arm/LPC_inv_pred_gain_neon_intr.c b/thirdparty/opus/silk/arm/LPC_inv_pred_gain_neon_intr.c new file mode 100644 index 0000000000..ab426bcd66 --- /dev/null +++ b/thirdparty/opus/silk/arm/LPC_inv_pred_gain_neon_intr.c @@ -0,0 +1,280 @@ +/*********************************************************************** +Copyright (c) 2017 Google Inc. +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +- Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. +- Neither the name of Internet Society, IETF or IETF Trust, nor the +names of specific contributors, may be used to endorse or promote +products derived from this software without specific prior written +permission. +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. +***********************************************************************/ + +#ifdef HAVE_CONFIG_H +#include "config.h" +#endif + +#include <arm_neon.h> +#include "SigProc_FIX.h" +#include "define.h" + +#define QA 24 +#define A_LIMIT SILK_FIX_CONST( 0.99975, QA ) + +#define MUL32_FRAC_Q(a32, b32, Q) ((opus_int32)(silk_RSHIFT_ROUND64(silk_SMULL(a32, b32), Q))) + +/* The difficulty is how to judge a 64-bit signed integer tmp64 is 32-bit overflowed, + * since NEON has no 64-bit min, max or comparison instructions. + * A failed idea is to compare the results of vmovn(tmp64) and vqmovn(tmp64) whether they are equal or not. + * However, this idea fails when the tmp64 is something like 0xFFFFFFF980000000. + * Here we know that mult2Q >= 1, so the highest bit (bit 63, sign bit) of tmp64 must equal to bit 62. + * tmp64 was shifted left by 1 and we got tmp64'. If high_half(tmp64') != 0 and high_half(tmp64') != -1, + * then we know that bit 31 to bit 63 of tmp64 can not all be the sign bit, and therefore tmp64 is 32-bit overflowed. + * That is, we judge if tmp64' > 0x00000000FFFFFFFF, or tmp64' <= 0xFFFFFFFF00000000. + * We use narrowing shift right 31 bits to tmp32' to save data bandwidth and instructions. + * That is, we judge if tmp32' > 0x00000000, or tmp32' <= 0xFFFFFFFF. + */ + +/* Compute inverse of LPC prediction gain, and */ +/* test if LPC coefficients are stable (all poles within unit circle) */ +static OPUS_INLINE opus_int32 LPC_inverse_pred_gain_QA_neon( /* O Returns inverse prediction gain in energy domain, Q30 */ + opus_int32 A_QA[ SILK_MAX_ORDER_LPC ], /* I Prediction coefficients */ + const opus_int order /* I Prediction order */ +) +{ + opus_int k, n, mult2Q; + opus_int32 invGain_Q30, rc_Q31, rc_mult1_Q30, rc_mult2, tmp1, tmp2; + opus_int32 max, min; + int32x4_t max_s32x4, min_s32x4; + int32x2_t max_s32x2, min_s32x2; + + max_s32x4 = vdupq_n_s32( silk_int32_MIN ); + min_s32x4 = vdupq_n_s32( silk_int32_MAX ); + invGain_Q30 = SILK_FIX_CONST( 1, 30 ); + for( k = order - 1; k > 0; k-- ) { + int32x2_t rc_Q31_s32x2, rc_mult2_s32x2; + int64x2_t mult2Q_s64x2; + + /* Check for stability */ + if( ( A_QA[ k ] > A_LIMIT ) || ( A_QA[ k ] < -A_LIMIT ) ) { + return 0; + } + + /* Set RC equal to negated AR coef */ + rc_Q31 = -silk_LSHIFT( A_QA[ k ], 31 - QA ); + + /* rc_mult1_Q30 range: [ 1 : 2^30 ] */ + rc_mult1_Q30 = silk_SUB32( SILK_FIX_CONST( 1, 30 ), silk_SMMUL( rc_Q31, rc_Q31 ) ); + silk_assert( rc_mult1_Q30 > ( 1 << 15 ) ); /* reduce A_LIMIT if fails */ + silk_assert( rc_mult1_Q30 <= ( 1 << 30 ) ); + + /* Update inverse gain */ + /* invGain_Q30 range: [ 0 : 2^30 ] */ + invGain_Q30 = silk_LSHIFT( silk_SMMUL( invGain_Q30, rc_mult1_Q30 ), 2 ); + silk_assert( invGain_Q30 >= 0 ); + silk_assert( invGain_Q30 <= ( 1 << 30 ) ); + if( invGain_Q30 < SILK_FIX_CONST( 1.0f / MAX_PREDICTION_POWER_GAIN, 30 ) ) { + return 0; + } + + /* rc_mult2 range: [ 2^30 : silk_int32_MAX ] */ + mult2Q = 32 - silk_CLZ32( silk_abs( rc_mult1_Q30 ) ); + rc_mult2 = silk_INVERSE32_varQ( rc_mult1_Q30, mult2Q + 30 ); + + /* Update AR coefficient */ + rc_Q31_s32x2 = vdup_n_s32( rc_Q31 ); + mult2Q_s64x2 = vdupq_n_s64( -mult2Q ); + rc_mult2_s32x2 = vdup_n_s32( rc_mult2 ); + + for( n = 0; n < ( ( k + 1 ) >> 1 ) - 3; n += 4 ) { + /* We always calculate extra elements of A_QA buffer when ( k % 4 ) != 0, to take the advantage of SIMD parallelization. */ + int32x4_t tmp1_s32x4, tmp2_s32x4, t0_s32x4, t1_s32x4, s0_s32x4, s1_s32x4, t_QA0_s32x4, t_QA1_s32x4; + int64x2_t t0_s64x2, t1_s64x2, t2_s64x2, t3_s64x2; + tmp1_s32x4 = vld1q_s32( A_QA + n ); + tmp2_s32x4 = vld1q_s32( A_QA + k - n - 4 ); + tmp2_s32x4 = vrev64q_s32( tmp2_s32x4 ); + tmp2_s32x4 = vcombine_s32( vget_high_s32( tmp2_s32x4 ), vget_low_s32( tmp2_s32x4 ) ); + t0_s32x4 = vqrdmulhq_lane_s32( tmp2_s32x4, rc_Q31_s32x2, 0 ); + t1_s32x4 = vqrdmulhq_lane_s32( tmp1_s32x4, rc_Q31_s32x2, 0 ); + t_QA0_s32x4 = vqsubq_s32( tmp1_s32x4, t0_s32x4 ); + t_QA1_s32x4 = vqsubq_s32( tmp2_s32x4, t1_s32x4 ); + t0_s64x2 = vmull_s32( vget_low_s32 ( t_QA0_s32x4 ), rc_mult2_s32x2 ); + t1_s64x2 = vmull_s32( vget_high_s32( t_QA0_s32x4 ), rc_mult2_s32x2 ); + t2_s64x2 = vmull_s32( vget_low_s32 ( t_QA1_s32x4 ), rc_mult2_s32x2 ); + t3_s64x2 = vmull_s32( vget_high_s32( t_QA1_s32x4 ), rc_mult2_s32x2 ); + t0_s64x2 = vrshlq_s64( t0_s64x2, mult2Q_s64x2 ); + t1_s64x2 = vrshlq_s64( t1_s64x2, mult2Q_s64x2 ); + t2_s64x2 = vrshlq_s64( t2_s64x2, mult2Q_s64x2 ); + t3_s64x2 = vrshlq_s64( t3_s64x2, mult2Q_s64x2 ); + t0_s32x4 = vcombine_s32( vmovn_s64( t0_s64x2 ), vmovn_s64( t1_s64x2 ) ); + t1_s32x4 = vcombine_s32( vmovn_s64( t2_s64x2 ), vmovn_s64( t3_s64x2 ) ); + s0_s32x4 = vcombine_s32( vshrn_n_s64( t0_s64x2, 31 ), vshrn_n_s64( t1_s64x2, 31 ) ); + s1_s32x4 = vcombine_s32( vshrn_n_s64( t2_s64x2, 31 ), vshrn_n_s64( t3_s64x2, 31 ) ); + max_s32x4 = vmaxq_s32( max_s32x4, s0_s32x4 ); + min_s32x4 = vminq_s32( min_s32x4, s0_s32x4 ); + max_s32x4 = vmaxq_s32( max_s32x4, s1_s32x4 ); + min_s32x4 = vminq_s32( min_s32x4, s1_s32x4 ); + t1_s32x4 = vrev64q_s32( t1_s32x4 ); + t1_s32x4 = vcombine_s32( vget_high_s32( t1_s32x4 ), vget_low_s32( t1_s32x4 ) ); + vst1q_s32( A_QA + n, t0_s32x4 ); + vst1q_s32( A_QA + k - n - 4, t1_s32x4 ); + } + for( ; n < (k + 1) >> 1; n++ ) { + opus_int64 tmp64; + tmp1 = A_QA[ n ]; + tmp2 = A_QA[ k - n - 1 ]; + tmp64 = silk_RSHIFT_ROUND64( silk_SMULL( silk_SUB_SAT32(tmp1, + MUL32_FRAC_Q( tmp2, rc_Q31, 31 ) ), rc_mult2 ), mult2Q); + if( tmp64 > silk_int32_MAX || tmp64 < silk_int32_MIN ) { + return 0; + } + A_QA[ n ] = ( opus_int32 )tmp64; + tmp64 = silk_RSHIFT_ROUND64( silk_SMULL( silk_SUB_SAT32(tmp2, + MUL32_FRAC_Q( tmp1, rc_Q31, 31 ) ), rc_mult2), mult2Q); + if( tmp64 > silk_int32_MAX || tmp64 < silk_int32_MIN ) { + return 0; + } + A_QA[ k - n - 1 ] = ( opus_int32 )tmp64; + } + } + + /* Check for stability */ + if( ( A_QA[ k ] > A_LIMIT ) || ( A_QA[ k ] < -A_LIMIT ) ) { + return 0; + } + + max_s32x2 = vmax_s32( vget_low_s32( max_s32x4 ), vget_high_s32( max_s32x4 ) ); + min_s32x2 = vmin_s32( vget_low_s32( min_s32x4 ), vget_high_s32( min_s32x4 ) ); + max_s32x2 = vmax_s32( max_s32x2, vreinterpret_s32_s64( vshr_n_s64( vreinterpret_s64_s32( max_s32x2 ), 32 ) ) ); + min_s32x2 = vmin_s32( min_s32x2, vreinterpret_s32_s64( vshr_n_s64( vreinterpret_s64_s32( min_s32x2 ), 32 ) ) ); + max = vget_lane_s32( max_s32x2, 0 ); + min = vget_lane_s32( min_s32x2, 0 ); + if( ( max > 0 ) || ( min < -1 ) ) { + return 0; + } + + /* Set RC equal to negated AR coef */ + rc_Q31 = -silk_LSHIFT( A_QA[ 0 ], 31 - QA ); + + /* Range: [ 1 : 2^30 ] */ + rc_mult1_Q30 = silk_SUB32( SILK_FIX_CONST( 1, 30 ), silk_SMMUL( rc_Q31, rc_Q31 ) ); + + /* Update inverse gain */ + /* Range: [ 0 : 2^30 ] */ + invGain_Q30 = silk_LSHIFT( silk_SMMUL( invGain_Q30, rc_mult1_Q30 ), 2 ); + silk_assert( invGain_Q30 >= 0 ); + silk_assert( invGain_Q30 <= ( 1 << 30 ) ); + if( invGain_Q30 < SILK_FIX_CONST( 1.0f / MAX_PREDICTION_POWER_GAIN, 30 ) ) { + return 0; + } + + return invGain_Q30; +} + +/* For input in Q12 domain */ +opus_int32 silk_LPC_inverse_pred_gain_neon( /* O Returns inverse prediction gain in energy domain, Q30 */ + const opus_int16 *A_Q12, /* I Prediction coefficients, Q12 [order] */ + const opus_int order /* I Prediction order */ +) +{ +#ifdef OPUS_CHECK_ASM + const opus_int32 invGain_Q30_c = silk_LPC_inverse_pred_gain_c( A_Q12, order ); +#endif + + opus_int32 invGain_Q30; + if( ( SILK_MAX_ORDER_LPC != 24 ) || ( order & 1 )) { + invGain_Q30 = silk_LPC_inverse_pred_gain_c( A_Q12, order ); + } + else { + opus_int32 Atmp_QA[ SILK_MAX_ORDER_LPC ]; + opus_int32 DC_resp; + int16x8_t t0_s16x8, t1_s16x8, t2_s16x8; + int32x4_t t0_s32x4; + const opus_int leftover = order & 7; + + /* Increase Q domain of the AR coefficients */ + t0_s16x8 = vld1q_s16( A_Q12 + 0 ); + t1_s16x8 = vld1q_s16( A_Q12 + 8 ); + t2_s16x8 = vld1q_s16( A_Q12 + 16 ); + t0_s32x4 = vpaddlq_s16( t0_s16x8 ); + + switch( order - leftover ) + { + case 24: + t0_s32x4 = vpadalq_s16( t0_s32x4, t2_s16x8 ); + /* FALLTHROUGH */ + + case 16: + t0_s32x4 = vpadalq_s16( t0_s32x4, t1_s16x8 ); + vst1q_s32( Atmp_QA + 16, vshll_n_s16( vget_low_s16 ( t2_s16x8 ), QA - 12 ) ); + vst1q_s32( Atmp_QA + 20, vshll_n_s16( vget_high_s16( t2_s16x8 ), QA - 12 ) ); + /* FALLTHROUGH */ + + case 8: + { + const int32x2_t t_s32x2 = vpadd_s32( vget_low_s32( t0_s32x4 ), vget_high_s32( t0_s32x4 ) ); + const int64x1_t t_s64x1 = vpaddl_s32( t_s32x2 ); + DC_resp = vget_lane_s32( vreinterpret_s32_s64( t_s64x1 ), 0 ); + vst1q_s32( Atmp_QA + 8, vshll_n_s16( vget_low_s16 ( t1_s16x8 ), QA - 12 ) ); + vst1q_s32( Atmp_QA + 12, vshll_n_s16( vget_high_s16( t1_s16x8 ), QA - 12 ) ); + } + break; + + default: + DC_resp = 0; + break; + } + A_Q12 += order - leftover; + + switch( leftover ) + { + case 6: + DC_resp += (opus_int32)A_Q12[ 5 ]; + DC_resp += (opus_int32)A_Q12[ 4 ]; + /* FALLTHROUGH */ + + case 4: + DC_resp += (opus_int32)A_Q12[ 3 ]; + DC_resp += (opus_int32)A_Q12[ 2 ]; + /* FALLTHROUGH */ + + case 2: + DC_resp += (opus_int32)A_Q12[ 1 ]; + DC_resp += (opus_int32)A_Q12[ 0 ]; + /* FALLTHROUGH */ + + default: + break; + } + + /* If the DC is unstable, we don't even need to do the full calculations */ + if( DC_resp >= 4096 ) { + invGain_Q30 = 0; + } else { + vst1q_s32( Atmp_QA + 0, vshll_n_s16( vget_low_s16 ( t0_s16x8 ), QA - 12 ) ); + vst1q_s32( Atmp_QA + 4, vshll_n_s16( vget_high_s16( t0_s16x8 ), QA - 12 ) ); + invGain_Q30 = LPC_inverse_pred_gain_QA_neon( Atmp_QA, order ); + } + } + +#ifdef OPUS_CHECK_ASM + silk_assert( invGain_Q30_c == invGain_Q30 ); +#endif + + return invGain_Q30; +} diff --git a/thirdparty/opus/silk/arm/NSQ_del_dec_arm.h b/thirdparty/opus/silk/arm/NSQ_del_dec_arm.h new file mode 100644 index 0000000000..9e76e16927 --- /dev/null +++ b/thirdparty/opus/silk/arm/NSQ_del_dec_arm.h @@ -0,0 +1,100 @@ +/*********************************************************************** +Copyright (c) 2017 Google Inc. +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +- Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. +- Neither the name of Internet Society, IETF or IETF Trust, nor the +names of specific contributors, may be used to endorse or promote +products derived from this software without specific prior written +permission. +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. +***********************************************************************/ + +#ifndef SILK_NSQ_DEL_DEC_ARM_H +#define SILK_NSQ_DEL_DEC_ARM_H + +#include "celt/arm/armcpu.h" + +#if defined(OPUS_ARM_MAY_HAVE_NEON_INTR) +void silk_NSQ_del_dec_neon( + const silk_encoder_state *psEncC, silk_nsq_state *NSQ, + SideInfoIndices *psIndices, const opus_int16 x16[], opus_int8 pulses[], + const opus_int16 PredCoef_Q12[2 * MAX_LPC_ORDER], + const opus_int16 LTPCoef_Q14[LTP_ORDER * MAX_NB_SUBFR], + const opus_int16 AR_Q13[MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER], + const opus_int HarmShapeGain_Q14[MAX_NB_SUBFR], + const opus_int Tilt_Q14[MAX_NB_SUBFR], + const opus_int32 LF_shp_Q14[MAX_NB_SUBFR], + const opus_int32 Gains_Q16[MAX_NB_SUBFR], + const opus_int pitchL[MAX_NB_SUBFR], const opus_int Lambda_Q10, + const opus_int LTP_scale_Q14); + +#if !defined(OPUS_HAVE_RTCD) +#define OVERRIDE_silk_NSQ_del_dec (1) +#define silk_NSQ_del_dec(psEncC, NSQ, psIndices, x16, pulses, PredCoef_Q12, \ + LTPCoef_Q14, AR_Q13, HarmShapeGain_Q14, Tilt_Q14, \ + LF_shp_Q14, Gains_Q16, pitchL, Lambda_Q10, \ + LTP_scale_Q14, arch) \ + ((void)(arch), \ + PRESUME_NEON(silk_NSQ_del_dec)( \ + psEncC, NSQ, psIndices, x16, pulses, PredCoef_Q12, LTPCoef_Q14, \ + AR_Q13, HarmShapeGain_Q14, Tilt_Q14, LF_shp_Q14, Gains_Q16, pitchL, \ + Lambda_Q10, LTP_scale_Q14)) +#endif +#endif + +#if !defined(OVERRIDE_silk_NSQ_del_dec) +/*Is run-time CPU detection enabled on this platform?*/ +#if defined(OPUS_HAVE_RTCD) && (defined(OPUS_ARM_MAY_HAVE_NEON_INTR) && \ + !defined(OPUS_ARM_PRESUME_NEON_INTR)) +extern void (*const SILK_NSQ_DEL_DEC_IMPL[OPUS_ARCHMASK + 1])( + const silk_encoder_state *psEncC, silk_nsq_state *NSQ, + SideInfoIndices *psIndices, const opus_int16 x16[], opus_int8 pulses[], + const opus_int16 PredCoef_Q12[2 * MAX_LPC_ORDER], + const opus_int16 LTPCoef_Q14[LTP_ORDER * MAX_NB_SUBFR], + const opus_int16 AR_Q13[MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER], + const opus_int HarmShapeGain_Q14[MAX_NB_SUBFR], + const opus_int Tilt_Q14[MAX_NB_SUBFR], + const opus_int32 LF_shp_Q14[MAX_NB_SUBFR], + const opus_int32 Gains_Q16[MAX_NB_SUBFR], + const opus_int pitchL[MAX_NB_SUBFR], const opus_int Lambda_Q10, + const opus_int LTP_scale_Q14); +#define OVERRIDE_silk_NSQ_del_dec (1) +#define silk_NSQ_del_dec(psEncC, NSQ, psIndices, x16, pulses, PredCoef_Q12, \ + LTPCoef_Q14, AR_Q13, HarmShapeGain_Q14, Tilt_Q14, \ + LF_shp_Q14, Gains_Q16, pitchL, Lambda_Q10, \ + LTP_scale_Q14, arch) \ + ((*SILK_NSQ_DEL_DEC_IMPL[(arch)&OPUS_ARCHMASK])( \ + psEncC, NSQ, psIndices, x16, pulses, PredCoef_Q12, LTPCoef_Q14, \ + AR_Q13, HarmShapeGain_Q14, Tilt_Q14, LF_shp_Q14, Gains_Q16, pitchL, \ + Lambda_Q10, LTP_scale_Q14)) +#elif defined(OPUS_ARM_PRESUME_NEON_INTR) +#define OVERRIDE_silk_NSQ_del_dec (1) +#define silk_NSQ_del_dec(psEncC, NSQ, psIndices, x16, pulses, PredCoef_Q12, \ + LTPCoef_Q14, AR_Q13, HarmShapeGain_Q14, Tilt_Q14, \ + LF_shp_Q14, Gains_Q16, pitchL, Lambda_Q10, \ + LTP_scale_Q14, arch) \ + ((void)(arch), \ + silk_NSQ_del_dec_neon(psEncC, NSQ, psIndices, x16, pulses, PredCoef_Q12, \ + LTPCoef_Q14, AR_Q13, HarmShapeGain_Q14, Tilt_Q14, \ + LF_shp_Q14, Gains_Q16, pitchL, Lambda_Q10, \ + LTP_scale_Q14)) +#endif +#endif + +#endif /* end SILK_NSQ_DEL_DEC_ARM_H */ diff --git a/thirdparty/opus/silk/arm/NSQ_del_dec_neon_intr.c b/thirdparty/opus/silk/arm/NSQ_del_dec_neon_intr.c new file mode 100644 index 0000000000..212410f362 --- /dev/null +++ b/thirdparty/opus/silk/arm/NSQ_del_dec_neon_intr.c @@ -0,0 +1,1124 @@ +/*********************************************************************** +Copyright (c) 2017 Google Inc. +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +- Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. +- Neither the name of Internet Society, IETF or IETF Trust, nor the +names of specific contributors, may be used to endorse or promote +products derived from this software without specific prior written +permission. +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. +***********************************************************************/ + +#ifdef HAVE_CONFIG_H +#include "config.h" +#endif + +#include <arm_neon.h> +#ifdef OPUS_CHECK_ASM +# include <string.h> +#endif +#include "main.h" +#include "stack_alloc.h" + +/* NEON intrinsics optimization now can only parallelize up to 4 delay decision states. */ +/* If there are more states, C function is called, and this optimization must be expanded. */ +#define NEON_MAX_DEL_DEC_STATES 4 + +typedef struct { + opus_int32 sLPC_Q14[ MAX_SUB_FRAME_LENGTH + NSQ_LPC_BUF_LENGTH ][ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 RandState[ DECISION_DELAY ][ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 Q_Q10[ DECISION_DELAY ][ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 Xq_Q14[ DECISION_DELAY ][ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 Pred_Q15[ DECISION_DELAY ][ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 Shape_Q14[ DECISION_DELAY ][ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 sAR2_Q14[ MAX_SHAPE_LPC_ORDER ][ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 LF_AR_Q14[ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 Diff_Q14[ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 Seed[ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 SeedInit[ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 RD_Q10[ NEON_MAX_DEL_DEC_STATES ]; +} NSQ_del_decs_struct; + +typedef struct { + opus_int32 Q_Q10[ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 RD_Q10[ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 xq_Q14[ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 LF_AR_Q14[ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 Diff_Q14[ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 sLTP_shp_Q14[ NEON_MAX_DEL_DEC_STATES ]; + opus_int32 LPC_exc_Q14[ NEON_MAX_DEL_DEC_STATES ]; +} NSQ_samples_struct; + +static OPUS_INLINE void silk_nsq_del_dec_scale_states_neon( + const silk_encoder_state *psEncC, /* I Encoder State */ + silk_nsq_state *NSQ, /* I/O NSQ state */ + NSQ_del_decs_struct psDelDec[], /* I/O Delayed decision states */ + const opus_int16 x16[], /* I Input */ + opus_int32 x_sc_Q10[], /* O Input scaled with 1/Gain in Q10 */ + const opus_int16 sLTP[], /* I Re-whitened LTP state in Q0 */ + opus_int32 sLTP_Q15[], /* O LTP state matching scaled input */ + opus_int subfr, /* I Subframe number */ + const opus_int LTP_scale_Q14, /* I LTP state scaling */ + const opus_int32 Gains_Q16[ MAX_NB_SUBFR ], /* I */ + const opus_int pitchL[ MAX_NB_SUBFR ], /* I Pitch lag */ + const opus_int signal_type, /* I Signal type */ + const opus_int decisionDelay /* I Decision delay */ +); + +/******************************************/ +/* Noise shape quantizer for one subframe */ +/******************************************/ +static OPUS_INLINE void silk_noise_shape_quantizer_del_dec_neon( + silk_nsq_state *NSQ, /* I/O NSQ state */ + NSQ_del_decs_struct psDelDec[], /* I/O Delayed decision states */ + opus_int signalType, /* I Signal type */ + const opus_int32 x_Q10[], /* I */ + opus_int8 pulses[], /* O */ + opus_int16 xq[], /* O */ + opus_int32 sLTP_Q15[], /* I/O LTP filter state */ + opus_int32 delayedGain_Q10[], /* I/O Gain delay buffer */ + const opus_int16 a_Q12[], /* I Short term prediction coefs */ + const opus_int16 b_Q14[], /* I Long term prediction coefs */ + const opus_int16 AR_shp_Q13[], /* I Noise shaping coefs */ + opus_int lag, /* I Pitch lag */ + opus_int32 HarmShapeFIRPacked_Q14, /* I */ + opus_int Tilt_Q14, /* I Spectral tilt */ + opus_int32 LF_shp_Q14, /* I */ + opus_int32 Gain_Q16, /* I */ + opus_int Lambda_Q10, /* I */ + opus_int offset_Q10, /* I */ + opus_int length, /* I Input length */ + opus_int subfr, /* I Subframe number */ + opus_int shapingLPCOrder, /* I Shaping LPC filter order */ + opus_int predictLPCOrder, /* I Prediction filter order */ + opus_int warping_Q16, /* I */ + opus_int nStatesDelayedDecision, /* I Number of states in decision tree */ + opus_int *smpl_buf_idx, /* I/O Index to newest samples in buffers */ + opus_int decisionDelay /* I */ +); + +static OPUS_INLINE void copy_winner_state_kernel( + const NSQ_del_decs_struct *psDelDec, + const opus_int offset, + const opus_int last_smple_idx, + const opus_int Winner_ind, + const int32x2_t gain_lo_s32x2, + const int32x2_t gain_hi_s32x2, + const int32x4_t shift_s32x4, + int32x4_t t0_s32x4, + int32x4_t t1_s32x4, + opus_int8 *const pulses, + opus_int16 *pxq, + silk_nsq_state *NSQ +) +{ + int16x8_t t_s16x8; + int32x4_t o0_s32x4, o1_s32x4; + + t0_s32x4 = vld1q_lane_s32( &psDelDec->Q_Q10[ last_smple_idx - 0 ][ Winner_ind ], t0_s32x4, 0 ); + t0_s32x4 = vld1q_lane_s32( &psDelDec->Q_Q10[ last_smple_idx - 1 ][ Winner_ind ], t0_s32x4, 1 ); + t0_s32x4 = vld1q_lane_s32( &psDelDec->Q_Q10[ last_smple_idx - 2 ][ Winner_ind ], t0_s32x4, 2 ); + t0_s32x4 = vld1q_lane_s32( &psDelDec->Q_Q10[ last_smple_idx - 3 ][ Winner_ind ], t0_s32x4, 3 ); + t1_s32x4 = vld1q_lane_s32( &psDelDec->Q_Q10[ last_smple_idx - 4 ][ Winner_ind ], t1_s32x4, 0 ); + t1_s32x4 = vld1q_lane_s32( &psDelDec->Q_Q10[ last_smple_idx - 5 ][ Winner_ind ], t1_s32x4, 1 ); + t1_s32x4 = vld1q_lane_s32( &psDelDec->Q_Q10[ last_smple_idx - 6 ][ Winner_ind ], t1_s32x4, 2 ); + t1_s32x4 = vld1q_lane_s32( &psDelDec->Q_Q10[ last_smple_idx - 7 ][ Winner_ind ], t1_s32x4, 3 ); + t_s16x8 = vcombine_s16( vrshrn_n_s32( t0_s32x4, 10 ), vrshrn_n_s32( t1_s32x4, 10 ) ); + vst1_s8( &pulses[ offset ], vmovn_s16( t_s16x8 ) ); + + t0_s32x4 = vld1q_lane_s32( &psDelDec->Xq_Q14[ last_smple_idx - 0 ][ Winner_ind ], t0_s32x4, 0 ); + t0_s32x4 = vld1q_lane_s32( &psDelDec->Xq_Q14[ last_smple_idx - 1 ][ Winner_ind ], t0_s32x4, 1 ); + t0_s32x4 = vld1q_lane_s32( &psDelDec->Xq_Q14[ last_smple_idx - 2 ][ Winner_ind ], t0_s32x4, 2 ); + t0_s32x4 = vld1q_lane_s32( &psDelDec->Xq_Q14[ last_smple_idx - 3 ][ Winner_ind ], t0_s32x4, 3 ); + t1_s32x4 = vld1q_lane_s32( &psDelDec->Xq_Q14[ last_smple_idx - 4 ][ Winner_ind ], t1_s32x4, 0 ); + t1_s32x4 = vld1q_lane_s32( &psDelDec->Xq_Q14[ last_smple_idx - 5 ][ Winner_ind ], t1_s32x4, 1 ); + t1_s32x4 = vld1q_lane_s32( &psDelDec->Xq_Q14[ last_smple_idx - 6 ][ Winner_ind ], t1_s32x4, 2 ); + t1_s32x4 = vld1q_lane_s32( &psDelDec->Xq_Q14[ last_smple_idx - 7 ][ Winner_ind ], t1_s32x4, 3 ); + o0_s32x4 = vqdmulhq_lane_s32( t0_s32x4, gain_lo_s32x2, 0 ); + o1_s32x4 = vqdmulhq_lane_s32( t1_s32x4, gain_lo_s32x2, 0 ); + o0_s32x4 = vmlaq_lane_s32( o0_s32x4, t0_s32x4, gain_hi_s32x2, 0 ); + o1_s32x4 = vmlaq_lane_s32( o1_s32x4, t1_s32x4, gain_hi_s32x2, 0 ); + o0_s32x4 = vrshlq_s32( o0_s32x4, shift_s32x4 ); + o1_s32x4 = vrshlq_s32( o1_s32x4, shift_s32x4 ); + vst1_s16( &pxq[ offset + 0 ], vqmovn_s32( o0_s32x4 ) ); + vst1_s16( &pxq[ offset + 4 ], vqmovn_s32( o1_s32x4 ) ); + + t0_s32x4 = vld1q_lane_s32( &psDelDec->Shape_Q14[ last_smple_idx - 0 ][ Winner_ind ], t0_s32x4, 0 ); + t0_s32x4 = vld1q_lane_s32( &psDelDec->Shape_Q14[ last_smple_idx - 1 ][ Winner_ind ], t0_s32x4, 1 ); + t0_s32x4 = vld1q_lane_s32( &psDelDec->Shape_Q14[ last_smple_idx - 2 ][ Winner_ind ], t0_s32x4, 2 ); + t0_s32x4 = vld1q_lane_s32( &psDelDec->Shape_Q14[ last_smple_idx - 3 ][ Winner_ind ], t0_s32x4, 3 ); + t1_s32x4 = vld1q_lane_s32( &psDelDec->Shape_Q14[ last_smple_idx - 4 ][ Winner_ind ], t1_s32x4, 0 ); + t1_s32x4 = vld1q_lane_s32( &psDelDec->Shape_Q14[ last_smple_idx - 5 ][ Winner_ind ], t1_s32x4, 1 ); + t1_s32x4 = vld1q_lane_s32( &psDelDec->Shape_Q14[ last_smple_idx - 6 ][ Winner_ind ], t1_s32x4, 2 ); + t1_s32x4 = vld1q_lane_s32( &psDelDec->Shape_Q14[ last_smple_idx - 7 ][ Winner_ind ], t1_s32x4, 3 ); + vst1q_s32( &NSQ->sLTP_shp_Q14[ NSQ->sLTP_shp_buf_idx + offset + 0 ], t0_s32x4 ); + vst1q_s32( &NSQ->sLTP_shp_Q14[ NSQ->sLTP_shp_buf_idx + offset + 4 ], t1_s32x4 ); +} + +static OPUS_INLINE void copy_winner_state( + const NSQ_del_decs_struct *psDelDec, + const opus_int decisionDelay, + const opus_int smpl_buf_idx, + const opus_int Winner_ind, + const opus_int32 gain, + const opus_int32 shift, + opus_int8 *const pulses, + opus_int16 *pxq, + silk_nsq_state *NSQ +) +{ + opus_int i, last_smple_idx; + const int32x2_t gain_lo_s32x2 = vdup_n_s32( silk_LSHIFT32( gain & 0x0000FFFF, 15 ) ); + const int32x2_t gain_hi_s32x2 = vdup_n_s32( gain >> 16 ); + const int32x4_t shift_s32x4 = vdupq_n_s32( -shift ); + int32x4_t t0_s32x4, t1_s32x4; + + t0_s32x4 = t1_s32x4 = vdupq_n_s32( 0 ); /* initialization */ + last_smple_idx = smpl_buf_idx + decisionDelay - 1 + DECISION_DELAY; + if( last_smple_idx >= DECISION_DELAY ) last_smple_idx -= DECISION_DELAY; + if( last_smple_idx >= DECISION_DELAY ) last_smple_idx -= DECISION_DELAY; + + for( i = 0; ( i < ( decisionDelay - 7 ) ) && ( last_smple_idx >= 7 ); i += 8, last_smple_idx -= 8 ) { + copy_winner_state_kernel( psDelDec, i - decisionDelay, last_smple_idx, Winner_ind, gain_lo_s32x2, gain_hi_s32x2, shift_s32x4, t0_s32x4, t1_s32x4, pulses, pxq, NSQ ); + } + for( ; ( i < decisionDelay ) && ( last_smple_idx >= 0 ); i++, last_smple_idx-- ) { + pulses[ i - decisionDelay ] = (opus_int8)silk_RSHIFT_ROUND( psDelDec->Q_Q10[ last_smple_idx ][ Winner_ind ], 10 ); + pxq[ i - decisionDelay ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( silk_SMULWW( psDelDec->Xq_Q14[ last_smple_idx ][ Winner_ind ], gain ), shift ) ); + NSQ->sLTP_shp_Q14[ NSQ->sLTP_shp_buf_idx - decisionDelay + i ] = psDelDec->Shape_Q14[ last_smple_idx ][ Winner_ind ]; + } + + last_smple_idx += DECISION_DELAY; + for( ; i < ( decisionDelay - 7 ); i++, last_smple_idx-- ) { + copy_winner_state_kernel( psDelDec, i - decisionDelay, last_smple_idx, Winner_ind, gain_lo_s32x2, gain_hi_s32x2, shift_s32x4, t0_s32x4, t1_s32x4, pulses, pxq, NSQ ); + } + for( ; i < decisionDelay; i++, last_smple_idx-- ) { + pulses[ i - decisionDelay ] = (opus_int8)silk_RSHIFT_ROUND( psDelDec->Q_Q10[ last_smple_idx ][ Winner_ind ], 10 ); + pxq[ i - decisionDelay ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( silk_SMULWW( psDelDec->Xq_Q14[ last_smple_idx ][ Winner_ind ], gain ), shift ) ); + NSQ->sLTP_shp_Q14[ NSQ->sLTP_shp_buf_idx - decisionDelay + i ] = psDelDec->Shape_Q14[ last_smple_idx ][ Winner_ind ]; + } +} + +void silk_NSQ_del_dec_neon( + const silk_encoder_state *psEncC, /* I Encoder State */ + silk_nsq_state *NSQ, /* I/O NSQ state */ + SideInfoIndices *psIndices, /* I/O Quantization Indices */ + const opus_int16 x16[], /* I Input */ + opus_int8 pulses[], /* O Quantized pulse signal */ + const opus_int16 PredCoef_Q12[ 2 * MAX_LPC_ORDER ], /* I Short term prediction coefs */ + const opus_int16 LTPCoef_Q14[ LTP_ORDER * MAX_NB_SUBFR ], /* I Long term prediction coefs */ + const opus_int16 AR_Q13[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ], /* I Noise shaping coefs */ + const opus_int HarmShapeGain_Q14[ MAX_NB_SUBFR ], /* I Long term shaping coefs */ + const opus_int Tilt_Q14[ MAX_NB_SUBFR ], /* I Spectral tilt */ + const opus_int32 LF_shp_Q14[ MAX_NB_SUBFR ], /* I Low frequency shaping coefs */ + const opus_int32 Gains_Q16[ MAX_NB_SUBFR ], /* I Quantization step sizes */ + const opus_int pitchL[ MAX_NB_SUBFR ], /* I Pitch lags */ + const opus_int Lambda_Q10, /* I Rate/distortion tradeoff */ + const opus_int LTP_scale_Q14 /* I LTP state scaling */ +) +{ +#ifdef OPUS_CHECK_ASM + silk_nsq_state NSQ_c; + SideInfoIndices psIndices_c; + opus_int8 pulses_c[ MAX_FRAME_LENGTH ]; + const opus_int8 *const pulses_a = pulses; + + ( void )pulses_a; + silk_memcpy( &NSQ_c, NSQ, sizeof( NSQ_c ) ); + silk_memcpy( &psIndices_c, psIndices, sizeof( psIndices_c ) ); + silk_memcpy( pulses_c, pulses, sizeof( pulses_c ) ); + silk_NSQ_del_dec_c( psEncC, &NSQ_c, &psIndices_c, x16, pulses_c, PredCoef_Q12, LTPCoef_Q14, AR_Q13, HarmShapeGain_Q14, Tilt_Q14, LF_shp_Q14, Gains_Q16, + pitchL, Lambda_Q10, LTP_scale_Q14 ); +#endif + + /* The optimization parallelizes the different delay decision states. */ + if(( psEncC->nStatesDelayedDecision > NEON_MAX_DEL_DEC_STATES ) || ( psEncC->nStatesDelayedDecision <= 2 )) { + /* NEON intrinsics optimization now can only parallelize up to 4 delay decision states. */ + /* If there are more states, C function is called, and this optimization must be expanded. */ + /* When the number of delay decision states is less than 3, there are penalties using this */ + /* optimization, and C function is called. */ + /* When the number of delay decision states is 2, it's better to specialize another */ + /* structure NSQ_del_dec2_struct and optimize with shorter NEON registers. (Low priority) */ + silk_NSQ_del_dec_c( psEncC, NSQ, psIndices, x16, pulses, PredCoef_Q12, LTPCoef_Q14, AR_Q13, HarmShapeGain_Q14, + Tilt_Q14, LF_shp_Q14, Gains_Q16, pitchL, Lambda_Q10, LTP_scale_Q14 ); + } else { + opus_int i, k, lag, start_idx, LSF_interpolation_flag, Winner_ind, subfr; + opus_int smpl_buf_idx, decisionDelay; + const opus_int16 *A_Q12, *B_Q14, *AR_shp_Q13; + opus_int16 *pxq; + VARDECL( opus_int32, sLTP_Q15 ); + VARDECL( opus_int16, sLTP ); + opus_int32 HarmShapeFIRPacked_Q14; + opus_int offset_Q10; + opus_int32 RDmin_Q10, Gain_Q10; + VARDECL( opus_int32, x_sc_Q10 ); + VARDECL( opus_int32, delayedGain_Q10 ); + VARDECL( NSQ_del_decs_struct, psDelDec ); + int32x4_t t_s32x4; + SAVE_STACK; + + /* Set unvoiced lag to the previous one, overwrite later for voiced */ + lag = NSQ->lagPrev; + + silk_assert( NSQ->prev_gain_Q16 != 0 ); + + /* Initialize delayed decision states */ + ALLOC( psDelDec, 1, NSQ_del_decs_struct ); + /* Only RandState and RD_Q10 need to be initialized to 0. */ + silk_memset( psDelDec->RandState, 0, sizeof( psDelDec->RandState ) ); + vst1q_s32( psDelDec->RD_Q10, vdupq_n_s32( 0 ) ); + + for( k = 0; k < psEncC->nStatesDelayedDecision; k++ ) { + psDelDec->SeedInit[ k ] = psDelDec->Seed[ k ] = ( k + psIndices->Seed ) & 3; + } + vst1q_s32( psDelDec->LF_AR_Q14, vld1q_dup_s32( &NSQ->sLF_AR_shp_Q14 ) ); + vst1q_s32( psDelDec->Diff_Q14, vld1q_dup_s32( &NSQ->sDiff_shp_Q14 ) ); + vst1q_s32( psDelDec->Shape_Q14[ 0 ], vld1q_dup_s32( &NSQ->sLTP_shp_Q14[ psEncC->ltp_mem_length - 1 ] ) ); + for( i = 0; i < NSQ_LPC_BUF_LENGTH; i++ ) { + vst1q_s32( psDelDec->sLPC_Q14[ i ], vld1q_dup_s32( &NSQ->sLPC_Q14[ i ] ) ); + } + for( i = 0; i < (opus_int)( sizeof( NSQ->sAR2_Q14 ) / sizeof( NSQ->sAR2_Q14[ 0 ] ) ); i++ ) { + vst1q_s32( psDelDec->sAR2_Q14[ i ], vld1q_dup_s32( &NSQ->sAR2_Q14[ i ] ) ); + } + + offset_Q10 = silk_Quantization_Offsets_Q10[ psIndices->signalType >> 1 ][ psIndices->quantOffsetType ]; + smpl_buf_idx = 0; /* index of oldest samples */ + + decisionDelay = silk_min_int( DECISION_DELAY, psEncC->subfr_length ); + + /* For voiced frames limit the decision delay to lower than the pitch lag */ + if( psIndices->signalType == TYPE_VOICED ) { + opus_int pitch_min = pitchL[ 0 ]; + for( k = 1; k < psEncC->nb_subfr; k++ ) { + pitch_min = silk_min_int( pitch_min, pitchL[ k ] ); + } + decisionDelay = silk_min_int( decisionDelay, pitch_min - LTP_ORDER / 2 - 1 ); + } else { + if( lag > 0 ) { + decisionDelay = silk_min_int( decisionDelay, lag - LTP_ORDER / 2 - 1 ); + } + } + + if( psIndices->NLSFInterpCoef_Q2 == 4 ) { + LSF_interpolation_flag = 0; + } else { + LSF_interpolation_flag = 1; + } + + ALLOC( sLTP_Q15, psEncC->ltp_mem_length + psEncC->frame_length, opus_int32 ); + ALLOC( sLTP, psEncC->ltp_mem_length + psEncC->frame_length, opus_int16 ); + ALLOC( x_sc_Q10, psEncC->subfr_length, opus_int32 ); + ALLOC( delayedGain_Q10, DECISION_DELAY, opus_int32 ); + /* Set up pointers to start of sub frame */ + pxq = &NSQ->xq[ psEncC->ltp_mem_length ]; + NSQ->sLTP_shp_buf_idx = psEncC->ltp_mem_length; + NSQ->sLTP_buf_idx = psEncC->ltp_mem_length; + subfr = 0; + for( k = 0; k < psEncC->nb_subfr; k++ ) { + A_Q12 = &PredCoef_Q12[ ( ( k >> 1 ) | ( 1 - LSF_interpolation_flag ) ) * MAX_LPC_ORDER ]; + B_Q14 = <PCoef_Q14[ k * LTP_ORDER ]; + AR_shp_Q13 = &AR_Q13[ k * MAX_SHAPE_LPC_ORDER ]; + + /* Noise shape parameters */ + silk_assert( HarmShapeGain_Q14[ k ] >= 0 ); + HarmShapeFIRPacked_Q14 = silk_RSHIFT( HarmShapeGain_Q14[ k ], 2 ); + HarmShapeFIRPacked_Q14 |= silk_LSHIFT( (opus_int32)silk_RSHIFT( HarmShapeGain_Q14[ k ], 1 ), 16 ); + + NSQ->rewhite_flag = 0; + if( psIndices->signalType == TYPE_VOICED ) { + /* Voiced */ + lag = pitchL[ k ]; + + /* Re-whitening */ + if( ( k & ( 3 - silk_LSHIFT( LSF_interpolation_flag, 1 ) ) ) == 0 ) { + if( k == 2 ) { + /* RESET DELAYED DECISIONS */ + /* Find winner */ + int32x4_t RD_Q10_s32x4; + RDmin_Q10 = psDelDec->RD_Q10[ 0 ]; + Winner_ind = 0; + for( i = 1; i < psEncC->nStatesDelayedDecision; i++ ) { + if( psDelDec->RD_Q10[ i ] < RDmin_Q10 ) { + RDmin_Q10 = psDelDec->RD_Q10[ i ]; + Winner_ind = i; + } + } + psDelDec->RD_Q10[ Winner_ind ] -= ( silk_int32_MAX >> 4 ); + RD_Q10_s32x4 = vld1q_s32( psDelDec->RD_Q10 ); + RD_Q10_s32x4 = vaddq_s32( RD_Q10_s32x4, vdupq_n_s32( silk_int32_MAX >> 4 ) ); + vst1q_s32( psDelDec->RD_Q10, RD_Q10_s32x4 ); + + /* Copy final part of signals from winner state to output and long-term filter states */ + copy_winner_state( psDelDec, decisionDelay, smpl_buf_idx, Winner_ind, Gains_Q16[ 1 ], 14, pulses, pxq, NSQ ); + + subfr = 0; + } + + /* Rewhiten with new A coefs */ + start_idx = psEncC->ltp_mem_length - lag - psEncC->predictLPCOrder - LTP_ORDER / 2; + silk_assert( start_idx > 0 ); + + silk_LPC_analysis_filter( &sLTP[ start_idx ], &NSQ->xq[ start_idx + k * psEncC->subfr_length ], + A_Q12, psEncC->ltp_mem_length - start_idx, psEncC->predictLPCOrder, psEncC->arch ); + + NSQ->sLTP_buf_idx = psEncC->ltp_mem_length; + NSQ->rewhite_flag = 1; + } + } + + silk_nsq_del_dec_scale_states_neon( psEncC, NSQ, psDelDec, x16, x_sc_Q10, sLTP, sLTP_Q15, k, + LTP_scale_Q14, Gains_Q16, pitchL, psIndices->signalType, decisionDelay ); + + silk_noise_shape_quantizer_del_dec_neon( NSQ, psDelDec, psIndices->signalType, x_sc_Q10, pulses, pxq, sLTP_Q15, + delayedGain_Q10, A_Q12, B_Q14, AR_shp_Q13, lag, HarmShapeFIRPacked_Q14, Tilt_Q14[ k ], LF_shp_Q14[ k ], + Gains_Q16[ k ], Lambda_Q10, offset_Q10, psEncC->subfr_length, subfr++, psEncC->shapingLPCOrder, + psEncC->predictLPCOrder, psEncC->warping_Q16, psEncC->nStatesDelayedDecision, &smpl_buf_idx, decisionDelay ); + + x16 += psEncC->subfr_length; + pulses += psEncC->subfr_length; + pxq += psEncC->subfr_length; + } + + /* Find winner */ + RDmin_Q10 = psDelDec->RD_Q10[ 0 ]; + Winner_ind = 0; + for( k = 1; k < psEncC->nStatesDelayedDecision; k++ ) { + if( psDelDec->RD_Q10[ k ] < RDmin_Q10 ) { + RDmin_Q10 = psDelDec->RD_Q10[ k ]; + Winner_ind = k; + } + } + + /* Copy final part of signals from winner state to output and long-term filter states */ + psIndices->Seed = psDelDec->SeedInit[ Winner_ind ]; + Gain_Q10 = silk_RSHIFT32( Gains_Q16[ psEncC->nb_subfr - 1 ], 6 ); + copy_winner_state( psDelDec, decisionDelay, smpl_buf_idx, Winner_ind, Gain_Q10, 8, pulses, pxq, NSQ ); + + t_s32x4 = vdupq_n_s32( 0 ); /* initialization */ + for( i = 0; i < ( NSQ_LPC_BUF_LENGTH - 3 ); i += 4 ) { + t_s32x4 = vld1q_lane_s32( &psDelDec->sLPC_Q14[ i + 0 ][ Winner_ind ], t_s32x4, 0 ); + t_s32x4 = vld1q_lane_s32( &psDelDec->sLPC_Q14[ i + 1 ][ Winner_ind ], t_s32x4, 1 ); + t_s32x4 = vld1q_lane_s32( &psDelDec->sLPC_Q14[ i + 2 ][ Winner_ind ], t_s32x4, 2 ); + t_s32x4 = vld1q_lane_s32( &psDelDec->sLPC_Q14[ i + 3 ][ Winner_ind ], t_s32x4, 3 ); + vst1q_s32( &NSQ->sLPC_Q14[ i ], t_s32x4 ); + } + + for( ; i < NSQ_LPC_BUF_LENGTH; i++ ) { + NSQ->sLPC_Q14[ i ] = psDelDec->sLPC_Q14[ i ][ Winner_ind ]; + } + + for( i = 0; i < (opus_int)( sizeof( NSQ->sAR2_Q14 ) / sizeof( NSQ->sAR2_Q14[ 0 ] ) - 3 ); i += 4 ) { + t_s32x4 = vld1q_lane_s32( &psDelDec->sAR2_Q14[ i + 0 ][ Winner_ind ], t_s32x4, 0 ); + t_s32x4 = vld1q_lane_s32( &psDelDec->sAR2_Q14[ i + 1 ][ Winner_ind ], t_s32x4, 1 ); + t_s32x4 = vld1q_lane_s32( &psDelDec->sAR2_Q14[ i + 2 ][ Winner_ind ], t_s32x4, 2 ); + t_s32x4 = vld1q_lane_s32( &psDelDec->sAR2_Q14[ i + 3 ][ Winner_ind ], t_s32x4, 3 ); + vst1q_s32( &NSQ->sAR2_Q14[ i ], t_s32x4 ); + } + + for( ; i < (opus_int)( sizeof( NSQ->sAR2_Q14 ) / sizeof( NSQ->sAR2_Q14[ 0 ] ) ); i++ ) { + NSQ->sAR2_Q14[ i ] = psDelDec->sAR2_Q14[ i ][ Winner_ind ]; + } + + /* Update states */ + NSQ->sLF_AR_shp_Q14 = psDelDec->LF_AR_Q14[ Winner_ind ]; + NSQ->sDiff_shp_Q14 = psDelDec->Diff_Q14[ Winner_ind ]; + NSQ->lagPrev = pitchL[ psEncC->nb_subfr - 1 ]; + + /* Save quantized speech signal */ + silk_memmove( NSQ->xq, &NSQ->xq[ psEncC->frame_length ], psEncC->ltp_mem_length * sizeof( opus_int16 ) ); + silk_memmove( NSQ->sLTP_shp_Q14, &NSQ->sLTP_shp_Q14[ psEncC->frame_length ], psEncC->ltp_mem_length * sizeof( opus_int32 ) ); + RESTORE_STACK; + } + +#ifdef OPUS_CHECK_ASM + silk_assert( !memcmp( &NSQ_c, NSQ, sizeof( NSQ_c ) ) ); + silk_assert( !memcmp( &psIndices_c, psIndices, sizeof( psIndices_c ) ) ); + silk_assert( !memcmp( pulses_c, pulses_a, sizeof( pulses_c ) ) ); +#endif +} + +/******************************************/ +/* Noise shape quantizer for one subframe */ +/******************************************/ +/* Note: Function silk_short_prediction_create_arch_coef_neon() defined in NSQ_neon.h is actually a hacking C function. */ +/* Therefore here we append "_local" to the NEON function name to avoid confusion. */ +static OPUS_INLINE void silk_short_prediction_create_arch_coef_neon_local(opus_int32 *out, const opus_int16 *in, opus_int order) +{ + int16x8_t t_s16x8; + int32x4_t t0_s32x4, t1_s32x4, t2_s32x4, t3_s32x4; + silk_assert( order == 10 || order == 16 ); + + t_s16x8 = vld1q_s16( in + 0 ); /* 7 6 5 4 3 2 1 0 */ + t_s16x8 = vrev64q_s16( t_s16x8 ); /* 4 5 6 7 0 1 2 3 */ + t2_s32x4 = vshll_n_s16( vget_high_s16( t_s16x8 ), 15 ); /* 4 5 6 7 */ + t3_s32x4 = vshll_n_s16( vget_low_s16( t_s16x8 ), 15 ); /* 0 1 2 3 */ + + if( order == 16 ) { + t_s16x8 = vld1q_s16( in + 8 ); /* F E D C B A 9 8 */ + t_s16x8 = vrev64q_s16( t_s16x8 ); /* C D E F 8 9 A B */ + t0_s32x4 = vshll_n_s16( vget_high_s16( t_s16x8 ), 15 ); /* C D E F */ + t1_s32x4 = vshll_n_s16( vget_low_s16( t_s16x8 ), 15 ); /* 8 9 A B */ + } else { + int16x4_t t_s16x4; + + t0_s32x4 = vdupq_n_s32( 0 ); /* zero zero zero zero */ + t_s16x4 = vld1_s16( in + 6 ); /* 9 8 7 6 */ + t_s16x4 = vrev64_s16( t_s16x4 ); /* 6 7 8 9 */ + t1_s32x4 = vshll_n_s16( t_s16x4, 15 ); + t1_s32x4 = vcombine_s32( vget_low_s32(t0_s32x4), vget_low_s32( t1_s32x4 ) ); /* 8 9 zero zero */ + } + vst1q_s32( out + 0, t0_s32x4 ); + vst1q_s32( out + 4, t1_s32x4 ); + vst1q_s32( out + 8, t2_s32x4 ); + vst1q_s32( out + 12, t3_s32x4 ); +} + +static OPUS_INLINE int32x4_t silk_SMLAWB_lane0_neon( + const int32x4_t out_s32x4, + const int32x4_t in_s32x4, + const int32x2_t coef_s32x2 +) +{ + return vaddq_s32( out_s32x4, vqdmulhq_lane_s32( in_s32x4, coef_s32x2, 0 ) ); +} + +static OPUS_INLINE int32x4_t silk_SMLAWB_lane1_neon( + const int32x4_t out_s32x4, + const int32x4_t in_s32x4, + const int32x2_t coef_s32x2 +) +{ + return vaddq_s32( out_s32x4, vqdmulhq_lane_s32( in_s32x4, coef_s32x2, 1 ) ); +} + +/* Note: This function has different return value than silk_noise_shape_quantizer_short_prediction_neon(). */ +/* Therefore here we append "_local" to the function name to avoid confusion. */ +static OPUS_INLINE int32x4_t silk_noise_shape_quantizer_short_prediction_neon_local(const opus_int32 *buf32, const opus_int32 *a_Q12_arch, opus_int order) +{ + const int32x4_t a_Q12_arch0_s32x4 = vld1q_s32( a_Q12_arch + 0 ); + const int32x4_t a_Q12_arch1_s32x4 = vld1q_s32( a_Q12_arch + 4 ); + const int32x4_t a_Q12_arch2_s32x4 = vld1q_s32( a_Q12_arch + 8 ); + const int32x4_t a_Q12_arch3_s32x4 = vld1q_s32( a_Q12_arch + 12 ); + int32x4_t LPC_pred_Q14_s32x4; + + silk_assert( order == 10 || order == 16 ); + /* Avoids introducing a bias because silk_SMLAWB() always rounds to -inf */ + LPC_pred_Q14_s32x4 = vdupq_n_s32( silk_RSHIFT( order, 1 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane0_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 0 * NEON_MAX_DEL_DEC_STATES ), vget_low_s32( a_Q12_arch0_s32x4 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane1_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 1 * NEON_MAX_DEL_DEC_STATES ), vget_low_s32( a_Q12_arch0_s32x4 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane0_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 2 * NEON_MAX_DEL_DEC_STATES ), vget_high_s32( a_Q12_arch0_s32x4 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane1_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 3 * NEON_MAX_DEL_DEC_STATES ), vget_high_s32( a_Q12_arch0_s32x4 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane0_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 4 * NEON_MAX_DEL_DEC_STATES ), vget_low_s32( a_Q12_arch1_s32x4 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane1_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 5 * NEON_MAX_DEL_DEC_STATES ), vget_low_s32( a_Q12_arch1_s32x4 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane0_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 6 * NEON_MAX_DEL_DEC_STATES ), vget_high_s32( a_Q12_arch1_s32x4 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane1_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 7 * NEON_MAX_DEL_DEC_STATES ), vget_high_s32( a_Q12_arch1_s32x4 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane0_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 8 * NEON_MAX_DEL_DEC_STATES ), vget_low_s32( a_Q12_arch2_s32x4 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane1_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 9 * NEON_MAX_DEL_DEC_STATES ), vget_low_s32( a_Q12_arch2_s32x4 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane0_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 10 * NEON_MAX_DEL_DEC_STATES ), vget_high_s32( a_Q12_arch2_s32x4 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane1_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 11 * NEON_MAX_DEL_DEC_STATES ), vget_high_s32( a_Q12_arch2_s32x4 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane0_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 12 * NEON_MAX_DEL_DEC_STATES ), vget_low_s32( a_Q12_arch3_s32x4 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane1_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 13 * NEON_MAX_DEL_DEC_STATES ), vget_low_s32( a_Q12_arch3_s32x4 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane0_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 14 * NEON_MAX_DEL_DEC_STATES ), vget_high_s32( a_Q12_arch3_s32x4 ) ); + LPC_pred_Q14_s32x4 = silk_SMLAWB_lane1_neon( LPC_pred_Q14_s32x4, vld1q_s32( buf32 + 15 * NEON_MAX_DEL_DEC_STATES ), vget_high_s32( a_Q12_arch3_s32x4 ) ); + + return LPC_pred_Q14_s32x4; +} + +static OPUS_INLINE void silk_noise_shape_quantizer_del_dec_neon( + silk_nsq_state *NSQ, /* I/O NSQ state */ + NSQ_del_decs_struct psDelDec[], /* I/O Delayed decision states */ + opus_int signalType, /* I Signal type */ + const opus_int32 x_Q10[], /* I */ + opus_int8 pulses[], /* O */ + opus_int16 xq[], /* O */ + opus_int32 sLTP_Q15[], /* I/O LTP filter state */ + opus_int32 delayedGain_Q10[], /* I/O Gain delay buffer */ + const opus_int16 a_Q12[], /* I Short term prediction coefs */ + const opus_int16 b_Q14[], /* I Long term prediction coefs */ + const opus_int16 AR_shp_Q13[], /* I Noise shaping coefs */ + opus_int lag, /* I Pitch lag */ + opus_int32 HarmShapeFIRPacked_Q14, /* I */ + opus_int Tilt_Q14, /* I Spectral tilt */ + opus_int32 LF_shp_Q14, /* I */ + opus_int32 Gain_Q16, /* I */ + opus_int Lambda_Q10, /* I */ + opus_int offset_Q10, /* I */ + opus_int length, /* I Input length */ + opus_int subfr, /* I Subframe number */ + opus_int shapingLPCOrder, /* I Shaping LPC filter order */ + opus_int predictLPCOrder, /* I Prediction filter order */ + opus_int warping_Q16, /* I */ + opus_int nStatesDelayedDecision, /* I Number of states in decision tree */ + opus_int *smpl_buf_idx, /* I/O Index to newest samples in buffers */ + opus_int decisionDelay /* I */ +) +{ + opus_int i, j, k, Winner_ind, RDmin_ind, RDmax_ind, last_smple_idx; + opus_int32 Winner_rand_state; + opus_int32 LTP_pred_Q14, n_LTP_Q14; + opus_int32 RDmin_Q10, RDmax_Q10; + opus_int32 Gain_Q10; + opus_int32 *pred_lag_ptr, *shp_lag_ptr; + opus_int32 a_Q12_arch[MAX_LPC_ORDER]; + const int32x2_t warping_Q16_s32x2 = vdup_n_s32( silk_LSHIFT32( warping_Q16, 16 ) >> 1 ); + const opus_int32 LF_shp_Q29 = silk_LSHIFT32( LF_shp_Q14, 16 ) >> 1; + opus_int32 AR_shp_Q28[ MAX_SHAPE_LPC_ORDER ]; + const uint32x4_t rand_multiplier_u32x4 = vdupq_n_u32( RAND_MULTIPLIER ); + const uint32x4_t rand_increment_u32x4 = vdupq_n_u32( RAND_INCREMENT ); + + VARDECL( NSQ_samples_struct, psSampleState ); + SAVE_STACK; + + silk_assert( nStatesDelayedDecision > 0 ); + silk_assert( ( shapingLPCOrder & 1 ) == 0 ); /* check that order is even */ + ALLOC( psSampleState, 2, NSQ_samples_struct ); + + shp_lag_ptr = &NSQ->sLTP_shp_Q14[ NSQ->sLTP_shp_buf_idx - lag + HARM_SHAPE_FIR_TAPS / 2 ]; + pred_lag_ptr = &sLTP_Q15[ NSQ->sLTP_buf_idx - lag + LTP_ORDER / 2 ]; + Gain_Q10 = silk_RSHIFT( Gain_Q16, 6 ); + + for( i = 0; i < ( MAX_SHAPE_LPC_ORDER - 7 ); i += 8 ) { + const int16x8_t t_s16x8 = vld1q_s16( AR_shp_Q13 + i ); + vst1q_s32( AR_shp_Q28 + i + 0, vshll_n_s16( vget_low_s16( t_s16x8 ), 15 ) ); + vst1q_s32( AR_shp_Q28 + i + 4, vshll_n_s16( vget_high_s16( t_s16x8 ), 15 ) ); + } + + for( ; i < MAX_SHAPE_LPC_ORDER; i++ ) { + AR_shp_Q28[i] = silk_LSHIFT32( AR_shp_Q13[i], 15 ); + } + + silk_short_prediction_create_arch_coef_neon_local( a_Q12_arch, a_Q12, predictLPCOrder ); + + for( i = 0; i < length; i++ ) { + int32x4_t Seed_s32x4, LPC_pred_Q14_s32x4; + int32x4_t sign_s32x4, tmp1_s32x4, tmp2_s32x4; + int32x4_t n_AR_Q14_s32x4, n_LF_Q14_s32x4; + int32x2_t AR_shp_Q28_s32x2; + int16x4_t r_Q10_s16x4, rr_Q10_s16x4; + + /* Perform common calculations used in all states */ + + /* Long-term prediction */ + if( signalType == TYPE_VOICED ) { + /* Unrolled loop */ + /* Avoids introducing a bias because silk_SMLAWB() always rounds to -inf */ + LTP_pred_Q14 = 2; + LTP_pred_Q14 = silk_SMLAWB( LTP_pred_Q14, pred_lag_ptr[ 0 ], b_Q14[ 0 ] ); + LTP_pred_Q14 = silk_SMLAWB( LTP_pred_Q14, pred_lag_ptr[ -1 ], b_Q14[ 1 ] ); + LTP_pred_Q14 = silk_SMLAWB( LTP_pred_Q14, pred_lag_ptr[ -2 ], b_Q14[ 2 ] ); + LTP_pred_Q14 = silk_SMLAWB( LTP_pred_Q14, pred_lag_ptr[ -3 ], b_Q14[ 3 ] ); + LTP_pred_Q14 = silk_SMLAWB( LTP_pred_Q14, pred_lag_ptr[ -4 ], b_Q14[ 4 ] ); + LTP_pred_Q14 = silk_LSHIFT( LTP_pred_Q14, 1 ); /* Q13 -> Q14 */ + pred_lag_ptr++; + } else { + LTP_pred_Q14 = 0; + } + + /* Long-term shaping */ + if( lag > 0 ) { + /* Symmetric, packed FIR coefficients */ + n_LTP_Q14 = silk_SMULWB( silk_ADD32( shp_lag_ptr[ 0 ], shp_lag_ptr[ -2 ] ), HarmShapeFIRPacked_Q14 ); + n_LTP_Q14 = silk_SMLAWT( n_LTP_Q14, shp_lag_ptr[ -1 ], HarmShapeFIRPacked_Q14 ); + n_LTP_Q14 = silk_SUB_LSHIFT32( LTP_pred_Q14, n_LTP_Q14, 2 ); /* Q12 -> Q14 */ + shp_lag_ptr++; + } else { + n_LTP_Q14 = 0; + } + + /* Generate dither */ + Seed_s32x4 = vld1q_s32( psDelDec->Seed ); + Seed_s32x4 = vreinterpretq_s32_u32( vmlaq_u32( rand_increment_u32x4, vreinterpretq_u32_s32( Seed_s32x4 ), rand_multiplier_u32x4 ) ); + vst1q_s32( psDelDec->Seed, Seed_s32x4 ); + + /* Short-term prediction */ + LPC_pred_Q14_s32x4 = silk_noise_shape_quantizer_short_prediction_neon_local(psDelDec->sLPC_Q14[ NSQ_LPC_BUF_LENGTH - 16 + i ], a_Q12_arch, predictLPCOrder); + LPC_pred_Q14_s32x4 = vshlq_n_s32( LPC_pred_Q14_s32x4, 4 ); /* Q10 -> Q14 */ + + /* Noise shape feedback */ + /* Output of lowpass section */ + tmp2_s32x4 = silk_SMLAWB_lane0_neon( vld1q_s32( psDelDec->Diff_Q14 ), vld1q_s32( psDelDec->sAR2_Q14[ 0 ] ), warping_Q16_s32x2 ); + /* Output of allpass section */ + tmp1_s32x4 = vsubq_s32( vld1q_s32( psDelDec->sAR2_Q14[ 1 ] ), tmp2_s32x4 ); + tmp1_s32x4 = silk_SMLAWB_lane0_neon( vld1q_s32( psDelDec->sAR2_Q14[ 0 ] ), tmp1_s32x4, warping_Q16_s32x2 ); + vst1q_s32( psDelDec->sAR2_Q14[ 0 ], tmp2_s32x4 ); + AR_shp_Q28_s32x2 = vld1_s32( AR_shp_Q28 ); + n_AR_Q14_s32x4 = vaddq_s32( vdupq_n_s32( silk_RSHIFT( shapingLPCOrder, 1 ) ), vqdmulhq_lane_s32( tmp2_s32x4, AR_shp_Q28_s32x2, 0 ) ); + + /* Loop over allpass sections */ + for( j = 2; j < shapingLPCOrder; j += 2 ) { + /* Output of allpass section */ + tmp2_s32x4 = vsubq_s32( vld1q_s32( psDelDec->sAR2_Q14[ j + 0 ] ), tmp1_s32x4 ); + tmp2_s32x4 = silk_SMLAWB_lane0_neon( vld1q_s32( psDelDec->sAR2_Q14[ j - 1 ] ), tmp2_s32x4, warping_Q16_s32x2 ); + vst1q_s32( psDelDec->sAR2_Q14[ j - 1 ], tmp1_s32x4 ); + n_AR_Q14_s32x4 = vaddq_s32( n_AR_Q14_s32x4, vqdmulhq_lane_s32( tmp1_s32x4, AR_shp_Q28_s32x2, 1 ) ); + /* Output of allpass section */ + tmp1_s32x4 = vsubq_s32( vld1q_s32( psDelDec->sAR2_Q14[ j + 1 ] ), tmp2_s32x4 ); + tmp1_s32x4 = silk_SMLAWB_lane0_neon( vld1q_s32( psDelDec->sAR2_Q14[ j + 0 ] ), tmp1_s32x4, warping_Q16_s32x2 ); + vst1q_s32( psDelDec->sAR2_Q14[ j + 0 ], tmp2_s32x4 ); + AR_shp_Q28_s32x2 = vld1_s32( &AR_shp_Q28[ j ] ); + n_AR_Q14_s32x4 = vaddq_s32( n_AR_Q14_s32x4, vqdmulhq_lane_s32( tmp2_s32x4, AR_shp_Q28_s32x2, 0 ) ); + } + vst1q_s32( psDelDec->sAR2_Q14[ shapingLPCOrder - 1 ], tmp1_s32x4 ); + n_AR_Q14_s32x4 = vaddq_s32( n_AR_Q14_s32x4, vqdmulhq_lane_s32( tmp1_s32x4, AR_shp_Q28_s32x2, 1 ) ); + n_AR_Q14_s32x4 = vshlq_n_s32( n_AR_Q14_s32x4, 1 ); /* Q11 -> Q12 */ + n_AR_Q14_s32x4 = vaddq_s32( n_AR_Q14_s32x4, vqdmulhq_n_s32( vld1q_s32( psDelDec->LF_AR_Q14 ), silk_LSHIFT32( Tilt_Q14, 16 ) >> 1 ) ); /* Q12 */ + n_AR_Q14_s32x4 = vshlq_n_s32( n_AR_Q14_s32x4, 2 ); /* Q12 -> Q14 */ + n_LF_Q14_s32x4 = vqdmulhq_n_s32( vld1q_s32( psDelDec->Shape_Q14[ *smpl_buf_idx ] ), LF_shp_Q29 ); /* Q12 */ + n_LF_Q14_s32x4 = vaddq_s32( n_LF_Q14_s32x4, vqdmulhq_n_s32( vld1q_s32( psDelDec->LF_AR_Q14 ), silk_LSHIFT32( LF_shp_Q14 >> 16 , 15 ) ) ); /* Q12 */ + n_LF_Q14_s32x4 = vshlq_n_s32( n_LF_Q14_s32x4, 2 ); /* Q12 -> Q14 */ + + /* Input minus prediction plus noise feedback */ + /* r = x[ i ] - LTP_pred - LPC_pred + n_AR + n_Tilt + n_LF + n_LTP */ + tmp1_s32x4 = vaddq_s32( n_AR_Q14_s32x4, n_LF_Q14_s32x4 ); /* Q14 */ + tmp2_s32x4 = vaddq_s32( vdupq_n_s32( n_LTP_Q14 ), LPC_pred_Q14_s32x4 ); /* Q13 */ + tmp1_s32x4 = vsubq_s32( tmp2_s32x4, tmp1_s32x4 ); /* Q13 */ + tmp1_s32x4 = vrshrq_n_s32( tmp1_s32x4, 4 ); /* Q10 */ + tmp1_s32x4 = vsubq_s32( vdupq_n_s32( x_Q10[ i ] ), tmp1_s32x4 ); /* residual error Q10 */ + + /* Flip sign depending on dither */ + sign_s32x4 = vreinterpretq_s32_u32( vcltq_s32( Seed_s32x4, vdupq_n_s32( 0 ) ) ); + tmp1_s32x4 = veorq_s32( tmp1_s32x4, sign_s32x4 ); + tmp1_s32x4 = vsubq_s32( tmp1_s32x4, sign_s32x4 ); + tmp1_s32x4 = vmaxq_s32( tmp1_s32x4, vdupq_n_s32( -( 31 << 10 ) ) ); + tmp1_s32x4 = vminq_s32( tmp1_s32x4, vdupq_n_s32( 30 << 10 ) ); + r_Q10_s16x4 = vmovn_s32( tmp1_s32x4 ); + + /* Find two quantization level candidates and measure their rate-distortion */ + { + int16x4_t q1_Q10_s16x4 = vsub_s16( r_Q10_s16x4, vdup_n_s16( offset_Q10 ) ); + int16x4_t q1_Q0_s16x4 = vshr_n_s16( q1_Q10_s16x4, 10 ); + int16x4_t q2_Q10_s16x4; + int32x4_t rd1_Q10_s32x4, rd2_Q10_s32x4; + uint32x4_t t_u32x4; + + if( Lambda_Q10 > 2048 ) { + /* For aggressive RDO, the bias becomes more than one pulse. */ + const int rdo_offset = Lambda_Q10/2 - 512; + const uint16x4_t greaterThanRdo = vcgt_s16( q1_Q10_s16x4, vdup_n_s16( rdo_offset ) ); + const uint16x4_t lessThanMinusRdo = vclt_s16( q1_Q10_s16x4, vdup_n_s16( -rdo_offset ) ); + /* If Lambda_Q10 > 32767, then q1_Q0, q1_Q10 and q2_Q10 must change to 32-bit. */ + silk_assert( Lambda_Q10 <= 32767 ); + + q1_Q0_s16x4 = vreinterpret_s16_u16( vclt_s16( q1_Q10_s16x4, vdup_n_s16( 0 ) ) ); + q1_Q0_s16x4 = vbsl_s16( greaterThanRdo, vsub_s16( q1_Q10_s16x4, vdup_n_s16( rdo_offset ) ), q1_Q0_s16x4 ); + q1_Q0_s16x4 = vbsl_s16( lessThanMinusRdo, vadd_s16( q1_Q10_s16x4, vdup_n_s16( rdo_offset ) ), q1_Q0_s16x4 ); + q1_Q0_s16x4 = vshr_n_s16( q1_Q0_s16x4, 10 ); + } + { + const uint16x4_t equal0_u16x4 = vceq_s16( q1_Q0_s16x4, vdup_n_s16( 0 ) ); + const uint16x4_t equalMinus1_u16x4 = vceq_s16( q1_Q0_s16x4, vdup_n_s16( -1 ) ); + const uint16x4_t lessThanMinus1_u16x4 = vclt_s16( q1_Q0_s16x4, vdup_n_s16( -1 ) ); + int16x4_t tmp1_s16x4, tmp2_s16x4; + + q1_Q10_s16x4 = vshl_n_s16( q1_Q0_s16x4, 10 ); + tmp1_s16x4 = vadd_s16( q1_Q10_s16x4, vdup_n_s16( offset_Q10 - QUANT_LEVEL_ADJUST_Q10 ) ); + q1_Q10_s16x4 = vadd_s16( q1_Q10_s16x4, vdup_n_s16( offset_Q10 + QUANT_LEVEL_ADJUST_Q10 ) ); + q1_Q10_s16x4 = vbsl_s16( lessThanMinus1_u16x4, q1_Q10_s16x4, tmp1_s16x4 ); + q1_Q10_s16x4 = vbsl_s16( equal0_u16x4, vdup_n_s16( offset_Q10 ), q1_Q10_s16x4 ); + q1_Q10_s16x4 = vbsl_s16( equalMinus1_u16x4, vdup_n_s16( offset_Q10 - ( 1024 - QUANT_LEVEL_ADJUST_Q10 ) ), q1_Q10_s16x4 ); + q2_Q10_s16x4 = vadd_s16( q1_Q10_s16x4, vdup_n_s16( 1024 ) ); + q2_Q10_s16x4 = vbsl_s16( equal0_u16x4, vdup_n_s16( offset_Q10 + 1024 - QUANT_LEVEL_ADJUST_Q10 ), q2_Q10_s16x4 ); + q2_Q10_s16x4 = vbsl_s16( equalMinus1_u16x4, vdup_n_s16( offset_Q10 ), q2_Q10_s16x4 ); + tmp1_s16x4 = q1_Q10_s16x4; + tmp2_s16x4 = q2_Q10_s16x4; + tmp1_s16x4 = vbsl_s16( vorr_u16( equalMinus1_u16x4, lessThanMinus1_u16x4 ), vneg_s16( tmp1_s16x4 ), tmp1_s16x4 ); + tmp2_s16x4 = vbsl_s16( lessThanMinus1_u16x4, vneg_s16( tmp2_s16x4 ), tmp2_s16x4 ); + rd1_Q10_s32x4 = vmull_s16( tmp1_s16x4, vdup_n_s16( Lambda_Q10 ) ); + rd2_Q10_s32x4 = vmull_s16( tmp2_s16x4, vdup_n_s16( Lambda_Q10 ) ); + } + + rr_Q10_s16x4 = vsub_s16( r_Q10_s16x4, q1_Q10_s16x4 ); + rd1_Q10_s32x4 = vmlal_s16( rd1_Q10_s32x4, rr_Q10_s16x4, rr_Q10_s16x4 ); + rd1_Q10_s32x4 = vshrq_n_s32( rd1_Q10_s32x4, 10 ); + + rr_Q10_s16x4 = vsub_s16( r_Q10_s16x4, q2_Q10_s16x4 ); + rd2_Q10_s32x4 = vmlal_s16( rd2_Q10_s32x4, rr_Q10_s16x4, rr_Q10_s16x4 ); + rd2_Q10_s32x4 = vshrq_n_s32( rd2_Q10_s32x4, 10 ); + + tmp2_s32x4 = vld1q_s32( psDelDec->RD_Q10 ); + tmp1_s32x4 = vaddq_s32( tmp2_s32x4, vminq_s32( rd1_Q10_s32x4, rd2_Q10_s32x4 ) ); + tmp2_s32x4 = vaddq_s32( tmp2_s32x4, vmaxq_s32( rd1_Q10_s32x4, rd2_Q10_s32x4 ) ); + vst1q_s32( psSampleState[ 0 ].RD_Q10, tmp1_s32x4 ); + vst1q_s32( psSampleState[ 1 ].RD_Q10, tmp2_s32x4 ); + t_u32x4 = vcltq_s32( rd1_Q10_s32x4, rd2_Q10_s32x4 ); + tmp1_s32x4 = vbslq_s32( t_u32x4, vmovl_s16( q1_Q10_s16x4 ), vmovl_s16( q2_Q10_s16x4 ) ); + tmp2_s32x4 = vbslq_s32( t_u32x4, vmovl_s16( q2_Q10_s16x4 ), vmovl_s16( q1_Q10_s16x4 ) ); + vst1q_s32( psSampleState[ 0 ].Q_Q10, tmp1_s32x4 ); + vst1q_s32( psSampleState[ 1 ].Q_Q10, tmp2_s32x4 ); + } + + { + /* Update states for best quantization */ + int32x4_t exc_Q14_s32x4, LPC_exc_Q14_s32x4, xq_Q14_s32x4, sLF_AR_shp_Q14_s32x4; + + /* Quantized excitation */ + exc_Q14_s32x4 = vshlq_n_s32( tmp1_s32x4, 4 ); + exc_Q14_s32x4 = veorq_s32( exc_Q14_s32x4, sign_s32x4 ); + exc_Q14_s32x4 = vsubq_s32( exc_Q14_s32x4, sign_s32x4 ); + + /* Add predictions */ + LPC_exc_Q14_s32x4 = vaddq_s32( exc_Q14_s32x4, vdupq_n_s32( LTP_pred_Q14 ) ); + xq_Q14_s32x4 = vaddq_s32( LPC_exc_Q14_s32x4, LPC_pred_Q14_s32x4 ); + + /* Update states */ + tmp1_s32x4 = vsubq_s32( xq_Q14_s32x4, vshlq_n_s32( vdupq_n_s32( x_Q10[ i ] ), 4 ) ); + vst1q_s32( psSampleState[ 0 ].Diff_Q14, tmp1_s32x4 ); + sLF_AR_shp_Q14_s32x4 = vsubq_s32( tmp1_s32x4, n_AR_Q14_s32x4 ); + vst1q_s32( psSampleState[ 0 ].sLTP_shp_Q14, vsubq_s32( sLF_AR_shp_Q14_s32x4, n_LF_Q14_s32x4 ) ); + vst1q_s32( psSampleState[ 0 ].LF_AR_Q14, sLF_AR_shp_Q14_s32x4 ); + vst1q_s32( psSampleState[ 0 ].LPC_exc_Q14, LPC_exc_Q14_s32x4 ); + vst1q_s32( psSampleState[ 0 ].xq_Q14, xq_Q14_s32x4 ); + + /* Quantized excitation */ + exc_Q14_s32x4 = vshlq_n_s32( tmp2_s32x4, 4 ); + exc_Q14_s32x4 = veorq_s32( exc_Q14_s32x4, sign_s32x4 ); + exc_Q14_s32x4 = vsubq_s32( exc_Q14_s32x4, sign_s32x4 ); + + /* Add predictions */ + LPC_exc_Q14_s32x4 = vaddq_s32( exc_Q14_s32x4, vdupq_n_s32( LTP_pred_Q14 ) ); + xq_Q14_s32x4 = vaddq_s32( LPC_exc_Q14_s32x4, LPC_pred_Q14_s32x4 ); + + /* Update states */ + tmp1_s32x4 = vsubq_s32( xq_Q14_s32x4, vshlq_n_s32( vdupq_n_s32( x_Q10[ i ] ), 4 ) ); + vst1q_s32( psSampleState[ 1 ].Diff_Q14, tmp1_s32x4 ); + sLF_AR_shp_Q14_s32x4 = vsubq_s32( tmp1_s32x4, n_AR_Q14_s32x4 ); + vst1q_s32( psSampleState[ 1 ].sLTP_shp_Q14, vsubq_s32( sLF_AR_shp_Q14_s32x4, n_LF_Q14_s32x4 ) ); + vst1q_s32( psSampleState[ 1 ].LF_AR_Q14, sLF_AR_shp_Q14_s32x4 ); + vst1q_s32( psSampleState[ 1 ].LPC_exc_Q14, LPC_exc_Q14_s32x4 ); + vst1q_s32( psSampleState[ 1 ].xq_Q14, xq_Q14_s32x4 ); + } + + *smpl_buf_idx = *smpl_buf_idx ? ( *smpl_buf_idx - 1 ) : ( DECISION_DELAY - 1); + last_smple_idx = *smpl_buf_idx + decisionDelay + DECISION_DELAY; + if( last_smple_idx >= DECISION_DELAY ) last_smple_idx -= DECISION_DELAY; + if( last_smple_idx >= DECISION_DELAY ) last_smple_idx -= DECISION_DELAY; + + /* Find winner */ + RDmin_Q10 = psSampleState[ 0 ].RD_Q10[ 0 ]; + Winner_ind = 0; + for( k = 1; k < nStatesDelayedDecision; k++ ) { + if( psSampleState[ 0 ].RD_Q10[ k ] < RDmin_Q10 ) { + RDmin_Q10 = psSampleState[ 0 ].RD_Q10[ k ]; + Winner_ind = k; + } + } + + /* Increase RD values of expired states */ + { + uint32x4_t t_u32x4; + Winner_rand_state = psDelDec->RandState[ last_smple_idx ][ Winner_ind ]; + t_u32x4 = vceqq_s32( vld1q_s32( psDelDec->RandState[ last_smple_idx ] ), vdupq_n_s32( Winner_rand_state ) ); + t_u32x4 = vmvnq_u32( t_u32x4 ); + t_u32x4 = vshrq_n_u32( t_u32x4, 5 ); + tmp1_s32x4 = vld1q_s32( psSampleState[ 0 ].RD_Q10 ); + tmp2_s32x4 = vld1q_s32( psSampleState[ 1 ].RD_Q10 ); + tmp1_s32x4 = vaddq_s32( tmp1_s32x4, vreinterpretq_s32_u32( t_u32x4 ) ); + tmp2_s32x4 = vaddq_s32( tmp2_s32x4, vreinterpretq_s32_u32( t_u32x4 ) ); + vst1q_s32( psSampleState[ 0 ].RD_Q10, tmp1_s32x4 ); + vst1q_s32( psSampleState[ 1 ].RD_Q10, tmp2_s32x4 ); + + /* Find worst in first set and best in second set */ + RDmax_Q10 = psSampleState[ 0 ].RD_Q10[ 0 ]; + RDmin_Q10 = psSampleState[ 1 ].RD_Q10[ 0 ]; + RDmax_ind = 0; + RDmin_ind = 0; + for( k = 1; k < nStatesDelayedDecision; k++ ) { + /* find worst in first set */ + if( psSampleState[ 0 ].RD_Q10[ k ] > RDmax_Q10 ) { + RDmax_Q10 = psSampleState[ 0 ].RD_Q10[ k ]; + RDmax_ind = k; + } + /* find best in second set */ + if( psSampleState[ 1 ].RD_Q10[ k ] < RDmin_Q10 ) { + RDmin_Q10 = psSampleState[ 1 ].RD_Q10[ k ]; + RDmin_ind = k; + } + } + } + + /* Replace a state if best from second set outperforms worst in first set */ + if( RDmin_Q10 < RDmax_Q10 ) { + opus_int32 (*ptr)[NEON_MAX_DEL_DEC_STATES] = psDelDec->RandState; + const int numOthers = (int)( ( sizeof( NSQ_del_decs_struct ) - sizeof( ( (NSQ_del_decs_struct *)0 )->sLPC_Q14 ) ) + / ( NEON_MAX_DEL_DEC_STATES * sizeof( opus_int32 ) ) ); + /* Only ( predictLPCOrder - 1 ) of sLPC_Q14 buffer need to be updated, though the first several */ + /* useless sLPC_Q14[] will be different comparing with C when predictLPCOrder < NSQ_LPC_BUF_LENGTH. */ + /* Here just update constant ( NSQ_LPC_BUF_LENGTH - 1 ) for simplicity. */ + for( j = i + 1; j < i + NSQ_LPC_BUF_LENGTH; j++ ) { + psDelDec->sLPC_Q14[ j ][ RDmax_ind ] = psDelDec->sLPC_Q14[ j ][ RDmin_ind ]; + } + for( j = 0; j < numOthers; j++ ) { + ptr[ j ][ RDmax_ind ] = ptr[ j ][ RDmin_ind ]; + } + + psSampleState[ 0 ].Q_Q10[ RDmax_ind ] = psSampleState[ 1 ].Q_Q10[ RDmin_ind ]; + psSampleState[ 0 ].RD_Q10[ RDmax_ind ] = psSampleState[ 1 ].RD_Q10[ RDmin_ind ]; + psSampleState[ 0 ].xq_Q14[ RDmax_ind ] = psSampleState[ 1 ].xq_Q14[ RDmin_ind ]; + psSampleState[ 0 ].LF_AR_Q14[ RDmax_ind ] = psSampleState[ 1 ].LF_AR_Q14[ RDmin_ind ]; + psSampleState[ 0 ].Diff_Q14[ RDmax_ind ] = psSampleState[ 1 ].Diff_Q14[ RDmin_ind ]; + psSampleState[ 0 ].sLTP_shp_Q14[ RDmax_ind ] = psSampleState[ 1 ].sLTP_shp_Q14[ RDmin_ind ]; + psSampleState[ 0 ].LPC_exc_Q14[ RDmax_ind ] = psSampleState[ 1 ].LPC_exc_Q14[ RDmin_ind ]; + } + + /* Write samples from winner to output and long-term filter states */ + if( subfr > 0 || i >= decisionDelay ) { + pulses[ i - decisionDelay ] = (opus_int8)silk_RSHIFT_ROUND( psDelDec->Q_Q10[ last_smple_idx ][ Winner_ind ], 10 ); + xq[ i - decisionDelay ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( + silk_SMULWW( psDelDec->Xq_Q14[ last_smple_idx ][ Winner_ind ], delayedGain_Q10[ last_smple_idx ] ), 8 ) ); + NSQ->sLTP_shp_Q14[ NSQ->sLTP_shp_buf_idx - decisionDelay ] = psDelDec->Shape_Q14[ last_smple_idx ][ Winner_ind ]; + sLTP_Q15[ NSQ->sLTP_buf_idx - decisionDelay ] = psDelDec->Pred_Q15[ last_smple_idx ][ Winner_ind ]; + } + NSQ->sLTP_shp_buf_idx++; + NSQ->sLTP_buf_idx++; + + /* Update states */ + vst1q_s32( psDelDec->LF_AR_Q14, vld1q_s32( psSampleState[ 0 ].LF_AR_Q14 ) ); + vst1q_s32( psDelDec->Diff_Q14, vld1q_s32( psSampleState[ 0 ].Diff_Q14 ) ); + vst1q_s32( psDelDec->sLPC_Q14[ NSQ_LPC_BUF_LENGTH + i ], vld1q_s32( psSampleState[ 0 ].xq_Q14 ) ); + vst1q_s32( psDelDec->Xq_Q14[ *smpl_buf_idx ], vld1q_s32( psSampleState[ 0 ].xq_Q14 ) ); + tmp1_s32x4 = vld1q_s32( psSampleState[ 0 ].Q_Q10 ); + vst1q_s32( psDelDec->Q_Q10[ *smpl_buf_idx ], tmp1_s32x4 ); + vst1q_s32( psDelDec->Pred_Q15[ *smpl_buf_idx ], vshlq_n_s32( vld1q_s32( psSampleState[ 0 ].LPC_exc_Q14 ), 1 ) ); + vst1q_s32( psDelDec->Shape_Q14[ *smpl_buf_idx ], vld1q_s32( psSampleState[ 0 ].sLTP_shp_Q14 ) ); + tmp1_s32x4 = vrshrq_n_s32( tmp1_s32x4, 10 ); + tmp1_s32x4 = vaddq_s32( vld1q_s32( psDelDec->Seed ), tmp1_s32x4 ); + vst1q_s32( psDelDec->Seed, tmp1_s32x4 ); + vst1q_s32( psDelDec->RandState[ *smpl_buf_idx ], tmp1_s32x4 ); + vst1q_s32( psDelDec->RD_Q10, vld1q_s32( psSampleState[ 0 ].RD_Q10 ) ); + delayedGain_Q10[ *smpl_buf_idx ] = Gain_Q10; + } + /* Update LPC states */ + silk_memcpy( psDelDec->sLPC_Q14[ 0 ], psDelDec->sLPC_Q14[ length ], NEON_MAX_DEL_DEC_STATES * NSQ_LPC_BUF_LENGTH * sizeof( opus_int32 ) ); + + RESTORE_STACK; +} + +static OPUS_INLINE void silk_SMULWB_8_neon( + const opus_int16 *a, + const int32x2_t b, + opus_int32 *o +) +{ + const int16x8_t a_s16x8 = vld1q_s16( a ); + int32x4_t o0_s32x4, o1_s32x4; + + o0_s32x4 = vshll_n_s16( vget_low_s16( a_s16x8 ), 15 ); + o1_s32x4 = vshll_n_s16( vget_high_s16( a_s16x8 ), 15 ); + o0_s32x4 = vqdmulhq_lane_s32( o0_s32x4, b, 0 ); + o1_s32x4 = vqdmulhq_lane_s32( o1_s32x4, b, 0 ); + vst1q_s32( o, o0_s32x4 ); + vst1q_s32( o + 4, o1_s32x4 ); +} + +/* Only works when ( b >= -65536 ) && ( b < 65536 ). */ +static OPUS_INLINE void silk_SMULWW_small_b_4_neon( + opus_int32 *a, + const int32x2_t b_s32x2) +{ + int32x4_t o_s32x4; + + o_s32x4 = vld1q_s32( a ); + o_s32x4 = vqdmulhq_lane_s32( o_s32x4, b_s32x2, 0 ); + vst1q_s32( a, o_s32x4 ); +} + +/* Only works when ( b >= -65536 ) && ( b < 65536 ). */ +static OPUS_INLINE void silk_SMULWW_small_b_8_neon( + opus_int32 *a, + const int32x2_t b_s32x2 +) +{ + int32x4_t o0_s32x4, o1_s32x4; + + o0_s32x4 = vld1q_s32( a ); + o1_s32x4 = vld1q_s32( a + 4 ); + o0_s32x4 = vqdmulhq_lane_s32( o0_s32x4, b_s32x2, 0 ); + o1_s32x4 = vqdmulhq_lane_s32( o1_s32x4, b_s32x2, 0 ); + vst1q_s32( a, o0_s32x4 ); + vst1q_s32( a + 4, o1_s32x4 ); +} + +static OPUS_INLINE void silk_SMULWW_4_neon( + opus_int32 *a, + const int32x2_t b_s32x2) +{ + int32x4_t a_s32x4, o_s32x4; + + a_s32x4 = vld1q_s32( a ); + o_s32x4 = vqdmulhq_lane_s32( a_s32x4, b_s32x2, 0 ); + o_s32x4 = vmlaq_lane_s32( o_s32x4, a_s32x4, b_s32x2, 1 ); + vst1q_s32( a, o_s32x4 ); +} + +static OPUS_INLINE void silk_SMULWW_8_neon( + opus_int32 *a, + const int32x2_t b_s32x2 +) +{ + int32x4_t a0_s32x4, a1_s32x4, o0_s32x4, o1_s32x4; + + a0_s32x4 = vld1q_s32( a ); + a1_s32x4 = vld1q_s32( a + 4 ); + o0_s32x4 = vqdmulhq_lane_s32( a0_s32x4, b_s32x2, 0 ); + o1_s32x4 = vqdmulhq_lane_s32( a1_s32x4, b_s32x2, 0 ); + o0_s32x4 = vmlaq_lane_s32( o0_s32x4, a0_s32x4, b_s32x2, 1 ); + o1_s32x4 = vmlaq_lane_s32( o1_s32x4, a1_s32x4, b_s32x2, 1 ); + vst1q_s32( a, o0_s32x4 ); + vst1q_s32( a + 4, o1_s32x4 ); +} + +static OPUS_INLINE void silk_SMULWW_loop_neon( + const opus_int16 *a, + const opus_int32 b, + opus_int32 *o, + const opus_int loop_num +) +{ + opus_int i; + int32x2_t b_s32x2; + + b_s32x2 = vdup_n_s32( b ); + for( i = 0; i < loop_num - 7; i += 8 ) { + silk_SMULWB_8_neon( a + i, b_s32x2, o + i ); + } + for( ; i < loop_num; i++ ) { + o[ i ] = silk_SMULWW( a[ i ], b ); + } +} + +static OPUS_INLINE void silk_nsq_del_dec_scale_states_neon( + const silk_encoder_state *psEncC, /* I Encoder State */ + silk_nsq_state *NSQ, /* I/O NSQ state */ + NSQ_del_decs_struct psDelDec[], /* I/O Delayed decision states */ + const opus_int16 x16[], /* I Input */ + opus_int32 x_sc_Q10[], /* O Input scaled with 1/Gain in Q10 */ + const opus_int16 sLTP[], /* I Re-whitened LTP state in Q0 */ + opus_int32 sLTP_Q15[], /* O LTP state matching scaled input */ + opus_int subfr, /* I Subframe number */ + const opus_int LTP_scale_Q14, /* I LTP state scaling */ + const opus_int32 Gains_Q16[ MAX_NB_SUBFR ], /* I */ + const opus_int pitchL[ MAX_NB_SUBFR ], /* I Pitch lag */ + const opus_int signal_type, /* I Signal type */ + const opus_int decisionDelay /* I Decision delay */ +) +{ + opus_int i, lag; + opus_int32 gain_adj_Q16, inv_gain_Q31, inv_gain_Q26; + + lag = pitchL[ subfr ]; + inv_gain_Q31 = silk_INVERSE32_varQ( silk_max( Gains_Q16[ subfr ], 1 ), 47 ); + silk_assert( inv_gain_Q31 != 0 ); + + /* Scale input */ + inv_gain_Q26 = silk_RSHIFT_ROUND( inv_gain_Q31, 5 ); + silk_SMULWW_loop_neon( x16, inv_gain_Q26, x_sc_Q10, psEncC->subfr_length ); + + /* After rewhitening the LTP state is un-scaled, so scale with inv_gain_Q16 */ + if( NSQ->rewhite_flag ) { + if( subfr == 0 ) { + /* Do LTP downscaling */ + inv_gain_Q31 = silk_LSHIFT( silk_SMULWB( inv_gain_Q31, LTP_scale_Q14 ), 2 ); + } + silk_SMULWW_loop_neon( sLTP + NSQ->sLTP_buf_idx - lag - LTP_ORDER / 2, inv_gain_Q31, sLTP_Q15 + NSQ->sLTP_buf_idx - lag - LTP_ORDER / 2, lag + LTP_ORDER / 2 ); + } + + /* Adjust for changing gain */ + if( Gains_Q16[ subfr ] != NSQ->prev_gain_Q16 ) { + int32x2_t gain_adj_Q16_s32x2; + gain_adj_Q16 = silk_DIV32_varQ( NSQ->prev_gain_Q16, Gains_Q16[ subfr ], 16 ); + + /* Scale long-term shaping state */ + if( ( gain_adj_Q16 >= -65536 ) && ( gain_adj_Q16 < 65536 ) ) { + gain_adj_Q16_s32x2 = vdup_n_s32( silk_LSHIFT32( gain_adj_Q16, 15 ) ); + for( i = NSQ->sLTP_shp_buf_idx - psEncC->ltp_mem_length; i < NSQ->sLTP_shp_buf_idx - 7; i += 8 ) { + silk_SMULWW_small_b_8_neon( NSQ->sLTP_shp_Q14 + i, gain_adj_Q16_s32x2 ); + } + for( ; i < NSQ->sLTP_shp_buf_idx; i++ ) { + NSQ->sLTP_shp_Q14[ i ] = silk_SMULWW( gain_adj_Q16, NSQ->sLTP_shp_Q14[ i ] ); + } + + /* Scale long-term prediction state */ + if( signal_type == TYPE_VOICED && NSQ->rewhite_flag == 0 ) { + for( i = NSQ->sLTP_buf_idx - lag - LTP_ORDER / 2; i < NSQ->sLTP_buf_idx - decisionDelay - 7; i += 8 ) { + silk_SMULWW_small_b_8_neon( sLTP_Q15 + i, gain_adj_Q16_s32x2 ); + } + for( ; i < NSQ->sLTP_buf_idx - decisionDelay; i++ ) { + sLTP_Q15[ i ] = silk_SMULWW( gain_adj_Q16, sLTP_Q15[ i ] ); + } + } + + /* Scale scalar states */ + silk_SMULWW_small_b_4_neon( psDelDec->LF_AR_Q14, gain_adj_Q16_s32x2 ); + silk_SMULWW_small_b_4_neon( psDelDec->Diff_Q14, gain_adj_Q16_s32x2 ); + + /* Scale short-term prediction and shaping states */ + for( i = 0; i < NSQ_LPC_BUF_LENGTH; i++ ) { + silk_SMULWW_small_b_4_neon( psDelDec->sLPC_Q14[ i ], gain_adj_Q16_s32x2 ); + } + + for( i = 0; i < MAX_SHAPE_LPC_ORDER; i++ ) { + silk_SMULWW_small_b_4_neon( psDelDec->sAR2_Q14[ i ], gain_adj_Q16_s32x2 ); + } + + for( i = 0; i < DECISION_DELAY; i++ ) { + silk_SMULWW_small_b_4_neon( psDelDec->Pred_Q15[ i ], gain_adj_Q16_s32x2 ); + silk_SMULWW_small_b_4_neon( psDelDec->Shape_Q14[ i ], gain_adj_Q16_s32x2 ); + } + } else { + gain_adj_Q16_s32x2 = vdup_n_s32( silk_LSHIFT32( gain_adj_Q16 & 0x0000FFFF, 15 ) ); + gain_adj_Q16_s32x2 = vset_lane_s32( gain_adj_Q16 >> 16, gain_adj_Q16_s32x2, 1 ); + for( i = NSQ->sLTP_shp_buf_idx - psEncC->ltp_mem_length; i < NSQ->sLTP_shp_buf_idx - 7; i += 8 ) { + silk_SMULWW_8_neon( NSQ->sLTP_shp_Q14 + i, gain_adj_Q16_s32x2 ); + } + for( ; i < NSQ->sLTP_shp_buf_idx; i++ ) { + NSQ->sLTP_shp_Q14[ i ] = silk_SMULWW( gain_adj_Q16, NSQ->sLTP_shp_Q14[ i ] ); + } + + /* Scale long-term prediction state */ + if( signal_type == TYPE_VOICED && NSQ->rewhite_flag == 0 ) { + for( i = NSQ->sLTP_buf_idx - lag - LTP_ORDER / 2; i < NSQ->sLTP_buf_idx - decisionDelay - 7; i += 8 ) { + silk_SMULWW_8_neon( sLTP_Q15 + i, gain_adj_Q16_s32x2 ); + } + for( ; i < NSQ->sLTP_buf_idx - decisionDelay; i++ ) { + sLTP_Q15[ i ] = silk_SMULWW( gain_adj_Q16, sLTP_Q15[ i ] ); + } + } + + /* Scale scalar states */ + silk_SMULWW_4_neon( psDelDec->LF_AR_Q14, gain_adj_Q16_s32x2 ); + silk_SMULWW_4_neon( psDelDec->Diff_Q14, gain_adj_Q16_s32x2 ); + + /* Scale short-term prediction and shaping states */ + for( i = 0; i < NSQ_LPC_BUF_LENGTH; i++ ) { + silk_SMULWW_4_neon( psDelDec->sLPC_Q14[ i ], gain_adj_Q16_s32x2 ); + } + + for( i = 0; i < MAX_SHAPE_LPC_ORDER; i++ ) { + silk_SMULWW_4_neon( psDelDec->sAR2_Q14[ i ], gain_adj_Q16_s32x2 ); + } + + for( i = 0; i < DECISION_DELAY; i++ ) { + silk_SMULWW_4_neon( psDelDec->Pred_Q15[ i ], gain_adj_Q16_s32x2 ); + silk_SMULWW_4_neon( psDelDec->Shape_Q14[ i ], gain_adj_Q16_s32x2 ); + } + } + + /* Save inverse gain */ + NSQ->prev_gain_Q16 = Gains_Q16[ subfr ]; + } +} diff --git a/thirdparty/opus/silk/arm/NSQ_neon.h b/thirdparty/opus/silk/arm/NSQ_neon.h index 77c946af85..b31d9442d6 100644 --- a/thirdparty/opus/silk/arm/NSQ_neon.h +++ b/thirdparty/opus/silk/arm/NSQ_neon.h @@ -28,30 +28,31 @@ POSSIBILITY OF SUCH DAMAGE. #define SILK_NSQ_NEON_H #include "cpu_support.h" +#include "SigProc_FIX.h" #undef silk_short_prediction_create_arch_coef /* For vectorized calc, reverse a_Q12 coefs, convert to 32-bit, and shift for vqdmulhq_s32. */ static OPUS_INLINE void silk_short_prediction_create_arch_coef_neon(opus_int32 *out, const opus_int16 *in, opus_int order) { - out[15] = in[0] << 15; - out[14] = in[1] << 15; - out[13] = in[2] << 15; - out[12] = in[3] << 15; - out[11] = in[4] << 15; - out[10] = in[5] << 15; - out[9] = in[6] << 15; - out[8] = in[7] << 15; - out[7] = in[8] << 15; - out[6] = in[9] << 15; + out[15] = silk_LSHIFT32(in[0], 15); + out[14] = silk_LSHIFT32(in[1], 15); + out[13] = silk_LSHIFT32(in[2], 15); + out[12] = silk_LSHIFT32(in[3], 15); + out[11] = silk_LSHIFT32(in[4], 15); + out[10] = silk_LSHIFT32(in[5], 15); + out[9] = silk_LSHIFT32(in[6], 15); + out[8] = silk_LSHIFT32(in[7], 15); + out[7] = silk_LSHIFT32(in[8], 15); + out[6] = silk_LSHIFT32(in[9], 15); if (order == 16) { - out[5] = in[10] << 15; - out[4] = in[11] << 15; - out[3] = in[12] << 15; - out[2] = in[13] << 15; - out[1] = in[14] << 15; - out[0] = in[15] << 15; + out[5] = silk_LSHIFT32(in[10], 15); + out[4] = silk_LSHIFT32(in[11], 15); + out[3] = silk_LSHIFT32(in[12], 15); + out[2] = silk_LSHIFT32(in[13], 15); + out[1] = silk_LSHIFT32(in[14], 15); + out[0] = silk_LSHIFT32(in[15], 15); } else { diff --git a/thirdparty/opus/silk/arm/arm_silk_map.c b/thirdparty/opus/silk/arm/arm_silk_map.c index 9bd86a7b21..0b9bfec2ca 100644 --- a/thirdparty/opus/silk/arm/arm_silk_map.c +++ b/thirdparty/opus/silk/arm/arm_silk_map.c @@ -28,13 +28,62 @@ POSSIBILITY OF SUCH DAMAGE. # include "config.h" #endif +#include "main_FIX.h" #include "NSQ.h" +#include "SigProc_FIX.h" #if defined(OPUS_HAVE_RTCD) # if (defined(OPUS_ARM_MAY_HAVE_NEON_INTR) && \ !defined(OPUS_ARM_PRESUME_NEON_INTR)) +void (*const SILK_BIQUAD_ALT_STRIDE2_IMPL[OPUS_ARCHMASK + 1])( + const opus_int16 *in, /* I input signal */ + const opus_int32 *B_Q28, /* I MA coefficients [3] */ + const opus_int32 *A_Q28, /* I AR coefficients [2] */ + opus_int32 *S, /* I/O State vector [4] */ + opus_int16 *out, /* O output signal */ + const opus_int32 len /* I signal length (must be even) */ +) = { + silk_biquad_alt_stride2_c, /* ARMv4 */ + silk_biquad_alt_stride2_c, /* EDSP */ + silk_biquad_alt_stride2_c, /* Media */ + silk_biquad_alt_stride2_neon, /* Neon */ +}; + +opus_int32 (*const SILK_LPC_INVERSE_PRED_GAIN_IMPL[OPUS_ARCHMASK + 1])( /* O Returns inverse prediction gain in energy domain, Q30 */ + const opus_int16 *A_Q12, /* I Prediction coefficients, Q12 [order] */ + const opus_int order /* I Prediction order */ +) = { + silk_LPC_inverse_pred_gain_c, /* ARMv4 */ + silk_LPC_inverse_pred_gain_c, /* EDSP */ + silk_LPC_inverse_pred_gain_c, /* Media */ + silk_LPC_inverse_pred_gain_neon, /* Neon */ +}; + +void (*const SILK_NSQ_DEL_DEC_IMPL[OPUS_ARCHMASK + 1])( + const silk_encoder_state *psEncC, /* I Encoder State */ + silk_nsq_state *NSQ, /* I/O NSQ state */ + SideInfoIndices *psIndices, /* I/O Quantization Indices */ + const opus_int16 x16[], /* I Input */ + opus_int8 pulses[], /* O Quantized pulse signal */ + const opus_int16 PredCoef_Q12[ 2 * MAX_LPC_ORDER ], /* I Short term prediction coefs */ + const opus_int16 LTPCoef_Q14[ LTP_ORDER * MAX_NB_SUBFR ], /* I Long term prediction coefs */ + const opus_int16 AR_Q13[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ], /* I Noise shaping coefs */ + const opus_int HarmShapeGain_Q14[ MAX_NB_SUBFR ], /* I Long term shaping coefs */ + const opus_int Tilt_Q14[ MAX_NB_SUBFR ], /* I Spectral tilt */ + const opus_int32 LF_shp_Q14[ MAX_NB_SUBFR ], /* I Low frequency shaping coefs */ + const opus_int32 Gains_Q16[ MAX_NB_SUBFR ], /* I Quantization step sizes */ + const opus_int pitchL[ MAX_NB_SUBFR ], /* I Pitch lags */ + const opus_int Lambda_Q10, /* I Rate/distortion tradeoff */ + const opus_int LTP_scale_Q14 /* I LTP state scaling */ +) = { + silk_NSQ_del_dec_c, /* ARMv4 */ + silk_NSQ_del_dec_c, /* EDSP */ + silk_NSQ_del_dec_c, /* Media */ + silk_NSQ_del_dec_neon, /* Neon */ +}; + /*There is no table for silk_noise_shape_quantizer_short_prediction because the NEON version takes different parameters than the C version. Instead RTCD is done via if statements at the call sites. @@ -52,4 +101,23 @@ opus_int32 # endif +# if defined(FIXED_POINT) && \ + defined(OPUS_ARM_MAY_HAVE_NEON_INTR) && !defined(OPUS_ARM_PRESUME_NEON_INTR) + +void (*const SILK_WARPED_AUTOCORRELATION_FIX_IMPL[OPUS_ARCHMASK + 1])( + opus_int32 *corr, /* O Result [order + 1] */ + opus_int *scale, /* O Scaling of the correlation vector */ + const opus_int16 *input, /* I Input data to correlate */ + const opus_int warping_Q16, /* I Warping coefficient */ + const opus_int length, /* I Length of input */ + const opus_int order /* I Correlation order (even) */ +) = { + silk_warped_autocorrelation_FIX_c, /* ARMv4 */ + silk_warped_autocorrelation_FIX_c, /* EDSP */ + silk_warped_autocorrelation_FIX_c, /* Media */ + silk_warped_autocorrelation_FIX_neon, /* Neon */ +}; + +# endif + #endif /* OPUS_HAVE_RTCD */ diff --git a/thirdparty/opus/silk/arm/biquad_alt_arm.h b/thirdparty/opus/silk/arm/biquad_alt_arm.h new file mode 100644 index 0000000000..66ea9f43dd --- /dev/null +++ b/thirdparty/opus/silk/arm/biquad_alt_arm.h @@ -0,0 +1,68 @@ +/*********************************************************************** +Copyright (c) 2017 Google Inc. +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +- Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. +- Neither the name of Internet Society, IETF or IETF Trust, nor the +names of specific contributors, may be used to endorse or promote +products derived from this software without specific prior written +permission. +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. +***********************************************************************/ + +#ifndef SILK_BIQUAD_ALT_ARM_H +# define SILK_BIQUAD_ALT_ARM_H + +# include "celt/arm/armcpu.h" + +# if defined(OPUS_ARM_MAY_HAVE_NEON_INTR) +void silk_biquad_alt_stride2_neon( + const opus_int16 *in, /* I input signal */ + const opus_int32 *B_Q28, /* I MA coefficients [3] */ + const opus_int32 *A_Q28, /* I AR coefficients [2] */ + opus_int32 *S, /* I/O State vector [4] */ + opus_int16 *out, /* O output signal */ + const opus_int32 len /* I signal length (must be even) */ +); + +# if !defined(OPUS_HAVE_RTCD) && defined(OPUS_ARM_PRESUME_NEON) +# define OVERRIDE_silk_biquad_alt_stride2 (1) +# define silk_biquad_alt_stride2(in, B_Q28, A_Q28, S, out, len, arch) ((void)(arch), PRESUME_NEON(silk_biquad_alt_stride2)(in, B_Q28, A_Q28, S, out, len)) +# endif +# endif + +# if !defined(OVERRIDE_silk_biquad_alt_stride2) +/*Is run-time CPU detection enabled on this platform?*/ +# if defined(OPUS_HAVE_RTCD) && (defined(OPUS_ARM_MAY_HAVE_NEON_INTR) && !defined(OPUS_ARM_PRESUME_NEON_INTR)) +extern void (*const SILK_BIQUAD_ALT_STRIDE2_IMPL[OPUS_ARCHMASK+1])( + const opus_int16 *in, /* I input signal */ + const opus_int32 *B_Q28, /* I MA coefficients [3] */ + const opus_int32 *A_Q28, /* I AR coefficients [2] */ + opus_int32 *S, /* I/O State vector [4] */ + opus_int16 *out, /* O output signal */ + const opus_int32 len /* I signal length (must be even) */ + ); +# define OVERRIDE_silk_biquad_alt_stride2 (1) +# define silk_biquad_alt_stride2(in, B_Q28, A_Q28, S, out, len, arch) ((*SILK_BIQUAD_ALT_STRIDE2_IMPL[(arch)&OPUS_ARCHMASK])(in, B_Q28, A_Q28, S, out, len)) +# elif defined(OPUS_ARM_PRESUME_NEON_INTR) +# define OVERRIDE_silk_biquad_alt_stride2 (1) +# define silk_biquad_alt_stride2(in, B_Q28, A_Q28, S, out, len, arch) ((void)(arch), silk_biquad_alt_stride2_neon(in, B_Q28, A_Q28, S, out, len)) +# endif +# endif + +#endif /* end SILK_BIQUAD_ALT_ARM_H */ diff --git a/thirdparty/opus/silk/arm/biquad_alt_neon_intr.c b/thirdparty/opus/silk/arm/biquad_alt_neon_intr.c new file mode 100644 index 0000000000..9715733185 --- /dev/null +++ b/thirdparty/opus/silk/arm/biquad_alt_neon_intr.c @@ -0,0 +1,156 @@ +/*********************************************************************** +Copyright (c) 2017 Google Inc. +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +- Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. +- Neither the name of Internet Society, IETF or IETF Trust, nor the +names of specific contributors, may be used to endorse or promote +products derived from this software without specific prior written +permission. +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. +***********************************************************************/ + +#ifdef HAVE_CONFIG_H +#include "config.h" +#endif + +#include <arm_neon.h> +#ifdef OPUS_CHECK_ASM +# include <string.h> +# include "stack_alloc.h" +#endif +#include "SigProc_FIX.h" + +static inline void silk_biquad_alt_stride2_kernel( const int32x4_t A_L_s32x4, const int32x4_t A_U_s32x4, const int32x4_t B_Q28_s32x4, const int32x2_t t_s32x2, const int32x4_t in_s32x4, int32x4_t *S_s32x4, int32x2_t *out32_Q14_s32x2 ) +{ + int32x4_t t_s32x4, out32_Q14_s32x4; + + *out32_Q14_s32x2 = vadd_s32( vget_low_s32( *S_s32x4 ), t_s32x2 ); /* silk_SMLAWB( S{0,1}, B_Q28[ 0 ], in{0,1} ) */ + *S_s32x4 = vcombine_s32( vget_high_s32( *S_s32x4 ), vdup_n_s32( 0 ) ); /* S{0,1} = S{2,3}; S{2,3} = 0; */ + *out32_Q14_s32x2 = vshl_n_s32( *out32_Q14_s32x2, 2 ); /* out32_Q14_{0,1} = silk_LSHIFT( silk_SMLAWB( S{0,1}, B_Q28[ 0 ], in{0,1} ), 2 ); */ + out32_Q14_s32x4 = vcombine_s32( *out32_Q14_s32x2, *out32_Q14_s32x2 ); /* out32_Q14_{0,1,0,1} */ + t_s32x4 = vqdmulhq_s32( out32_Q14_s32x4, A_L_s32x4 ); /* silk_SMULWB( out32_Q14_{0,1,0,1}, A{0,0,1,1}_L_Q28 ) */ + *S_s32x4 = vrsraq_n_s32( *S_s32x4, t_s32x4, 14 ); /* S{0,1} = S{2,3} + silk_RSHIFT_ROUND(); S{2,3} = silk_RSHIFT_ROUND(); */ + t_s32x4 = vqdmulhq_s32( out32_Q14_s32x4, A_U_s32x4 ); /* silk_SMULWB( out32_Q14_{0,1,0,1}, A{0,0,1,1}_U_Q28 ) */ + *S_s32x4 = vaddq_s32( *S_s32x4, t_s32x4 ); /* S0 = silk_SMLAWB( S{0,1,2,3}, out32_Q14_{0,1,0,1}, A{0,0,1,1}_U_Q28 ); */ + t_s32x4 = vqdmulhq_s32( in_s32x4, B_Q28_s32x4 ); /* silk_SMULWB( B_Q28[ {1,1,2,2} ], in{0,1,0,1} ) */ + *S_s32x4 = vaddq_s32( *S_s32x4, t_s32x4 ); /* S0 = silk_SMLAWB( S0, B_Q28[ {1,1,2,2} ], in{0,1,0,1} ); */ +} + +void silk_biquad_alt_stride2_neon( + const opus_int16 *in, /* I input signal */ + const opus_int32 *B_Q28, /* I MA coefficients [3] */ + const opus_int32 *A_Q28, /* I AR coefficients [2] */ + opus_int32 *S, /* I/O State vector [4] */ + opus_int16 *out, /* O output signal */ + const opus_int32 len /* I signal length (must be even) */ +) +{ + /* DIRECT FORM II TRANSPOSED (uses 2 element state vector) */ + opus_int k = 0; + const int32x2_t offset_s32x2 = vdup_n_s32( (1<<14) - 1 ); + const int32x4_t offset_s32x4 = vcombine_s32( offset_s32x2, offset_s32x2 ); + int16x4_t in_s16x4 = vdup_n_s16( 0 ); + int16x4_t out_s16x4; + int32x2_t A_Q28_s32x2, A_L_s32x2, A_U_s32x2, B_Q28_s32x2, t_s32x2; + int32x4_t A_L_s32x4, A_U_s32x4, B_Q28_s32x4, S_s32x4, out32_Q14_s32x4; + int32x2x2_t t0_s32x2x2, t1_s32x2x2, t2_s32x2x2, S_s32x2x2; + +#ifdef OPUS_CHECK_ASM + opus_int32 S_c[ 4 ]; + VARDECL( opus_int16, out_c ); + SAVE_STACK; + ALLOC( out_c, 2 * len, opus_int16 ); + + silk_memcpy( &S_c, S, sizeof( S_c ) ); + silk_biquad_alt_stride2_c( in, B_Q28, A_Q28, S_c, out_c, len ); +#endif + + /* Negate A_Q28 values and split in two parts */ + A_Q28_s32x2 = vld1_s32( A_Q28 ); + A_Q28_s32x2 = vneg_s32( A_Q28_s32x2 ); + A_L_s32x2 = vshl_n_s32( A_Q28_s32x2, 18 ); /* ( -A_Q28[] & 0x00003FFF ) << 18 */ + A_L_s32x2 = vreinterpret_s32_u32( vshr_n_u32( vreinterpret_u32_s32( A_L_s32x2 ), 3 ) ); /* ( -A_Q28[] & 0x00003FFF ) << 15 */ + A_U_s32x2 = vshr_n_s32( A_Q28_s32x2, 14 ); /* silk_RSHIFT( -A_Q28[], 14 ) */ + A_U_s32x2 = vshl_n_s32( A_U_s32x2, 16 ); /* silk_RSHIFT( -A_Q28[], 14 ) << 16 (Clip two leading bits to conform to C function.) */ + A_U_s32x2 = vshr_n_s32( A_U_s32x2, 1 ); /* silk_RSHIFT( -A_Q28[], 14 ) << 15 */ + + B_Q28_s32x2 = vld1_s32( B_Q28 ); + t_s32x2 = vld1_s32( B_Q28 + 1 ); + t0_s32x2x2 = vzip_s32( A_L_s32x2, A_L_s32x2 ); + t1_s32x2x2 = vzip_s32( A_U_s32x2, A_U_s32x2 ); + t2_s32x2x2 = vzip_s32( t_s32x2, t_s32x2 ); + A_L_s32x4 = vcombine_s32( t0_s32x2x2.val[ 0 ], t0_s32x2x2.val[ 1 ] ); /* A{0,0,1,1}_L_Q28 */ + A_U_s32x4 = vcombine_s32( t1_s32x2x2.val[ 0 ], t1_s32x2x2.val[ 1 ] ); /* A{0,0,1,1}_U_Q28 */ + B_Q28_s32x4 = vcombine_s32( t2_s32x2x2.val[ 0 ], t2_s32x2x2.val[ 1 ] ); /* B_Q28[ {1,1,2,2} ] */ + S_s32x4 = vld1q_s32( S ); /* S0 = S[ 0 ]; S3 = S[ 3 ]; */ + S_s32x2x2 = vtrn_s32( vget_low_s32( S_s32x4 ), vget_high_s32( S_s32x4 ) ); /* S2 = S[ 1 ]; S1 = S[ 2 ]; */ + S_s32x4 = vcombine_s32( S_s32x2x2.val[ 0 ], S_s32x2x2.val[ 1 ] ); + + for( ; k < len - 1; k += 2 ) { + int32x4_t in_s32x4[ 2 ], t_s32x4; + int32x2_t out32_Q14_s32x2[ 2 ]; + + /* S[ 2 * i + 0 ], S[ 2 * i + 1 ], S[ 2 * i + 2 ], S[ 2 * i + 3 ]: Q12 */ + in_s16x4 = vld1_s16( &in[ 2 * k ] ); /* in{0,1,2,3} = in[ 2 * k + {0,1,2,3} ]; */ + in_s32x4[ 0 ] = vshll_n_s16( in_s16x4, 15 ); /* in{0,1,2,3} << 15 */ + t_s32x4 = vqdmulhq_lane_s32( in_s32x4[ 0 ], B_Q28_s32x2, 0 ); /* silk_SMULWB( B_Q28[ 0 ], in{0,1,2,3} ) */ + in_s32x4[ 1 ] = vcombine_s32( vget_high_s32( in_s32x4[ 0 ] ), vget_high_s32( in_s32x4[ 0 ] ) ); /* in{2,3,2,3} << 15 */ + in_s32x4[ 0 ] = vcombine_s32( vget_low_s32 ( in_s32x4[ 0 ] ), vget_low_s32 ( in_s32x4[ 0 ] ) ); /* in{0,1,0,1} << 15 */ + silk_biquad_alt_stride2_kernel( A_L_s32x4, A_U_s32x4, B_Q28_s32x4, vget_low_s32 ( t_s32x4 ), in_s32x4[ 0 ], &S_s32x4, &out32_Q14_s32x2[ 0 ] ); + silk_biquad_alt_stride2_kernel( A_L_s32x4, A_U_s32x4, B_Q28_s32x4, vget_high_s32( t_s32x4 ), in_s32x4[ 1 ], &S_s32x4, &out32_Q14_s32x2[ 1 ] ); + + /* Scale back to Q0 and saturate */ + out32_Q14_s32x4 = vcombine_s32( out32_Q14_s32x2[ 0 ], out32_Q14_s32x2[ 1 ] ); /* out32_Q14_{0,1,2,3} */ + out32_Q14_s32x4 = vaddq_s32( out32_Q14_s32x4, offset_s32x4 ); /* out32_Q14_{0,1,2,3} + (1<<14) - 1 */ + out_s16x4 = vqshrn_n_s32( out32_Q14_s32x4, 14 ); /* (opus_int16)silk_SAT16( silk_RSHIFT( out32_Q14_{0,1,2,3} + (1<<14) - 1, 14 ) ) */ + vst1_s16( &out[ 2 * k ], out_s16x4 ); /* out[ 2 * k + {0,1,2,3} ] = (opus_int16)silk_SAT16( silk_RSHIFT( out32_Q14_{0,1,2,3} + (1<<14) - 1, 14 ) ); */ + } + + /* Process leftover. */ + if( k < len ) { + int32x4_t in_s32x4; + int32x2_t out32_Q14_s32x2; + + /* S[ 2 * i + 0 ], S[ 2 * i + 1 ]: Q12 */ + in_s16x4 = vld1_lane_s16( &in[ 2 * k + 0 ], in_s16x4, 0 ); /* in{0,1} = in[ 2 * k + {0,1} ]; */ + in_s16x4 = vld1_lane_s16( &in[ 2 * k + 1 ], in_s16x4, 1 ); /* in{0,1} = in[ 2 * k + {0,1} ]; */ + in_s32x4 = vshll_n_s16( in_s16x4, 15 ); /* in{0,1} << 15 */ + t_s32x2 = vqdmulh_lane_s32( vget_low_s32( in_s32x4 ), B_Q28_s32x2, 0 ); /* silk_SMULWB( B_Q28[ 0 ], in{0,1} ) */ + in_s32x4 = vcombine_s32( vget_low_s32( in_s32x4 ), vget_low_s32( in_s32x4 ) ); /* in{0,1,0,1} << 15 */ + silk_biquad_alt_stride2_kernel( A_L_s32x4, A_U_s32x4, B_Q28_s32x4, t_s32x2, in_s32x4, &S_s32x4, &out32_Q14_s32x2 ); + + /* Scale back to Q0 and saturate */ + out32_Q14_s32x2 = vadd_s32( out32_Q14_s32x2, offset_s32x2 ); /* out32_Q14_{0,1} + (1<<14) - 1 */ + out32_Q14_s32x4 = vcombine_s32( out32_Q14_s32x2, out32_Q14_s32x2 ); /* out32_Q14_{0,1,0,1} + (1<<14) - 1 */ + out_s16x4 = vqshrn_n_s32( out32_Q14_s32x4, 14 ); /* (opus_int16)silk_SAT16( silk_RSHIFT( out32_Q14_{0,1,0,1} + (1<<14) - 1, 14 ) ) */ + vst1_lane_s16( &out[ 2 * k + 0 ], out_s16x4, 0 ); /* out[ 2 * k + 0 ] = (opus_int16)silk_SAT16( silk_RSHIFT( out32_Q14_0 + (1<<14) - 1, 14 ) ); */ + vst1_lane_s16( &out[ 2 * k + 1 ], out_s16x4, 1 ); /* out[ 2 * k + 1 ] = (opus_int16)silk_SAT16( silk_RSHIFT( out32_Q14_1 + (1<<14) - 1, 14 ) ); */ + } + + vst1q_lane_s32( &S[ 0 ], S_s32x4, 0 ); /* S[ 0 ] = S0; */ + vst1q_lane_s32( &S[ 1 ], S_s32x4, 2 ); /* S[ 1 ] = S2; */ + vst1q_lane_s32( &S[ 2 ], S_s32x4, 1 ); /* S[ 2 ] = S1; */ + vst1q_lane_s32( &S[ 3 ], S_s32x4, 3 ); /* S[ 3 ] = S3; */ + +#ifdef OPUS_CHECK_ASM + silk_assert( !memcmp( S_c, S, sizeof( S_c ) ) ); + silk_assert( !memcmp( out_c, out, 2 * len * sizeof( opus_int16 ) ) ); + RESTORE_STACK; +#endif +} diff --git a/thirdparty/opus/silk/arm/macros_armv4.h b/thirdparty/opus/silk/arm/macros_armv4.h index 3f30e97288..877eb18dd5 100644 --- a/thirdparty/opus/silk/arm/macros_armv4.h +++ b/thirdparty/opus/silk/arm/macros_armv4.h @@ -28,6 +28,11 @@ POSSIBILITY OF SUCH DAMAGE. #ifndef SILK_MACROS_ARMv4_H #define SILK_MACROS_ARMv4_H +/* This macro only avoids the undefined behaviour from a left shift of + a negative value. It should only be used in macros that can't include + SigProc_FIX.h. In other cases, use silk_LSHIFT32(). */ +#define SAFE_SHL(a,b) ((opus_int32)((opus_uint32)(a) << (b))) + /* (a32 * (opus_int32)((opus_int16)(b32))) >> 16 output have to be 32bit int */ #undef silk_SMULWB static OPUS_INLINE opus_int32 silk_SMULWB_armv4(opus_int32 a, opus_int16 b) @@ -38,7 +43,7 @@ static OPUS_INLINE opus_int32 silk_SMULWB_armv4(opus_int32 a, opus_int16 b) "#silk_SMULWB\n\t" "smull %0, %1, %2, %3\n\t" : "=&r"(rd_lo), "=&r"(rd_hi) - : "%r"(a), "r"(b<<16) + : "%r"(a), "r"(SAFE_SHL(b,16)) ); return rd_hi; } @@ -80,7 +85,7 @@ static OPUS_INLINE opus_int32 silk_SMULWW_armv4(opus_int32 a, opus_int32 b) : "=&r"(rd_lo), "=&r"(rd_hi) : "%r"(a), "r"(b) ); - return (rd_hi<<16)+(rd_lo>>16); + return SAFE_SHL(rd_hi,16)+(rd_lo>>16); } #define silk_SMULWW(a, b) (silk_SMULWW_armv4(a, b)) @@ -96,8 +101,10 @@ static OPUS_INLINE opus_int32 silk_SMLAWW_armv4(opus_int32 a, opus_int32 b, : "=&r"(rd_lo), "=&r"(rd_hi) : "%r"(b), "r"(c) ); - return a+(rd_hi<<16)+(rd_lo>>16); + return a+SAFE_SHL(rd_hi,16)+(rd_lo>>16); } #define silk_SMLAWW(a, b, c) (silk_SMLAWW_armv4(a, b, c)) +#undef SAFE_SHL + #endif /* SILK_MACROS_ARMv4_H */ diff --git a/thirdparty/opus/silk/arm/macros_armv5e.h b/thirdparty/opus/silk/arm/macros_armv5e.h index aad4117e46..b14ec65ddb 100644 --- a/thirdparty/opus/silk/arm/macros_armv5e.h +++ b/thirdparty/opus/silk/arm/macros_armv5e.h @@ -29,6 +29,11 @@ POSSIBILITY OF SUCH DAMAGE. #ifndef SILK_MACROS_ARMv5E_H #define SILK_MACROS_ARMv5E_H +/* This macro only avoids the undefined behaviour from a left shift of + a negative value. It should only be used in macros that can't include + SigProc_FIX.h. In other cases, use silk_LSHIFT32(). */ +#define SAFE_SHL(a,b) ((opus_int32)((opus_uint32)(a) << (b))) + /* (a32 * (opus_int32)((opus_int16)(b32))) >> 16 output have to be 32bit int */ #undef silk_SMULWB static OPUS_INLINE opus_int32 silk_SMULWB_armv5e(opus_int32 a, opus_int16 b) @@ -190,7 +195,7 @@ static OPUS_INLINE opus_int32 silk_CLZ16_armv5(opus_int16 in16) "#silk_CLZ16\n\t" "clz %0, %1;\n" : "=r"(res) - : "r"(in16<<16|0x8000) + : "r"(SAFE_SHL(in16,16)|0x8000) ); return res; } @@ -210,4 +215,6 @@ static OPUS_INLINE opus_int32 silk_CLZ32_armv5(opus_int32 in32) } #define silk_CLZ32(in32) (silk_CLZ32_armv5(in32)) +#undef SAFE_SHL + #endif /* SILK_MACROS_ARMv5E_H */ diff --git a/thirdparty/opus/silk/biquad_alt.c b/thirdparty/opus/silk/biquad_alt.c index d55f5ee92e..54566a43c0 100644 --- a/thirdparty/opus/silk/biquad_alt.c +++ b/thirdparty/opus/silk/biquad_alt.c @@ -39,14 +39,13 @@ POSSIBILITY OF SUCH DAMAGE. #include "SigProc_FIX.h" /* Second order ARMA filter, alternative implementation */ -void silk_biquad_alt( +void silk_biquad_alt_stride1( const opus_int16 *in, /* I input signal */ const opus_int32 *B_Q28, /* I MA coefficients [3] */ const opus_int32 *A_Q28, /* I AR coefficients [2] */ opus_int32 *S, /* I/O State vector [2] */ opus_int16 *out, /* O output signal */ - const opus_int32 len, /* I signal length (must be even) */ - opus_int stride /* I Operate on interleaved signal if > 1 */ + const opus_int32 len /* I signal length (must be even) */ ) { /* DIRECT FORM II TRANSPOSED (uses 2 element state vector) */ @@ -61,7 +60,7 @@ void silk_biquad_alt( for( k = 0; k < len; k++ ) { /* S[ 0 ], S[ 1 ]: Q12 */ - inval = in[ k * stride ]; + inval = in[ k ]; out32_Q14 = silk_LSHIFT( silk_SMLAWB( S[ 0 ], B_Q28[ 0 ], inval ), 2 ); S[ 0 ] = S[1] + silk_RSHIFT_ROUND( silk_SMULWB( out32_Q14, A0_L_Q28 ), 14 ); @@ -73,6 +72,50 @@ void silk_biquad_alt( S[ 1 ] = silk_SMLAWB( S[ 1 ], B_Q28[ 2 ], inval ); /* Scale back to Q0 and saturate */ - out[ k * stride ] = (opus_int16)silk_SAT16( silk_RSHIFT( out32_Q14 + (1<<14) - 1, 14 ) ); + out[ k ] = (opus_int16)silk_SAT16( silk_RSHIFT( out32_Q14 + (1<<14) - 1, 14 ) ); + } +} + +void silk_biquad_alt_stride2_c( + const opus_int16 *in, /* I input signal */ + const opus_int32 *B_Q28, /* I MA coefficients [3] */ + const opus_int32 *A_Q28, /* I AR coefficients [2] */ + opus_int32 *S, /* I/O State vector [4] */ + opus_int16 *out, /* O output signal */ + const opus_int32 len /* I signal length (must be even) */ +) +{ + /* DIRECT FORM II TRANSPOSED (uses 2 element state vector) */ + opus_int k; + opus_int32 A0_U_Q28, A0_L_Q28, A1_U_Q28, A1_L_Q28, out32_Q14[ 2 ]; + + /* Negate A_Q28 values and split in two parts */ + A0_L_Q28 = ( -A_Q28[ 0 ] ) & 0x00003FFF; /* lower part */ + A0_U_Q28 = silk_RSHIFT( -A_Q28[ 0 ], 14 ); /* upper part */ + A1_L_Q28 = ( -A_Q28[ 1 ] ) & 0x00003FFF; /* lower part */ + A1_U_Q28 = silk_RSHIFT( -A_Q28[ 1 ], 14 ); /* upper part */ + + for( k = 0; k < len; k++ ) { + /* S[ 0 ], S[ 1 ], S[ 2 ], S[ 3 ]: Q12 */ + out32_Q14[ 0 ] = silk_LSHIFT( silk_SMLAWB( S[ 0 ], B_Q28[ 0 ], in[ 2 * k + 0 ] ), 2 ); + out32_Q14[ 1 ] = silk_LSHIFT( silk_SMLAWB( S[ 2 ], B_Q28[ 0 ], in[ 2 * k + 1 ] ), 2 ); + + S[ 0 ] = S[ 1 ] + silk_RSHIFT_ROUND( silk_SMULWB( out32_Q14[ 0 ], A0_L_Q28 ), 14 ); + S[ 2 ] = S[ 3 ] + silk_RSHIFT_ROUND( silk_SMULWB( out32_Q14[ 1 ], A0_L_Q28 ), 14 ); + S[ 0 ] = silk_SMLAWB( S[ 0 ], out32_Q14[ 0 ], A0_U_Q28 ); + S[ 2 ] = silk_SMLAWB( S[ 2 ], out32_Q14[ 1 ], A0_U_Q28 ); + S[ 0 ] = silk_SMLAWB( S[ 0 ], B_Q28[ 1 ], in[ 2 * k + 0 ] ); + S[ 2 ] = silk_SMLAWB( S[ 2 ], B_Q28[ 1 ], in[ 2 * k + 1 ] ); + + S[ 1 ] = silk_RSHIFT_ROUND( silk_SMULWB( out32_Q14[ 0 ], A1_L_Q28 ), 14 ); + S[ 3 ] = silk_RSHIFT_ROUND( silk_SMULWB( out32_Q14[ 1 ], A1_L_Q28 ), 14 ); + S[ 1 ] = silk_SMLAWB( S[ 1 ], out32_Q14[ 0 ], A1_U_Q28 ); + S[ 3 ] = silk_SMLAWB( S[ 3 ], out32_Q14[ 1 ], A1_U_Q28 ); + S[ 1 ] = silk_SMLAWB( S[ 1 ], B_Q28[ 2 ], in[ 2 * k + 0 ] ); + S[ 3 ] = silk_SMLAWB( S[ 3 ], B_Q28[ 2 ], in[ 2 * k + 1 ] ); + + /* Scale back to Q0 and saturate */ + out[ 2 * k + 0 ] = (opus_int16)silk_SAT16( silk_RSHIFT( out32_Q14[ 0 ] + (1<<14) - 1, 14 ) ); + out[ 2 * k + 1 ] = (opus_int16)silk_SAT16( silk_RSHIFT( out32_Q14[ 1 ] + (1<<14) - 1, 14 ) ); } } diff --git a/thirdparty/opus/silk/bwexpander.c b/thirdparty/opus/silk/bwexpander.c index 2eb4456695..afa97907ec 100644 --- a/thirdparty/opus/silk/bwexpander.c +++ b/thirdparty/opus/silk/bwexpander.c @@ -45,7 +45,7 @@ void silk_bwexpander( /* Bias in silk_SMULWB can lead to unstable filters */ for( i = 0; i < d - 1; i++ ) { ar[ i ] = (opus_int16)silk_RSHIFT_ROUND( silk_MUL( chirp_Q16, ar[ i ] ), 16 ); - chirp_Q16 += silk_RSHIFT_ROUND( silk_MUL( chirp_Q16, chirp_minus_one_Q16 ), 16 ); + chirp_Q16 += silk_RSHIFT_ROUND( silk_MUL( chirp_Q16, chirp_minus_one_Q16 ), 16 ); } ar[ d - 1 ] = (opus_int16)silk_RSHIFT_ROUND( silk_MUL( chirp_Q16, ar[ d - 1 ] ), 16 ); } diff --git a/thirdparty/opus/silk/check_control_input.c b/thirdparty/opus/silk/check_control_input.c index b5de9ce48d..739fb01f1e 100644 --- a/thirdparty/opus/silk/check_control_input.c +++ b/thirdparty/opus/silk/check_control_input.c @@ -38,7 +38,7 @@ opus_int check_control_input( silk_EncControlStruct *encControl /* I Control structure */ ) { - silk_assert( encControl != NULL ); + celt_assert( encControl != NULL ); if( ( ( encControl->API_sampleRate != 8000 ) && ( encControl->API_sampleRate != 12000 ) && @@ -59,46 +59,46 @@ opus_int check_control_input( ( encControl->minInternalSampleRate > encControl->desiredInternalSampleRate ) || ( encControl->maxInternalSampleRate < encControl->desiredInternalSampleRate ) || ( encControl->minInternalSampleRate > encControl->maxInternalSampleRate ) ) { - silk_assert( 0 ); + celt_assert( 0 ); return SILK_ENC_FS_NOT_SUPPORTED; } if( encControl->payloadSize_ms != 10 && encControl->payloadSize_ms != 20 && encControl->payloadSize_ms != 40 && encControl->payloadSize_ms != 60 ) { - silk_assert( 0 ); + celt_assert( 0 ); return SILK_ENC_PACKET_SIZE_NOT_SUPPORTED; } if( encControl->packetLossPercentage < 0 || encControl->packetLossPercentage > 100 ) { - silk_assert( 0 ); + celt_assert( 0 ); return SILK_ENC_INVALID_LOSS_RATE; } if( encControl->useDTX < 0 || encControl->useDTX > 1 ) { - silk_assert( 0 ); + celt_assert( 0 ); return SILK_ENC_INVALID_DTX_SETTING; } if( encControl->useCBR < 0 || encControl->useCBR > 1 ) { - silk_assert( 0 ); + celt_assert( 0 ); return SILK_ENC_INVALID_CBR_SETTING; } if( encControl->useInBandFEC < 0 || encControl->useInBandFEC > 1 ) { - silk_assert( 0 ); + celt_assert( 0 ); return SILK_ENC_INVALID_INBAND_FEC_SETTING; } if( encControl->nChannelsAPI < 1 || encControl->nChannelsAPI > ENCODER_NUM_CHANNELS ) { - silk_assert( 0 ); + celt_assert( 0 ); return SILK_ENC_INVALID_NUMBER_OF_CHANNELS_ERROR; } if( encControl->nChannelsInternal < 1 || encControl->nChannelsInternal > ENCODER_NUM_CHANNELS ) { - silk_assert( 0 ); + celt_assert( 0 ); return SILK_ENC_INVALID_NUMBER_OF_CHANNELS_ERROR; } if( encControl->nChannelsInternal > encControl->nChannelsAPI ) { - silk_assert( 0 ); + celt_assert( 0 ); return SILK_ENC_INVALID_NUMBER_OF_CHANNELS_ERROR; } if( encControl->complexity < 0 || encControl->complexity > 10 ) { - silk_assert( 0 ); + celt_assert( 0 ); return SILK_ENC_INVALID_COMPLEXITY_SETTING; } diff --git a/thirdparty/opus/silk/control.h b/thirdparty/opus/silk/control.h index 747e5426a0..b76ec33cd6 100644 --- a/thirdparty/opus/silk/control.h +++ b/thirdparty/opus/silk/control.h @@ -77,6 +77,9 @@ typedef struct { /* I: Flag to enable in-band Forward Error Correction (FEC); 0/1 */ opus_int useInBandFEC; + /* I: Flag to actually code in-band Forward Error Correction (FEC) in the current packet; 0/1 */ + opus_int LBRR_coded; + /* I: Flag to enable discontinuous transmission (DTX); 0/1 */ opus_int useDTX; @@ -110,6 +113,11 @@ typedef struct { /* O: Tells the Opus encoder we're ready to switch */ opus_int switchReady; + /* O: SILK Signal type */ + opus_int signalType; + + /* O: SILK offset (dithering) */ + opus_int offset; } silk_EncControlStruct; /**************************************************************************/ diff --git a/thirdparty/opus/silk/control_SNR.c b/thirdparty/opus/silk/control_SNR.c index cee87eb0d8..9a6db27543 100644 --- a/thirdparty/opus/silk/control_SNR.c +++ b/thirdparty/opus/silk/control_SNR.c @@ -32,45 +32,82 @@ POSSIBILITY OF SUCH DAMAGE. #include "main.h" #include "tuning_parameters.h" +/* These tables hold SNR values divided by 21 (so they fit in 8 bits) + for different target bitrates spaced at 400 bps interval. The first + 10 values are omitted (0-4 kb/s) because they're all zeros. + These tables were obtained by running different SNRs through the + encoder and measuring the active bitrate. */ +static const unsigned char silk_TargetRate_NB_21[117 - 10] = { + 0, 15, 39, 52, 61, 68, + 74, 79, 84, 88, 92, 95, 99,102,105,108,111,114,117,119,122,124, + 126,129,131,133,135,137,139,142,143,145,147,149,151,153,155,157, + 158,160,162,163,165,167,168,170,171,173,174,176,177,179,180,182, + 183,185,186,187,189,190,192,193,194,196,197,199,200,201,203,204, + 205,207,208,209,211,212,213,215,216,217,219,220,221,223,224,225, + 227,228,230,231,232,234,235,236,238,239,241,242,243,245,246,248, + 249,250,252,253,255 +}; + +static const unsigned char silk_TargetRate_MB_21[165 - 10] = { + 0, 0, 28, 43, 52, 59, + 65, 70, 74, 78, 81, 85, 87, 90, 93, 95, 98,100,102,105,107,109, + 111,113,115,116,118,120,122,123,125,127,128,130,131,133,134,136, + 137,138,140,141,143,144,145,147,148,149,151,152,153,154,156,157, + 158,159,160,162,163,164,165,166,167,168,169,171,172,173,174,175, + 176,177,178,179,180,181,182,183,184,185,186,187,188,188,189,190, + 191,192,193,194,195,196,197,198,199,200,201,202,203,203,204,205, + 206,207,208,209,210,211,212,213,214,214,215,216,217,218,219,220, + 221,222,223,224,224,225,226,227,228,229,230,231,232,233,234,235, + 236,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250, + 251,252,253,254,255 +}; + +static const unsigned char silk_TargetRate_WB_21[201 - 10] = { + 0, 0, 0, 8, 29, 41, + 49, 56, 62, 66, 70, 74, 77, 80, 83, 86, 88, 91, 93, 95, 97, 99, + 101,103,105,107,108,110,112,113,115,116,118,119,121,122,123,125, + 126,127,129,130,131,132,134,135,136,137,138,140,141,142,143,144, + 145,146,147,148,149,150,151,152,153,154,156,157,158,159,159,160, + 161,162,163,164,165,166,167,168,169,170,171,171,172,173,174,175, + 176,177,177,178,179,180,181,181,182,183,184,185,185,186,187,188, + 189,189,190,191,192,192,193,194,195,195,196,197,198,198,199,200, + 200,201,202,203,203,204,205,206,206,207,208,209,209,210,211,211, + 212,213,214,214,215,216,216,217,218,219,219,220,221,221,222,223, + 224,224,225,226,226,227,228,229,229,230,231,232,232,233,234,234, + 235,236,237,237,238,239,240,240,241,242,243,243,244,245,246,246, + 247,248,249,249,250,251,252,253,255 +}; + /* Control SNR of redidual quantizer */ opus_int silk_control_SNR( silk_encoder_state *psEncC, /* I/O Pointer to Silk encoder state */ opus_int32 TargetRate_bps /* I Target max bitrate (bps) */ ) { - opus_int k, ret = SILK_NO_ERROR; - opus_int32 frac_Q6; - const opus_int32 *rateTable; - - /* Set bitrate/coding quality */ - TargetRate_bps = silk_LIMIT( TargetRate_bps, MIN_TARGET_RATE_BPS, MAX_TARGET_RATE_BPS ); - if( TargetRate_bps != psEncC->TargetRate_bps ) { - psEncC->TargetRate_bps = TargetRate_bps; - - /* If new TargetRate_bps, translate to SNR_dB value */ - if( psEncC->fs_kHz == 8 ) { - rateTable = silk_TargetRate_table_NB; - } else if( psEncC->fs_kHz == 12 ) { - rateTable = silk_TargetRate_table_MB; - } else { - rateTable = silk_TargetRate_table_WB; - } + int id; + int bound; + const unsigned char *snr_table; - /* Reduce bitrate for 10 ms modes in these calculations */ - if( psEncC->nb_subfr == 2 ) { - TargetRate_bps -= REDUCE_BITRATE_10_MS_BPS; - } - - /* Find bitrate interval in table and interpolate */ - for( k = 1; k < TARGET_RATE_TAB_SZ; k++ ) { - if( TargetRate_bps <= rateTable[ k ] ) { - frac_Q6 = silk_DIV32( silk_LSHIFT( TargetRate_bps - rateTable[ k - 1 ], 6 ), - rateTable[ k ] - rateTable[ k - 1 ] ); - psEncC->SNR_dB_Q7 = silk_LSHIFT( silk_SNR_table_Q1[ k - 1 ], 6 ) + silk_MUL( frac_Q6, silk_SNR_table_Q1[ k ] - silk_SNR_table_Q1[ k - 1 ] ); - break; - } - } + psEncC->TargetRate_bps = TargetRate_bps; + if( psEncC->nb_subfr == 2 ) { + TargetRate_bps -= 2000 + psEncC->fs_kHz/16; } - - return ret; + if( psEncC->fs_kHz == 8 ) { + bound = sizeof(silk_TargetRate_NB_21); + snr_table = silk_TargetRate_NB_21; + } else if( psEncC->fs_kHz == 12 ) { + bound = sizeof(silk_TargetRate_MB_21); + snr_table = silk_TargetRate_MB_21; + } else { + bound = sizeof(silk_TargetRate_WB_21); + snr_table = silk_TargetRate_WB_21; + } + id = (TargetRate_bps+200)/400; + id = silk_min(id - 10, bound-1); + if( id <= 0 ) { + psEncC->SNR_dB_Q7 = 0; + } else { + psEncC->SNR_dB_Q7 = snr_table[id]*21; + } + return SILK_NO_ERROR; } diff --git a/thirdparty/opus/silk/control_audio_bandwidth.c b/thirdparty/opus/silk/control_audio_bandwidth.c index 4f9bc5cbda..f6d22d8395 100644 --- a/thirdparty/opus/silk/control_audio_bandwidth.c +++ b/thirdparty/opus/silk/control_audio_bandwidth.c @@ -39,9 +39,15 @@ opus_int silk_control_audio_bandwidth( ) { opus_int fs_kHz; + opus_int orig_kHz; opus_int32 fs_Hz; - fs_kHz = psEncC->fs_kHz; + orig_kHz = psEncC->fs_kHz; + /* Handle a bandwidth-switching reset where we need to be aware what the last sampling rate was. */ + if( orig_kHz == 0 ) { + orig_kHz = psEncC->sLP.saved_fs_kHz; + } + fs_kHz = orig_kHz; fs_Hz = silk_SMULBB( fs_kHz, 1000 ); if( fs_Hz == 0 ) { /* Encoder has just been initialized */ @@ -61,7 +67,7 @@ opus_int silk_control_audio_bandwidth( } if( psEncC->allow_bandwidth_switch || encControl->opusCanSwitch ) { /* Check if we should switch down */ - if( silk_SMULBB( psEncC->fs_kHz, 1000 ) > psEncC->desiredInternal_fs_Hz ) + if( silk_SMULBB( orig_kHz, 1000 ) > psEncC->desiredInternal_fs_Hz ) { /* Switch down */ if( psEncC->sLP.mode == 0 ) { @@ -76,7 +82,7 @@ opus_int silk_control_audio_bandwidth( psEncC->sLP.mode = 0; /* Switch to a lower sample frequency */ - fs_kHz = psEncC->fs_kHz == 16 ? 12 : 8; + fs_kHz = orig_kHz == 16 ? 12 : 8; } else { if( psEncC->sLP.transition_frame_no <= 0 ) { encControl->switchReady = 1; @@ -90,12 +96,12 @@ opus_int silk_control_audio_bandwidth( } else /* Check if we should switch up */ - if( silk_SMULBB( psEncC->fs_kHz, 1000 ) < psEncC->desiredInternal_fs_Hz ) + if( silk_SMULBB( orig_kHz, 1000 ) < psEncC->desiredInternal_fs_Hz ) { /* Switch up */ if( encControl->opusCanSwitch ) { /* Switch to a higher sample frequency */ - fs_kHz = psEncC->fs_kHz == 8 ? 12 : 16; + fs_kHz = orig_kHz == 8 ? 12 : 16; /* New transition */ psEncC->sLP.transition_frame_no = 0; diff --git a/thirdparty/opus/silk/control_codec.c b/thirdparty/opus/silk/control_codec.c index 044eea3f2a..52aa8fded3 100644 --- a/thirdparty/opus/silk/control_codec.c +++ b/thirdparty/opus/silk/control_codec.c @@ -57,7 +57,7 @@ static opus_int silk_setup_complexity( static OPUS_INLINE opus_int silk_setup_LBRR( silk_encoder_state *psEncC, /* I/O */ - const opus_int32 TargetRate_bps /* I */ + const silk_EncControlStruct *encControl /* I */ ); @@ -65,7 +65,6 @@ static OPUS_INLINE opus_int silk_setup_LBRR( opus_int silk_control_encoder( silk_encoder_state_Fxx *psEnc, /* I/O Pointer to Silk encoder state */ silk_EncControlStruct *encControl, /* I Control structure */ - const opus_int32 TargetRate_bps, /* I Target max bitrate (bps) */ const opus_int allow_bw_switch, /* I Flag to allow switching audio bandwidth */ const opus_int channelNb, /* I Channel number */ const opus_int force_fs_kHz @@ -125,7 +124,7 @@ opus_int silk_control_encoder( /********************************************/ /* Set LBRR usage */ /********************************************/ - ret += silk_setup_LBRR( &psEnc->sCmn, TargetRate_bps ); + ret += silk_setup_LBRR( &psEnc->sCmn, encControl ); psEnc->sCmn.controlled_since_last_payload = 1; @@ -239,12 +238,11 @@ static opus_int silk_setup_fs( } /* Set internal sampling frequency */ - silk_assert( fs_kHz == 8 || fs_kHz == 12 || fs_kHz == 16 ); - silk_assert( psEnc->sCmn.nb_subfr == 2 || psEnc->sCmn.nb_subfr == 4 ); + celt_assert( fs_kHz == 8 || fs_kHz == 12 || fs_kHz == 16 ); + celt_assert( psEnc->sCmn.nb_subfr == 2 || psEnc->sCmn.nb_subfr == 4 ); if( psEnc->sCmn.fs_kHz != fs_kHz ) { /* reset part of the state */ silk_memset( &psEnc->sShape, 0, sizeof( psEnc->sShape ) ); - silk_memset( &psEnc->sPrefilt, 0, sizeof( psEnc->sPrefilt ) ); silk_memset( &psEnc->sCmn.sNSQ, 0, sizeof( psEnc->sCmn.sNSQ ) ); silk_memset( psEnc->sCmn.prev_NLSFq_Q15, 0, sizeof( psEnc->sCmn.prev_NLSFq_Q15 ) ); silk_memset( &psEnc->sCmn.sLP.In_LP_State, 0, sizeof( psEnc->sCmn.sLP.In_LP_State ) ); @@ -255,7 +253,6 @@ static opus_int silk_setup_fs( /* Initialize non-zero parameters */ psEnc->sCmn.prevLag = 100; psEnc->sCmn.first_frame_after_reset = 1; - psEnc->sPrefilt.lagPrev = 100; psEnc->sShape.LastGainIndex = 10; psEnc->sCmn.sNSQ.lagPrev = 100; psEnc->sCmn.sNSQ.prev_gain_Q16 = 65536; @@ -293,19 +290,16 @@ static opus_int silk_setup_fs( psEnc->sCmn.pitch_LPC_win_length = silk_SMULBB( FIND_PITCH_LPC_WIN_MS_2_SF, fs_kHz ); } if( psEnc->sCmn.fs_kHz == 16 ) { - psEnc->sCmn.mu_LTP_Q9 = SILK_FIX_CONST( MU_LTP_QUANT_WB, 9 ); psEnc->sCmn.pitch_lag_low_bits_iCDF = silk_uniform8_iCDF; } else if( psEnc->sCmn.fs_kHz == 12 ) { - psEnc->sCmn.mu_LTP_Q9 = SILK_FIX_CONST( MU_LTP_QUANT_MB, 9 ); psEnc->sCmn.pitch_lag_low_bits_iCDF = silk_uniform6_iCDF; } else { - psEnc->sCmn.mu_LTP_Q9 = SILK_FIX_CONST( MU_LTP_QUANT_NB, 9 ); psEnc->sCmn.pitch_lag_low_bits_iCDF = silk_uniform4_iCDF; } } /* Check that settings are valid */ - silk_assert( ( psEnc->sCmn.subfr_length * psEnc->sCmn.nb_subfr ) == psEnc->sCmn.frame_length ); + celt_assert( ( psEnc->sCmn.subfr_length * psEnc->sCmn.nb_subfr ) == psEnc->sCmn.frame_length ); return ret; } @@ -318,61 +312,76 @@ static opus_int silk_setup_complexity( opus_int ret = 0; /* Set encoding complexity */ - silk_assert( Complexity >= 0 && Complexity <= 10 ); - if( Complexity < 2 ) { + celt_assert( Complexity >= 0 && Complexity <= 10 ); + if( Complexity < 1 ) { psEncC->pitchEstimationComplexity = SILK_PE_MIN_COMPLEX; psEncC->pitchEstimationThreshold_Q16 = SILK_FIX_CONST( 0.8, 16 ); psEncC->pitchEstimationLPCOrder = 6; - psEncC->shapingLPCOrder = 8; + psEncC->shapingLPCOrder = 12; psEncC->la_shape = 3 * psEncC->fs_kHz; psEncC->nStatesDelayedDecision = 1; psEncC->useInterpolatedNLSFs = 0; - psEncC->LTPQuantLowComplexity = 1; psEncC->NLSF_MSVQ_Survivors = 2; psEncC->warping_Q16 = 0; - } else if( Complexity < 4 ) { + } else if( Complexity < 2 ) { psEncC->pitchEstimationComplexity = SILK_PE_MID_COMPLEX; psEncC->pitchEstimationThreshold_Q16 = SILK_FIX_CONST( 0.76, 16 ); psEncC->pitchEstimationLPCOrder = 8; - psEncC->shapingLPCOrder = 10; + psEncC->shapingLPCOrder = 14; psEncC->la_shape = 5 * psEncC->fs_kHz; psEncC->nStatesDelayedDecision = 1; psEncC->useInterpolatedNLSFs = 0; - psEncC->LTPQuantLowComplexity = 0; + psEncC->NLSF_MSVQ_Survivors = 3; + psEncC->warping_Q16 = 0; + } else if( Complexity < 3 ) { + psEncC->pitchEstimationComplexity = SILK_PE_MIN_COMPLEX; + psEncC->pitchEstimationThreshold_Q16 = SILK_FIX_CONST( 0.8, 16 ); + psEncC->pitchEstimationLPCOrder = 6; + psEncC->shapingLPCOrder = 12; + psEncC->la_shape = 3 * psEncC->fs_kHz; + psEncC->nStatesDelayedDecision = 2; + psEncC->useInterpolatedNLSFs = 0; + psEncC->NLSF_MSVQ_Survivors = 2; + psEncC->warping_Q16 = 0; + } else if( Complexity < 4 ) { + psEncC->pitchEstimationComplexity = SILK_PE_MID_COMPLEX; + psEncC->pitchEstimationThreshold_Q16 = SILK_FIX_CONST( 0.76, 16 ); + psEncC->pitchEstimationLPCOrder = 8; + psEncC->shapingLPCOrder = 14; + psEncC->la_shape = 5 * psEncC->fs_kHz; + psEncC->nStatesDelayedDecision = 2; + psEncC->useInterpolatedNLSFs = 0; psEncC->NLSF_MSVQ_Survivors = 4; psEncC->warping_Q16 = 0; } else if( Complexity < 6 ) { psEncC->pitchEstimationComplexity = SILK_PE_MID_COMPLEX; psEncC->pitchEstimationThreshold_Q16 = SILK_FIX_CONST( 0.74, 16 ); psEncC->pitchEstimationLPCOrder = 10; - psEncC->shapingLPCOrder = 12; + psEncC->shapingLPCOrder = 16; psEncC->la_shape = 5 * psEncC->fs_kHz; psEncC->nStatesDelayedDecision = 2; psEncC->useInterpolatedNLSFs = 1; - psEncC->LTPQuantLowComplexity = 0; - psEncC->NLSF_MSVQ_Survivors = 8; + psEncC->NLSF_MSVQ_Survivors = 6; psEncC->warping_Q16 = psEncC->fs_kHz * SILK_FIX_CONST( WARPING_MULTIPLIER, 16 ); } else if( Complexity < 8 ) { psEncC->pitchEstimationComplexity = SILK_PE_MID_COMPLEX; psEncC->pitchEstimationThreshold_Q16 = SILK_FIX_CONST( 0.72, 16 ); psEncC->pitchEstimationLPCOrder = 12; - psEncC->shapingLPCOrder = 14; + psEncC->shapingLPCOrder = 20; psEncC->la_shape = 5 * psEncC->fs_kHz; psEncC->nStatesDelayedDecision = 3; psEncC->useInterpolatedNLSFs = 1; - psEncC->LTPQuantLowComplexity = 0; - psEncC->NLSF_MSVQ_Survivors = 16; + psEncC->NLSF_MSVQ_Survivors = 8; psEncC->warping_Q16 = psEncC->fs_kHz * SILK_FIX_CONST( WARPING_MULTIPLIER, 16 ); } else { psEncC->pitchEstimationComplexity = SILK_PE_MAX_COMPLEX; psEncC->pitchEstimationThreshold_Q16 = SILK_FIX_CONST( 0.7, 16 ); psEncC->pitchEstimationLPCOrder = 16; - psEncC->shapingLPCOrder = 16; + psEncC->shapingLPCOrder = 24; psEncC->la_shape = 5 * psEncC->fs_kHz; psEncC->nStatesDelayedDecision = MAX_DEL_DEC_STATES; psEncC->useInterpolatedNLSFs = 1; - psEncC->LTPQuantLowComplexity = 0; - psEncC->NLSF_MSVQ_Survivors = 32; + psEncC->NLSF_MSVQ_Survivors = 16; psEncC->warping_Q16 = psEncC->fs_kHz * SILK_FIX_CONST( WARPING_MULTIPLIER, 16 ); } @@ -381,46 +390,32 @@ static opus_int silk_setup_complexity( psEncC->shapeWinLength = SUB_FRAME_LENGTH_MS * psEncC->fs_kHz + 2 * psEncC->la_shape; psEncC->Complexity = Complexity; - silk_assert( psEncC->pitchEstimationLPCOrder <= MAX_FIND_PITCH_LPC_ORDER ); - silk_assert( psEncC->shapingLPCOrder <= MAX_SHAPE_LPC_ORDER ); - silk_assert( psEncC->nStatesDelayedDecision <= MAX_DEL_DEC_STATES ); - silk_assert( psEncC->warping_Q16 <= 32767 ); - silk_assert( psEncC->la_shape <= LA_SHAPE_MAX ); - silk_assert( psEncC->shapeWinLength <= SHAPE_LPC_WIN_MAX ); - silk_assert( psEncC->NLSF_MSVQ_Survivors <= NLSF_VQ_MAX_SURVIVORS ); + celt_assert( psEncC->pitchEstimationLPCOrder <= MAX_FIND_PITCH_LPC_ORDER ); + celt_assert( psEncC->shapingLPCOrder <= MAX_SHAPE_LPC_ORDER ); + celt_assert( psEncC->nStatesDelayedDecision <= MAX_DEL_DEC_STATES ); + celt_assert( psEncC->warping_Q16 <= 32767 ); + celt_assert( psEncC->la_shape <= LA_SHAPE_MAX ); + celt_assert( psEncC->shapeWinLength <= SHAPE_LPC_WIN_MAX ); return ret; } static OPUS_INLINE opus_int silk_setup_LBRR( silk_encoder_state *psEncC, /* I/O */ - const opus_int32 TargetRate_bps /* I */ + const silk_EncControlStruct *encControl /* I */ ) { opus_int LBRR_in_previous_packet, ret = SILK_NO_ERROR; - opus_int32 LBRR_rate_thres_bps; LBRR_in_previous_packet = psEncC->LBRR_enabled; - psEncC->LBRR_enabled = 0; - if( psEncC->useInBandFEC && psEncC->PacketLoss_perc > 0 ) { - if( psEncC->fs_kHz == 8 ) { - LBRR_rate_thres_bps = LBRR_NB_MIN_RATE_BPS; - } else if( psEncC->fs_kHz == 12 ) { - LBRR_rate_thres_bps = LBRR_MB_MIN_RATE_BPS; + psEncC->LBRR_enabled = encControl->LBRR_coded; + if( psEncC->LBRR_enabled ) { + /* Set gain increase for coding LBRR excitation */ + if( LBRR_in_previous_packet == 0 ) { + /* Previous packet did not have LBRR, and was therefore coded at a higher bitrate */ + psEncC->LBRR_GainIncreases = 7; } else { - LBRR_rate_thres_bps = LBRR_WB_MIN_RATE_BPS; - } - LBRR_rate_thres_bps = silk_SMULWB( silk_MUL( LBRR_rate_thres_bps, 125 - silk_min( psEncC->PacketLoss_perc, 25 ) ), SILK_FIX_CONST( 0.01, 16 ) ); - - if( TargetRate_bps > LBRR_rate_thres_bps ) { - /* Set gain increase for coding LBRR excitation */ - if( LBRR_in_previous_packet == 0 ) { - /* Previous packet did not have LBRR, and was therefore coded at a higher bitrate */ - psEncC->LBRR_GainIncreases = 7; - } else { - psEncC->LBRR_GainIncreases = silk_max_int( 7 - silk_SMULWB( (opus_int32)psEncC->PacketLoss_perc, SILK_FIX_CONST( 0.4, 16 ) ), 2 ); - } - psEncC->LBRR_enabled = 1; + psEncC->LBRR_GainIncreases = silk_max_int( 7 - silk_SMULWB( (opus_int32)psEncC->PacketLoss_perc, SILK_FIX_CONST( 0.4, 16 ) ), 2 ); } } diff --git a/thirdparty/opus/silk/debug.h b/thirdparty/opus/silk/debug.h index efb6d3e99e..6f68c1ca0f 100644 --- a/thirdparty/opus/silk/debug.h +++ b/thirdparty/opus/silk/debug.h @@ -39,23 +39,10 @@ extern "C" unsigned long GetHighResolutionTime(void); /* O time in usec*/ -/* make SILK_DEBUG dependent on compiler's _DEBUG */ -#if defined _WIN32 - #ifdef _DEBUG - #define SILK_DEBUG 1 - #else - #define SILK_DEBUG 0 - #endif - - /* overrule the above */ - #if 0 - /* #define NO_ASSERTS*/ - #undef SILK_DEBUG - #define SILK_DEBUG 1 - #endif -#else - #define SILK_DEBUG 0 -#endif +/* Set to 1 to enable DEBUG_STORE_DATA() macros for dumping + * intermediate signals from the codec. + */ +#define SILK_DEBUG 0 /* Flag for using timers */ #define SILK_TIC_TOC 0 diff --git a/thirdparty/opus/silk/dec_API.c b/thirdparty/opus/silk/dec_API.c index b7d8ed48d8..7d5ca7fb9f 100644 --- a/thirdparty/opus/silk/dec_API.c +++ b/thirdparty/opus/silk/dec_API.c @@ -104,7 +104,7 @@ opus_int silk_Decode( /* O Returns error co int delay_stack_alloc; SAVE_STACK; - silk_assert( decControl->nChannelsInternal == 1 || decControl->nChannelsInternal == 2 ); + celt_assert( decControl->nChannelsInternal == 1 || decControl->nChannelsInternal == 2 ); /**********************************/ /* Test if first frame in payload */ @@ -143,13 +143,13 @@ opus_int silk_Decode( /* O Returns error co channel_state[ n ].nFramesPerPacket = 3; channel_state[ n ].nb_subfr = 4; } else { - silk_assert( 0 ); + celt_assert( 0 ); RESTORE_STACK; return SILK_DEC_INVALID_FRAME_SIZE; } fs_kHz_dec = ( decControl->internalSampleRate >> 10 ) + 1; if( fs_kHz_dec != 8 && fs_kHz_dec != 12 && fs_kHz_dec != 16 ) { - silk_assert( 0 ); + celt_assert( 0 ); RESTORE_STACK; return SILK_DEC_INVALID_SAMPLING_FREQUENCY; } diff --git a/thirdparty/opus/silk/decode_core.c b/thirdparty/opus/silk/decode_core.c index e569c0e72b..1c352a6522 100644 --- a/thirdparty/opus/silk/decode_core.c +++ b/thirdparty/opus/silk/decode_core.c @@ -141,7 +141,7 @@ void silk_decode_core( if( k == 0 || ( k == 2 && NLSF_interpolation_flag ) ) { /* Rewhiten with new A coefs */ start_idx = psDec->ltp_mem_length - lag - psDec->LPC_order - LTP_ORDER / 2; - silk_assert( start_idx > 0 ); + celt_assert( start_idx > 0 ); if( k == 2 ) { silk_memcpy( &psDec->outBuf[ psDec->ltp_mem_length ], xq, 2 * psDec->subfr_length * sizeof( opus_int16 ) ); @@ -196,7 +196,7 @@ void silk_decode_core( for( i = 0; i < psDec->subfr_length; i++ ) { /* Short-term prediction */ - silk_assert( psDec->LPC_order == 10 || psDec->LPC_order == 16 ); + celt_assert( psDec->LPC_order == 10 || psDec->LPC_order == 16 ); /* Avoids introducing a bias because silk_SMLAWB() always rounds to -inf */ LPC_pred_Q10 = silk_RSHIFT( psDec->LPC_order, 1 ); LPC_pred_Q10 = silk_SMLAWB( LPC_pred_Q10, sLPC_Q14[ MAX_LPC_ORDER + i - 1 ], A_Q12_tmp[ 0 ] ); @@ -225,8 +225,6 @@ void silk_decode_core( pxq[ i ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( silk_SMULWW( sLPC_Q14[ MAX_LPC_ORDER + i ], Gain_Q10 ), 8 ) ); } - /* DEBUG_STORE_DATA( dec.pcm, pxq, psDec->subfr_length * sizeof( opus_int16 ) ) */ - /* Update LPC filter state */ silk_memcpy( sLPC_Q14, &sLPC_Q14[ psDec->subfr_length ], MAX_LPC_ORDER * sizeof( opus_int32 ) ); pexc_Q14 += psDec->subfr_length; diff --git a/thirdparty/opus/silk/decode_frame.c b/thirdparty/opus/silk/decode_frame.c index a605d95ac6..e73825b267 100644 --- a/thirdparty/opus/silk/decode_frame.c +++ b/thirdparty/opus/silk/decode_frame.c @@ -55,7 +55,7 @@ opus_int silk_decode_frame( psDecCtrl->LTP_scale_Q14 = 0; /* Safety checks */ - silk_assert( L > 0 && L <= MAX_FRAME_LENGTH ); + celt_assert( L > 0 && L <= MAX_FRAME_LENGTH ); if( lostFlag == FLAG_DECODE_NORMAL || ( lostFlag == FLAG_DECODE_LBRR && psDec->LBRR_flags[ psDec->nFramesDecoded ] == 1 ) ) @@ -91,19 +91,20 @@ opus_int silk_decode_frame( psDec->lossCnt = 0; psDec->prevSignalType = psDec->indices.signalType; - silk_assert( psDec->prevSignalType >= 0 && psDec->prevSignalType <= 2 ); + celt_assert( psDec->prevSignalType >= 0 && psDec->prevSignalType <= 2 ); /* A frame has been decoded without errors */ psDec->first_frame_after_reset = 0; } else { /* Handle packet loss by extrapolation */ + psDec->indices.signalType = psDec->prevSignalType; silk_PLC( psDec, psDecCtrl, pOut, 1, arch ); } /*************************/ /* Update output buffer. */ /*************************/ - silk_assert( psDec->ltp_mem_length >= psDec->frame_length ); + celt_assert( psDec->ltp_mem_length >= psDec->frame_length ); mv_len = psDec->ltp_mem_length - psDec->frame_length; silk_memmove( psDec->outBuf, &psDec->outBuf[ psDec->frame_length ], mv_len * sizeof(opus_int16) ); silk_memcpy( &psDec->outBuf[ mv_len ], pOut, psDec->frame_length * sizeof( opus_int16 ) ); diff --git a/thirdparty/opus/silk/decode_indices.c b/thirdparty/opus/silk/decode_indices.c index 7afe5c26c1..0bb4a997a5 100644 --- a/thirdparty/opus/silk/decode_indices.c +++ b/thirdparty/opus/silk/decode_indices.c @@ -79,7 +79,7 @@ void silk_decode_indices( /**********************/ psDec->indices.NLSFIndices[ 0 ] = (opus_int8)ec_dec_icdf( psRangeDec, &psDec->psNLSF_CB->CB1_iCDF[ ( psDec->indices.signalType >> 1 ) * psDec->psNLSF_CB->nVectors ], 8 ); silk_NLSF_unpack( ec_ix, pred_Q8, psDec->psNLSF_CB, psDec->indices.NLSFIndices[ 0 ] ); - silk_assert( psDec->psNLSF_CB->order == psDec->LPC_order ); + celt_assert( psDec->psNLSF_CB->order == psDec->LPC_order ); for( i = 0; i < psDec->psNLSF_CB->order; i++ ) { Ix = ec_dec_icdf( psRangeDec, &psDec->psNLSF_CB->ec_iCDF[ ec_ix[ i ] ], 8 ); if( Ix == 0 ) { diff --git a/thirdparty/opus/silk/decode_parameters.c b/thirdparty/opus/silk/decode_parameters.c index e345b1dcef..a56a409858 100644 --- a/thirdparty/opus/silk/decode_parameters.c +++ b/thirdparty/opus/silk/decode_parameters.c @@ -52,7 +52,7 @@ void silk_decode_parameters( silk_NLSF_decode( pNLSF_Q15, psDec->indices.NLSFIndices, psDec->psNLSF_CB ); /* Convert NLSF parameters to AR prediction filter coefficients */ - silk_NLSF2A( psDecCtrl->PredCoef_Q12[ 1 ], pNLSF_Q15, psDec->LPC_order ); + silk_NLSF2A( psDecCtrl->PredCoef_Q12[ 1 ], pNLSF_Q15, psDec->LPC_order, psDec->arch ); /* If just reset, e.g., because internal Fs changed, do not allow interpolation */ /* improves the case of packet loss in the first frame after a switch */ @@ -69,7 +69,7 @@ void silk_decode_parameters( } /* Convert NLSF parameters to AR prediction filter coefficients */ - silk_NLSF2A( psDecCtrl->PredCoef_Q12[ 0 ], pNLSF0_Q15, psDec->LPC_order ); + silk_NLSF2A( psDecCtrl->PredCoef_Q12[ 0 ], pNLSF0_Q15, psDec->LPC_order, psDec->arch ); } else { /* Copy LPC coefficients for first half from second half */ silk_memcpy( psDecCtrl->PredCoef_Q12[ 0 ], psDecCtrl->PredCoef_Q12[ 1 ], psDec->LPC_order * sizeof( opus_int16 ) ); diff --git a/thirdparty/opus/silk/decode_pitch.c b/thirdparty/opus/silk/decode_pitch.c index fedbc6a525..fd1b6bf551 100644 --- a/thirdparty/opus/silk/decode_pitch.c +++ b/thirdparty/opus/silk/decode_pitch.c @@ -51,7 +51,7 @@ void silk_decode_pitch( Lag_CB_ptr = &silk_CB_lags_stage2[ 0 ][ 0 ]; cbk_size = PE_NB_CBKS_STAGE2_EXT; } else { - silk_assert( nb_subfr == PE_MAX_NB_SUBFR >> 1 ); + celt_assert( nb_subfr == PE_MAX_NB_SUBFR >> 1 ); Lag_CB_ptr = &silk_CB_lags_stage2_10_ms[ 0 ][ 0 ]; cbk_size = PE_NB_CBKS_STAGE2_10MS; } @@ -60,7 +60,7 @@ void silk_decode_pitch( Lag_CB_ptr = &silk_CB_lags_stage3[ 0 ][ 0 ]; cbk_size = PE_NB_CBKS_STAGE3_MAX; } else { - silk_assert( nb_subfr == PE_MAX_NB_SUBFR >> 1 ); + celt_assert( nb_subfr == PE_MAX_NB_SUBFR >> 1 ); Lag_CB_ptr = &silk_CB_lags_stage3_10_ms[ 0 ][ 0 ]; cbk_size = PE_NB_CBKS_STAGE3_10MS; } diff --git a/thirdparty/opus/silk/decode_pulses.c b/thirdparty/opus/silk/decode_pulses.c index d6bbec9225..a56d2d3074 100644 --- a/thirdparty/opus/silk/decode_pulses.c +++ b/thirdparty/opus/silk/decode_pulses.c @@ -56,7 +56,7 @@ void silk_decode_pulses( silk_assert( 1 << LOG2_SHELL_CODEC_FRAME_LENGTH == SHELL_CODEC_FRAME_LENGTH ); iter = silk_RSHIFT( frame_length, LOG2_SHELL_CODEC_FRAME_LENGTH ); if( iter * SHELL_CODEC_FRAME_LENGTH < frame_length ) { - silk_assert( frame_length == 12 * 10 ); /* Make sure only happens for 10 ms @ 12 kHz */ + celt_assert( frame_length == 12 * 10 ); /* Make sure only happens for 10 ms @ 12 kHz */ iter++; } diff --git a/thirdparty/opus/silk/decoder_set_fs.c b/thirdparty/opus/silk/decoder_set_fs.c index eef0fd25e1..d9a13d0f0c 100644 --- a/thirdparty/opus/silk/decoder_set_fs.c +++ b/thirdparty/opus/silk/decoder_set_fs.c @@ -40,8 +40,8 @@ opus_int silk_decoder_set_fs( { opus_int frame_length, ret = 0; - silk_assert( fs_kHz == 8 || fs_kHz == 12 || fs_kHz == 16 ); - silk_assert( psDec->nb_subfr == MAX_NB_SUBFR || psDec->nb_subfr == MAX_NB_SUBFR/2 ); + celt_assert( fs_kHz == 8 || fs_kHz == 12 || fs_kHz == 16 ); + celt_assert( psDec->nb_subfr == MAX_NB_SUBFR || psDec->nb_subfr == MAX_NB_SUBFR/2 ); /* New (sub)frame length */ psDec->subfr_length = silk_SMULBB( SUB_FRAME_LENGTH_MS, fs_kHz ); @@ -86,7 +86,7 @@ opus_int silk_decoder_set_fs( psDec->pitch_lag_low_bits_iCDF = silk_uniform4_iCDF; } else { /* unsupported sampling rate */ - silk_assert( 0 ); + celt_assert( 0 ); } psDec->first_frame_after_reset = 1; psDec->lagPrev = 100; @@ -101,7 +101,7 @@ opus_int silk_decoder_set_fs( } /* Check that settings are valid */ - silk_assert( psDec->frame_length > 0 && psDec->frame_length <= MAX_FRAME_LENGTH ); + celt_assert( psDec->frame_length > 0 && psDec->frame_length <= MAX_FRAME_LENGTH ); return ret; } diff --git a/thirdparty/opus/silk/define.h b/thirdparty/opus/silk/define.h index 19c9b00e25..247cb0bf71 100644 --- a/thirdparty/opus/silk/define.h +++ b/thirdparty/opus/silk/define.h @@ -46,7 +46,6 @@ extern "C" /* Limits on bitrate */ #define MIN_TARGET_RATE_BPS 5000 #define MAX_TARGET_RATE_BPS 80000 -#define TARGET_RATE_TAB_SZ 8 /* LBRR thresholds */ #define LBRR_NB_MIN_RATE_BPS 12000 @@ -56,6 +55,12 @@ extern "C" /* DTX settings */ #define NB_SPEECH_FRAMES_BEFORE_DTX 10 /* eq 200 ms */ #define MAX_CONSECUTIVE_DTX 20 /* eq 400 ms */ +#define DTX_ACTIVITY_THRESHOLD 0.1f + +/* VAD decision */ +#define VAD_NO_DECISION -1 +#define VAD_NO_ACTIVITY 0 +#define VAD_ACTIVITY 1 /* Maximum sampling frequency */ #define MAX_FS_KHZ 16 @@ -147,7 +152,7 @@ extern "C" #define USE_HARM_SHAPING 1 /* Max LPC order of noise shaping filters */ -#define MAX_SHAPE_LPC_ORDER 16 +#define MAX_SHAPE_LPC_ORDER 24 #define HARM_SHAPE_FIR_TAPS 3 @@ -157,8 +162,7 @@ extern "C" #define LTP_BUF_LENGTH 512 #define LTP_MASK ( LTP_BUF_LENGTH - 1 ) -#define DECISION_DELAY 32 -#define DECISION_DELAY_MASK ( DECISION_DELAY - 1 ) +#define DECISION_DELAY 40 /* Number of subframes for excitation entropy coding */ #define SHELL_CODEC_FRAME_LENGTH 16 @@ -173,11 +177,7 @@ extern "C" #define MAX_MATRIX_SIZE MAX_LPC_ORDER /* Max of LPC Order and LTP order */ -#if( MAX_LPC_ORDER > DECISION_DELAY ) # define NSQ_LPC_BUF_LENGTH MAX_LPC_ORDER -#else -# define NSQ_LPC_BUF_LENGTH DECISION_DELAY -#endif /***************************/ /* Voice activity detector */ @@ -205,7 +205,6 @@ extern "C" /******************/ #define NLSF_W_Q 2 #define NLSF_VQ_MAX_VECTORS 32 -#define NLSF_VQ_MAX_SURVIVORS 32 #define NLSF_QUANT_MAX_AMPLITUDE 4 #define NLSF_QUANT_MAX_AMPLITUDE_EXT 10 #define NLSF_QUANT_LEVEL_ADJ 0.1 diff --git a/thirdparty/opus/silk/enc_API.c b/thirdparty/opus/silk/enc_API.c index f8060286db..55a33f37e9 100644 --- a/thirdparty/opus/silk/enc_API.c +++ b/thirdparty/opus/silk/enc_API.c @@ -82,7 +82,7 @@ opus_int silk_InitEncoder( /* O Returns error co silk_memset( psEnc, 0, sizeof( silk_encoder ) ); for( n = 0; n < ENCODER_NUM_CHANNELS; n++ ) { if( ret += silk_init_encoder( &psEnc->state_Fxx[ n ], arch ) ) { - silk_assert( 0 ); + celt_assert( 0 ); } } @@ -91,7 +91,7 @@ opus_int silk_InitEncoder( /* O Returns error co /* Read control structure */ if( ret += silk_QueryEncoder( encState, encStatus ) ) { - silk_assert( 0 ); + celt_assert( 0 ); } return ret; @@ -144,7 +144,8 @@ opus_int silk_Encode( /* O Returns error co opus_int nSamplesIn, /* I Number of samples in input vector */ ec_enc *psRangeEnc, /* I/O Compressor data structure */ opus_int32 *nBytesOut, /* I/O Number of bytes in payload (input: Max bytes) */ - const opus_int prefillFlag /* I Flag to indicate prefilling buffers no coding */ + const opus_int prefillFlag, /* I Flag to indicate prefilling buffers no coding */ + opus_int activity /* I Decision of Opus voice activity detector */ ) { opus_int n, i, nBits, flags, tmp_payloadSize_ms = 0, tmp_complexity = 0, ret = 0; @@ -166,7 +167,7 @@ opus_int silk_Encode( /* O Returns error co /* Check values in encoder control structure */ if( ( ret = check_control_input( encControl ) ) != 0 ) { - silk_assert( 0 ); + celt_assert( 0 ); RESTORE_STACK; return ret; } @@ -199,16 +200,26 @@ opus_int silk_Encode( /* O Returns error co tot_blocks = ( nBlocksOf10ms > 1 ) ? nBlocksOf10ms >> 1 : 1; curr_block = 0; if( prefillFlag ) { + silk_LP_state save_LP; /* Only accept input length of 10 ms */ if( nBlocksOf10ms != 1 ) { - silk_assert( 0 ); + celt_assert( 0 ); RESTORE_STACK; return SILK_ENC_INPUT_INVALID_NO_OF_SAMPLES; } + if ( prefillFlag == 2 ) { + save_LP = psEnc->state_Fxx[ 0 ].sCmn.sLP; + /* Save the sampling rate so the bandwidth switching code can keep handling transitions. */ + save_LP.saved_fs_kHz = psEnc->state_Fxx[ 0 ].sCmn.fs_kHz; + } /* Reset Encoder */ for( n = 0; n < encControl->nChannelsInternal; n++ ) { ret = silk_init_encoder( &psEnc->state_Fxx[ n ], psEnc->state_Fxx[ n ].sCmn.arch ); - silk_assert( !ret ); + /* Restore the variable LP state. */ + if ( prefillFlag == 2 ) { + psEnc->state_Fxx[ n ].sCmn.sLP = save_LP; + } + celt_assert( !ret ); } tmp_payloadSize_ms = encControl->payloadSize_ms; encControl->payloadSize_ms = 10; @@ -221,23 +232,22 @@ opus_int silk_Encode( /* O Returns error co } else { /* Only accept input lengths that are a multiple of 10 ms */ if( nBlocksOf10ms * encControl->API_sampleRate != 100 * nSamplesIn || nSamplesIn < 0 ) { - silk_assert( 0 ); + celt_assert( 0 ); RESTORE_STACK; return SILK_ENC_INPUT_INVALID_NO_OF_SAMPLES; } /* Make sure no more than one packet can be produced */ if( 1000 * (opus_int32)nSamplesIn > encControl->payloadSize_ms * encControl->API_sampleRate ) { - silk_assert( 0 ); + celt_assert( 0 ); RESTORE_STACK; return SILK_ENC_INPUT_INVALID_NO_OF_SAMPLES; } } - TargetRate_bps = silk_RSHIFT32( encControl->bitRate, encControl->nChannelsInternal - 1 ); for( n = 0; n < encControl->nChannelsInternal; n++ ) { /* Force the side channel to the same rate as the mid */ opus_int force_fs_kHz = (n==1) ? psEnc->state_Fxx[0].sCmn.fs_kHz : 0; - if( ( ret = silk_control_encoder( &psEnc->state_Fxx[ n ], encControl, TargetRate_bps, psEnc->allowBandwidthSwitch, n, force_fs_kHz ) ) != 0 ) { + if( ( ret = silk_control_encoder( &psEnc->state_Fxx[ n ], encControl, psEnc->allowBandwidthSwitch, n, force_fs_kHz ) ) != 0 ) { silk_assert( 0 ); RESTORE_STACK; return ret; @@ -249,7 +259,7 @@ opus_int silk_Encode( /* O Returns error co } psEnc->state_Fxx[ n ].sCmn.inDTX = psEnc->state_Fxx[ n ].sCmn.useDTX; } - silk_assert( encControl->nChannelsInternal == 1 || psEnc->state_Fxx[ 0 ].sCmn.fs_kHz == psEnc->state_Fxx[ 1 ].sCmn.fs_kHz ); + celt_assert( encControl->nChannelsInternal == 1 || psEnc->state_Fxx[ 0 ].sCmn.fs_kHz == psEnc->state_Fxx[ 1 ].sCmn.fs_kHz ); /* Input buffering/resampling and encoding */ nSamplesToBufferMax = @@ -307,7 +317,7 @@ opus_int silk_Encode( /* O Returns error co } psEnc->state_Fxx[ 0 ].sCmn.inputBufIx += nSamplesToBuffer; } else { - silk_assert( encControl->nChannelsAPI == 1 && encControl->nChannelsInternal == 1 ); + celt_assert( encControl->nChannelsAPI == 1 && encControl->nChannelsInternal == 1 ); silk_memcpy(buf, samplesIn, nSamplesFromInput*sizeof(opus_int16)); ret += silk_resampler( &psEnc->state_Fxx[ 0 ].sCmn.resampler_state, &psEnc->state_Fxx[ 0 ].sCmn.inputBuf[ psEnc->state_Fxx[ 0 ].sCmn.inputBufIx + 2 ], buf, nSamplesFromInput ); @@ -323,8 +333,8 @@ opus_int silk_Encode( /* O Returns error co /* Silk encoder */ if( psEnc->state_Fxx[ 0 ].sCmn.inputBufIx >= psEnc->state_Fxx[ 0 ].sCmn.frame_length ) { /* Enough data in input buffer, so encode */ - silk_assert( psEnc->state_Fxx[ 0 ].sCmn.inputBufIx == psEnc->state_Fxx[ 0 ].sCmn.frame_length ); - silk_assert( encControl->nChannelsInternal == 1 || psEnc->state_Fxx[ 1 ].sCmn.inputBufIx == psEnc->state_Fxx[ 1 ].sCmn.frame_length ); + celt_assert( psEnc->state_Fxx[ 0 ].sCmn.inputBufIx == psEnc->state_Fxx[ 0 ].sCmn.frame_length ); + celt_assert( encControl->nChannelsInternal == 1 || psEnc->state_Fxx[ 1 ].sCmn.inputBufIx == psEnc->state_Fxx[ 1 ].sCmn.frame_length ); /* Deal with LBRR data */ if( psEnc->state_Fxx[ 0 ].sCmn.nFramesEncoded == 0 && !prefillFlag ) { @@ -416,7 +426,6 @@ opus_int silk_Encode( /* O Returns error co /* Reset side channel encoder memory for first frame with side coding */ if( psEnc->prev_decode_only_middle == 1 ) { silk_memset( &psEnc->state_Fxx[ 1 ].sShape, 0, sizeof( psEnc->state_Fxx[ 1 ].sShape ) ); - silk_memset( &psEnc->state_Fxx[ 1 ].sPrefilt, 0, sizeof( psEnc->state_Fxx[ 1 ].sPrefilt ) ); silk_memset( &psEnc->state_Fxx[ 1 ].sCmn.sNSQ, 0, sizeof( psEnc->state_Fxx[ 1 ].sCmn.sNSQ ) ); silk_memset( psEnc->state_Fxx[ 1 ].sCmn.prev_NLSFq_Q15, 0, sizeof( psEnc->state_Fxx[ 1 ].sCmn.prev_NLSFq_Q15 ) ); silk_memset( &psEnc->state_Fxx[ 1 ].sCmn.sLP.In_LP_State, 0, sizeof( psEnc->state_Fxx[ 1 ].sCmn.sLP.In_LP_State ) ); @@ -427,7 +436,7 @@ opus_int silk_Encode( /* O Returns error co psEnc->state_Fxx[ 1 ].sCmn.sNSQ.prev_gain_Q16 = 65536; psEnc->state_Fxx[ 1 ].sCmn.first_frame_after_reset = 1; } - silk_encode_do_VAD_Fxx( &psEnc->state_Fxx[ 1 ] ); + silk_encode_do_VAD_Fxx( &psEnc->state_Fxx[ 1 ], activity ); } else { psEnc->state_Fxx[ 1 ].sCmn.VAD_flags[ psEnc->state_Fxx[ 0 ].sCmn.nFramesEncoded ] = 0; } @@ -442,7 +451,7 @@ opus_int silk_Encode( /* O Returns error co silk_memcpy( psEnc->state_Fxx[ 0 ].sCmn.inputBuf, psEnc->sStereo.sMid, 2 * sizeof( opus_int16 ) ); silk_memcpy( psEnc->sStereo.sMid, &psEnc->state_Fxx[ 0 ].sCmn.inputBuf[ psEnc->state_Fxx[ 0 ].sCmn.frame_length ], 2 * sizeof( opus_int16 ) ); } - silk_encode_do_VAD_Fxx( &psEnc->state_Fxx[ 0 ] ); + silk_encode_do_VAD_Fxx( &psEnc->state_Fxx[ 0 ], activity ); /* Encode */ for( n = 0; n < encControl->nChannelsInternal; n++ ) { @@ -557,6 +566,10 @@ opus_int silk_Encode( /* O Returns error co } } + encControl->signalType = psEnc->state_Fxx[0].sCmn.indices.signalType; + encControl->offset = silk_Quantization_Offsets_Q10 + [ psEnc->state_Fxx[0].sCmn.indices.signalType >> 1 ] + [ psEnc->state_Fxx[0].sCmn.indices.quantOffsetType ]; RESTORE_STACK; return ret; } diff --git a/thirdparty/opus/silk/encode_indices.c b/thirdparty/opus/silk/encode_indices.c index 666c8c0b13..4bcbc3347b 100644 --- a/thirdparty/opus/silk/encode_indices.c +++ b/thirdparty/opus/silk/encode_indices.c @@ -56,8 +56,8 @@ void silk_encode_indices( /* Encode signal type and quantizer offset */ /*******************************************/ typeOffset = 2 * psIndices->signalType + psIndices->quantOffsetType; - silk_assert( typeOffset >= 0 && typeOffset < 6 ); - silk_assert( encode_LBRR == 0 || typeOffset >= 2 ); + celt_assert( typeOffset >= 0 && typeOffset < 6 ); + celt_assert( encode_LBRR == 0 || typeOffset >= 2 ); if( encode_LBRR || typeOffset >= 2 ) { ec_enc_icdf( psRangeEnc, typeOffset - 2, silk_type_offset_VAD_iCDF, 8 ); } else { @@ -90,7 +90,7 @@ void silk_encode_indices( /****************/ ec_enc_icdf( psRangeEnc, psIndices->NLSFIndices[ 0 ], &psEncC->psNLSF_CB->CB1_iCDF[ ( psIndices->signalType >> 1 ) * psEncC->psNLSF_CB->nVectors ], 8 ); silk_NLSF_unpack( ec_ix, pred_Q8, psEncC->psNLSF_CB, psIndices->NLSFIndices[ 0 ] ); - silk_assert( psEncC->psNLSF_CB->order == psEncC->predictLPCOrder ); + celt_assert( psEncC->psNLSF_CB->order == psEncC->predictLPCOrder ); for( i = 0; i < psEncC->psNLSF_CB->order; i++ ) { if( psIndices->NLSFIndices[ i+1 ] >= NLSF_QUANT_MAX_AMPLITUDE ) { ec_enc_icdf( psRangeEnc, 2 * NLSF_QUANT_MAX_AMPLITUDE, &psEncC->psNLSF_CB->ec_iCDF[ ec_ix[ i ] ], 8 ); diff --git a/thirdparty/opus/silk/encode_pulses.c b/thirdparty/opus/silk/encode_pulses.c index ab00264f99..8a1999138b 100644 --- a/thirdparty/opus/silk/encode_pulses.c +++ b/thirdparty/opus/silk/encode_pulses.c @@ -86,7 +86,7 @@ void silk_encode_pulses( silk_assert( 1 << LOG2_SHELL_CODEC_FRAME_LENGTH == SHELL_CODEC_FRAME_LENGTH ); iter = silk_RSHIFT( frame_length, LOG2_SHELL_CODEC_FRAME_LENGTH ); if( iter * SHELL_CODEC_FRAME_LENGTH < frame_length ) { - silk_assert( frame_length == 12 * 10 ); /* Make sure only happens for 10 ms @ 12 kHz */ + celt_assert( frame_length == 12 * 10 ); /* Make sure only happens for 10 ms @ 12 kHz */ iter++; silk_memset( &pulses[ frame_length ], 0, SHELL_CODEC_FRAME_LENGTH * sizeof(opus_int8)); } diff --git a/thirdparty/opus/silk/fixed/apply_sine_window_FIX.c b/thirdparty/opus/silk/fixed/apply_sine_window_FIX.c index 4502b7130e..03e088a6de 100644 --- a/thirdparty/opus/silk/fixed/apply_sine_window_FIX.c +++ b/thirdparty/opus/silk/fixed/apply_sine_window_FIX.c @@ -57,15 +57,15 @@ void silk_apply_sine_window( opus_int k, f_Q16, c_Q16; opus_int32 S0_Q16, S1_Q16; - silk_assert( win_type == 1 || win_type == 2 ); + celt_assert( win_type == 1 || win_type == 2 ); /* Length must be in a range from 16 to 120 and a multiple of 4 */ - silk_assert( length >= 16 && length <= 120 ); - silk_assert( ( length & 3 ) == 0 ); + celt_assert( length >= 16 && length <= 120 ); + celt_assert( ( length & 3 ) == 0 ); /* Frequency */ k = ( length >> 2 ) - 4; - silk_assert( k >= 0 && k <= 26 ); + celt_assert( k >= 0 && k <= 26 ); f_Q16 = (opus_int)freq_table_Q16[ k ]; /* Factor used for cosine approximation */ diff --git a/thirdparty/opus/silk/fixed/arm/warped_autocorrelation_FIX_arm.h b/thirdparty/opus/silk/fixed/arm/warped_autocorrelation_FIX_arm.h new file mode 100644 index 0000000000..1992e43288 --- /dev/null +++ b/thirdparty/opus/silk/fixed/arm/warped_autocorrelation_FIX_arm.h @@ -0,0 +1,68 @@ +/*********************************************************************** +Copyright (c) 2017 Google Inc. +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +- Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. +- Neither the name of Internet Society, IETF or IETF Trust, nor the +names of specific contributors, may be used to endorse or promote +products derived from this software without specific prior written +permission. +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. +***********************************************************************/ + +#ifndef SILK_WARPED_AUTOCORRELATION_FIX_ARM_H +# define SILK_WARPED_AUTOCORRELATION_FIX_ARM_H + +# include "celt/arm/armcpu.h" + +# if defined(FIXED_POINT) + +# if defined(OPUS_ARM_MAY_HAVE_NEON_INTR) +void silk_warped_autocorrelation_FIX_neon( + opus_int32 *corr, /* O Result [order + 1] */ + opus_int *scale, /* O Scaling of the correlation vector */ + const opus_int16 *input, /* I Input data to correlate */ + const opus_int warping_Q16, /* I Warping coefficient */ + const opus_int length, /* I Length of input */ + const opus_int order /* I Correlation order (even) */ +); + +# if !defined(OPUS_HAVE_RTCD) && defined(OPUS_ARM_PRESUME_NEON) +# define OVERRIDE_silk_warped_autocorrelation_FIX (1) +# define silk_warped_autocorrelation_FIX(corr, scale, input, warping_Q16, length, order, arch) \ + ((void)(arch), PRESUME_NEON(silk_warped_autocorrelation_FIX)(corr, scale, input, warping_Q16, length, order)) +# endif +# endif + +# if !defined(OVERRIDE_silk_warped_autocorrelation_FIX) +/*Is run-time CPU detection enabled on this platform?*/ +# if defined(OPUS_HAVE_RTCD) && (defined(OPUS_ARM_MAY_HAVE_NEON_INTR) && !defined(OPUS_ARM_PRESUME_NEON_INTR)) +extern void (*const SILK_WARPED_AUTOCORRELATION_FIX_IMPL[OPUS_ARCHMASK+1])(opus_int32*, opus_int*, const opus_int16*, const opus_int, const opus_int, const opus_int); +# define OVERRIDE_silk_warped_autocorrelation_FIX (1) +# define silk_warped_autocorrelation_FIX(corr, scale, input, warping_Q16, length, order, arch) \ + ((*SILK_WARPED_AUTOCORRELATION_FIX_IMPL[(arch)&OPUS_ARCHMASK])(corr, scale, input, warping_Q16, length, order)) +# elif defined(OPUS_ARM_PRESUME_NEON_INTR) +# define OVERRIDE_silk_warped_autocorrelation_FIX (1) +# define silk_warped_autocorrelation_FIX(corr, scale, input, warping_Q16, length, order, arch) \ + ((void)(arch), silk_warped_autocorrelation_FIX_neon(corr, scale, input, warping_Q16, length, order)) +# endif +# endif + +# endif /* end FIXED_POINT */ + +#endif /* end SILK_WARPED_AUTOCORRELATION_FIX_ARM_H */ diff --git a/thirdparty/opus/silk/fixed/arm/warped_autocorrelation_FIX_neon_intr.c b/thirdparty/opus/silk/fixed/arm/warped_autocorrelation_FIX_neon_intr.c new file mode 100644 index 0000000000..00a70cb51f --- /dev/null +++ b/thirdparty/opus/silk/fixed/arm/warped_autocorrelation_FIX_neon_intr.c @@ -0,0 +1,260 @@ +/*********************************************************************** +Copyright (c) 2017 Google Inc., Jean-Marc Valin +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +- Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. +- Neither the name of Internet Society, IETF or IETF Trust, nor the +names of specific contributors, may be used to endorse or promote +products derived from this software without specific prior written +permission. +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. +***********************************************************************/ + +#ifdef HAVE_CONFIG_H +#include "config.h" +#endif + +#include <arm_neon.h> +#ifdef OPUS_CHECK_ASM +# include <string.h> +#endif +#include "stack_alloc.h" +#include "main_FIX.h" + +static OPUS_INLINE void calc_corr( const opus_int32 *const input_QS, opus_int64 *const corr_QC, const opus_int offset, const int32x4_t state_QS_s32x4 ) +{ + int64x2_t corr_QC_s64x2[ 2 ], t_s64x2[ 2 ]; + const int32x4_t input_QS_s32x4 = vld1q_s32( input_QS + offset ); + corr_QC_s64x2[ 0 ] = vld1q_s64( corr_QC + offset + 0 ); + corr_QC_s64x2[ 1 ] = vld1q_s64( corr_QC + offset + 2 ); + t_s64x2[ 0 ] = vmull_s32( vget_low_s32( state_QS_s32x4 ), vget_low_s32( input_QS_s32x4 ) ); + t_s64x2[ 1 ] = vmull_s32( vget_high_s32( state_QS_s32x4 ), vget_high_s32( input_QS_s32x4 ) ); + corr_QC_s64x2[ 0 ] = vsraq_n_s64( corr_QC_s64x2[ 0 ], t_s64x2[ 0 ], 2 * QS - QC ); + corr_QC_s64x2[ 1 ] = vsraq_n_s64( corr_QC_s64x2[ 1 ], t_s64x2[ 1 ], 2 * QS - QC ); + vst1q_s64( corr_QC + offset + 0, corr_QC_s64x2[ 0 ] ); + vst1q_s64( corr_QC + offset + 2, corr_QC_s64x2[ 1 ] ); +} + +static OPUS_INLINE int32x4_t calc_state( const int32x4_t state_QS0_s32x4, const int32x4_t state_QS0_1_s32x4, const int32x4_t state_QS1_1_s32x4, const int32x4_t warping_Q16_s32x4 ) +{ + int32x4_t t_s32x4 = vsubq_s32( state_QS0_s32x4, state_QS0_1_s32x4 ); + t_s32x4 = vqdmulhq_s32( t_s32x4, warping_Q16_s32x4 ); + return vaddq_s32( state_QS1_1_s32x4, t_s32x4 ); +} + +void silk_warped_autocorrelation_FIX_neon( + opus_int32 *corr, /* O Result [order + 1] */ + opus_int *scale, /* O Scaling of the correlation vector */ + const opus_int16 *input, /* I Input data to correlate */ + const opus_int warping_Q16, /* I Warping coefficient */ + const opus_int length, /* I Length of input */ + const opus_int order /* I Correlation order (even) */ +) +{ + if( ( MAX_SHAPE_LPC_ORDER > 24 ) || ( order < 6 ) ) { + silk_warped_autocorrelation_FIX_c( corr, scale, input, warping_Q16, length, order ); + } else { + opus_int n, i, lsh; + opus_int64 corr_QC[ MAX_SHAPE_LPC_ORDER + 1 ] = { 0 }; /* In reverse order */ + opus_int64 corr_QC_orderT; + int64x2_t lsh_s64x2; + const opus_int orderT = ( order + 3 ) & ~3; + opus_int64 *corr_QCT; + opus_int32 *input_QS; + VARDECL( opus_int32, input_QST ); + VARDECL( opus_int32, state ); + SAVE_STACK; + + /* Order must be even */ + silk_assert( ( order & 1 ) == 0 ); + silk_assert( 2 * QS - QC >= 0 ); + + ALLOC( input_QST, length + 2 * MAX_SHAPE_LPC_ORDER, opus_int32 ); + + input_QS = input_QST; + /* input_QS has zero paddings in the beginning and end. */ + vst1q_s32( input_QS, vdupq_n_s32( 0 ) ); + input_QS += 4; + vst1q_s32( input_QS, vdupq_n_s32( 0 ) ); + input_QS += 4; + vst1q_s32( input_QS, vdupq_n_s32( 0 ) ); + input_QS += 4; + vst1q_s32( input_QS, vdupq_n_s32( 0 ) ); + input_QS += 4; + vst1q_s32( input_QS, vdupq_n_s32( 0 ) ); + input_QS += 4; + vst1q_s32( input_QS, vdupq_n_s32( 0 ) ); + input_QS += 4; + + /* Loop over samples */ + for( n = 0; n < length - 7; n += 8, input_QS += 8 ) { + const int16x8_t t0_s16x4 = vld1q_s16( input + n ); + vst1q_s32( input_QS + 0, vshll_n_s16( vget_low_s16( t0_s16x4 ), QS ) ); + vst1q_s32( input_QS + 4, vshll_n_s16( vget_high_s16( t0_s16x4 ), QS ) ); + } + for( ; n < length; n++, input_QS++ ) { + input_QS[ 0 ] = silk_LSHIFT32( (opus_int32)input[ n ], QS ); + } + vst1q_s32( input_QS, vdupq_n_s32( 0 ) ); + input_QS += 4; + vst1q_s32( input_QS, vdupq_n_s32( 0 ) ); + input_QS += 4; + vst1q_s32( input_QS, vdupq_n_s32( 0 ) ); + input_QS += 4; + vst1q_s32( input_QS, vdupq_n_s32( 0 ) ); + input_QS += 4; + vst1q_s32( input_QS, vdupq_n_s32( 0 ) ); + input_QS += 4; + vst1q_s32( input_QS, vdupq_n_s32( 0 ) ); + input_QS = input_QST + MAX_SHAPE_LPC_ORDER - orderT; + + /* The following loop runs ( length + order ) times, with ( order ) extra epilogues. */ + /* The zero paddings in input_QS guarantee corr_QC's correctness even with the extra epilogues. */ + /* The values of state_QS will be polluted by the extra epilogues, however they are temporary values. */ + + /* Keep the C code here to help understand the intrinsics optimization. */ + /* + { + opus_int32 state_QS[ 2 ][ MAX_SHAPE_LPC_ORDER + 1 ] = { 0 }; + opus_int32 *state_QST[ 3 ]; + state_QST[ 0 ] = state_QS[ 0 ]; + state_QST[ 1 ] = state_QS[ 1 ]; + for( n = 0; n < length + order; n++, input_QS++ ) { + state_QST[ 0 ][ orderT ] = input_QS[ orderT ]; + for( i = 0; i < orderT; i++ ) { + corr_QC[ i ] += silk_RSHIFT64( silk_SMULL( state_QST[ 0 ][ i ], input_QS[ i ] ), 2 * QS - QC ); + state_QST[ 1 ][ i ] = silk_SMLAWB( state_QST[ 1 ][ i + 1 ], state_QST[ 0 ][ i ] - state_QST[ 0 ][ i + 1 ], warping_Q16 ); + } + state_QST[ 2 ] = state_QST[ 0 ]; + state_QST[ 0 ] = state_QST[ 1 ]; + state_QST[ 1 ] = state_QST[ 2 ]; + } + } + */ + + { + const int32x4_t warping_Q16_s32x4 = vdupq_n_s32( warping_Q16 << 15 ); + const opus_int32 *in = input_QS + orderT; + opus_int o = orderT; + int32x4_t state_QS_s32x4[ 3 ][ 2 ]; + + ALLOC( state, length + orderT, opus_int32 ); + state_QS_s32x4[ 2 ][ 1 ] = vdupq_n_s32( 0 ); + + /* Calculate 8 taps of all inputs in each loop. */ + do { + state_QS_s32x4[ 0 ][ 0 ] = state_QS_s32x4[ 0 ][ 1 ] = + state_QS_s32x4[ 1 ][ 0 ] = state_QS_s32x4[ 1 ][ 1 ] = vdupq_n_s32( 0 ); + n = 0; + do { + calc_corr( input_QS + n, corr_QC, o - 8, state_QS_s32x4[ 0 ][ 0 ] ); + calc_corr( input_QS + n, corr_QC, o - 4, state_QS_s32x4[ 0 ][ 1 ] ); + state_QS_s32x4[ 2 ][ 1 ] = vld1q_s32( in + n ); + vst1q_lane_s32( state + n, state_QS_s32x4[ 0 ][ 0 ], 0 ); + state_QS_s32x4[ 2 ][ 0 ] = vextq_s32( state_QS_s32x4[ 0 ][ 0 ], state_QS_s32x4[ 0 ][ 1 ], 1 ); + state_QS_s32x4[ 2 ][ 1 ] = vextq_s32( state_QS_s32x4[ 0 ][ 1 ], state_QS_s32x4[ 2 ][ 1 ], 1 ); + state_QS_s32x4[ 0 ][ 0 ] = calc_state( state_QS_s32x4[ 0 ][ 0 ], state_QS_s32x4[ 2 ][ 0 ], state_QS_s32x4[ 1 ][ 0 ], warping_Q16_s32x4 ); + state_QS_s32x4[ 0 ][ 1 ] = calc_state( state_QS_s32x4[ 0 ][ 1 ], state_QS_s32x4[ 2 ][ 1 ], state_QS_s32x4[ 1 ][ 1 ], warping_Q16_s32x4 ); + state_QS_s32x4[ 1 ][ 0 ] = state_QS_s32x4[ 2 ][ 0 ]; + state_QS_s32x4[ 1 ][ 1 ] = state_QS_s32x4[ 2 ][ 1 ]; + } while( ++n < ( length + order ) ); + in = state; + o -= 8; + } while( o > 4 ); + + if( o ) { + /* Calculate the last 4 taps of all inputs. */ + opus_int32 *stateT = state; + silk_assert( o == 4 ); + state_QS_s32x4[ 0 ][ 0 ] = state_QS_s32x4[ 1 ][ 0 ] = vdupq_n_s32( 0 ); + n = length + order; + do { + calc_corr( input_QS, corr_QC, 0, state_QS_s32x4[ 0 ][ 0 ] ); + state_QS_s32x4[ 2 ][ 0 ] = vld1q_s32( stateT ); + vst1q_lane_s32( stateT, state_QS_s32x4[ 0 ][ 0 ], 0 ); + state_QS_s32x4[ 2 ][ 0 ] = vextq_s32( state_QS_s32x4[ 0 ][ 0 ], state_QS_s32x4[ 2 ][ 0 ], 1 ); + state_QS_s32x4[ 0 ][ 0 ] = calc_state( state_QS_s32x4[ 0 ][ 0 ], state_QS_s32x4[ 2 ][ 0 ], state_QS_s32x4[ 1 ][ 0 ], warping_Q16_s32x4 ); + state_QS_s32x4[ 1 ][ 0 ] = state_QS_s32x4[ 2 ][ 0 ]; + input_QS++; + stateT++; + } while( --n ); + } + } + + { + const opus_int16 *inputT = input; + int32x4_t t_s32x4; + int64x1_t t_s64x1; + int64x2_t t_s64x2 = vdupq_n_s64( 0 ); + for( n = 0; n <= length - 8; n += 8 ) { + int16x8_t input_s16x8 = vld1q_s16( inputT ); + t_s32x4 = vmull_s16( vget_low_s16( input_s16x8 ), vget_low_s16( input_s16x8 ) ); + t_s32x4 = vmlal_s16( t_s32x4, vget_high_s16( input_s16x8 ), vget_high_s16( input_s16x8 ) ); + t_s64x2 = vaddw_s32( t_s64x2, vget_low_s32( t_s32x4 ) ); + t_s64x2 = vaddw_s32( t_s64x2, vget_high_s32( t_s32x4 ) ); + inputT += 8; + } + t_s64x1 = vadd_s64( vget_low_s64( t_s64x2 ), vget_high_s64( t_s64x2 ) ); + corr_QC_orderT = vget_lane_s64( t_s64x1, 0 ); + for( ; n < length; n++ ) { + corr_QC_orderT += silk_SMULL( input[ n ], input[ n ] ); + } + corr_QC_orderT = silk_LSHIFT64( corr_QC_orderT, QC ); + corr_QC[ orderT ] = corr_QC_orderT; + } + + corr_QCT = corr_QC + orderT - order; + lsh = silk_CLZ64( corr_QC_orderT ) - 35; + lsh = silk_LIMIT( lsh, -12 - QC, 30 - QC ); + *scale = -( QC + lsh ); + silk_assert( *scale >= -30 && *scale <= 12 ); + lsh_s64x2 = vdupq_n_s64( lsh ); + for( i = 0; i <= order - 3; i += 4 ) { + int32x4_t corr_s32x4; + int64x2_t corr_QC0_s64x2, corr_QC1_s64x2; + corr_QC0_s64x2 = vld1q_s64( corr_QCT + i ); + corr_QC1_s64x2 = vld1q_s64( corr_QCT + i + 2 ); + corr_QC0_s64x2 = vshlq_s64( corr_QC0_s64x2, lsh_s64x2 ); + corr_QC1_s64x2 = vshlq_s64( corr_QC1_s64x2, lsh_s64x2 ); + corr_s32x4 = vcombine_s32( vmovn_s64( corr_QC1_s64x2 ), vmovn_s64( corr_QC0_s64x2 ) ); + corr_s32x4 = vrev64q_s32( corr_s32x4 ); + vst1q_s32( corr + order - i - 3, corr_s32x4 ); + } + if( lsh >= 0 ) { + for( ; i < order + 1; i++ ) { + corr[ order - i ] = (opus_int32)silk_CHECK_FIT32( silk_LSHIFT64( corr_QCT[ i ], lsh ) ); + } + } else { + for( ; i < order + 1; i++ ) { + corr[ order - i ] = (opus_int32)silk_CHECK_FIT32( silk_RSHIFT64( corr_QCT[ i ], -lsh ) ); + } + } + silk_assert( corr_QCT[ order ] >= 0 ); /* If breaking, decrease QC*/ + RESTORE_STACK; + } + +#ifdef OPUS_CHECK_ASM + { + opus_int32 corr_c[ MAX_SHAPE_LPC_ORDER + 1 ]; + opus_int scale_c; + silk_warped_autocorrelation_FIX_c( corr_c, &scale_c, input, warping_Q16, length, order ); + silk_assert( !memcmp( corr_c, corr, sizeof( corr_c[ 0 ] ) * ( order + 1 ) ) ); + silk_assert( scale_c == *scale ); + } +#endif +} diff --git a/thirdparty/opus/silk/fixed/burg_modified_FIX.c b/thirdparty/opus/silk/fixed/burg_modified_FIX.c index 17d0e0993c..274d4b28e1 100644 --- a/thirdparty/opus/silk/fixed/burg_modified_FIX.c +++ b/thirdparty/opus/silk/fixed/burg_modified_FIX.c @@ -37,7 +37,7 @@ POSSIBILITY OF SUCH DAMAGE. #define MAX_FRAME_SIZE 384 /* subfr_length * nb_subfr = ( 0.005 * 16000 + 16 ) * 4 = 384 */ #define QA 25 -#define N_BITS_HEAD_ROOM 2 +#define N_BITS_HEAD_ROOM 3 #define MIN_RSHIFTS -16 #define MAX_RSHIFTS (32 - QA) @@ -65,7 +65,7 @@ void silk_burg_modified_c( opus_int32 xcorr[ SILK_MAX_ORDER_LPC ]; opus_int64 C0_64; - silk_assert( subfr_length * nb_subfr <= MAX_FRAME_SIZE ); + celt_assert( subfr_length * nb_subfr <= MAX_FRAME_SIZE ); /* Compute autocorrelations, added over subframes */ C0_64 = silk_inner_prod16_aligned_64( x, x, subfr_length*nb_subfr, arch ); diff --git a/thirdparty/opus/silk/fixed/corrMatrix_FIX.c b/thirdparty/opus/silk/fixed/corrMatrix_FIX.c index c1d437c785..1b4a29c232 100644 --- a/thirdparty/opus/silk/fixed/corrMatrix_FIX.c +++ b/thirdparty/opus/silk/fixed/corrMatrix_FIX.c @@ -58,7 +58,7 @@ void silk_corrVector_FIX( for( lag = 0; lag < order; lag++ ) { inner_prod = 0; for( i = 0; i < L; i++ ) { - inner_prod += silk_RSHIFT32( silk_SMULBB( ptr1[ i ], ptr2[i] ), rshifts ); + inner_prod = silk_ADD_RSHIFT32( inner_prod, silk_SMULBB( ptr1[ i ], ptr2[i] ), rshifts ); } Xt[ lag ] = inner_prod; /* X[:,lag]'*t */ ptr1--; /* Go to next column of X */ @@ -77,61 +77,54 @@ void silk_corrMatrix_FIX( const opus_int16 *x, /* I x vector [L + order - 1] used to form data matrix X */ const opus_int L, /* I Length of vectors */ const opus_int order, /* I Max lag for correlation */ - const opus_int head_room, /* I Desired headroom */ opus_int32 *XX, /* O Pointer to X'*X correlation matrix [ order x order ] */ - opus_int *rshifts, /* I/O Right shifts of correlations */ + opus_int32 *nrg, /* O Energy of x vector */ + opus_int *rshifts, /* O Right shifts of correlations and energy */ int arch /* I Run-time architecture */ ) { - opus_int i, j, lag, rshifts_local, head_room_rshifts; + opus_int i, j, lag; opus_int32 energy; const opus_int16 *ptr1, *ptr2; /* Calculate energy to find shift used to fit in 32 bits */ - silk_sum_sqr_shift( &energy, &rshifts_local, x, L + order - 1 ); - /* Add shifts to get the desired head room */ - head_room_rshifts = silk_max( head_room - silk_CLZ32( energy ), 0 ); - - energy = silk_RSHIFT32( energy, head_room_rshifts ); - rshifts_local += head_room_rshifts; + silk_sum_sqr_shift( nrg, rshifts, x, L + order - 1 ); + energy = *nrg; /* Calculate energy of first column (0) of X: X[:,0]'*X[:,0] */ /* Remove contribution of first order - 1 samples */ for( i = 0; i < order - 1; i++ ) { - energy -= silk_RSHIFT32( silk_SMULBB( x[ i ], x[ i ] ), rshifts_local ); - } - if( rshifts_local < *rshifts ) { - /* Adjust energy */ - energy = silk_RSHIFT32( energy, *rshifts - rshifts_local ); - rshifts_local = *rshifts; + energy -= silk_RSHIFT32( silk_SMULBB( x[ i ], x[ i ] ), *rshifts ); } /* Calculate energy of remaining columns of X: X[:,j]'*X[:,j] */ /* Fill out the diagonal of the correlation matrix */ matrix_ptr( XX, 0, 0, order ) = energy; + silk_assert( energy >= 0 ); ptr1 = &x[ order - 1 ]; /* First sample of column 0 of X */ for( j = 1; j < order; j++ ) { - energy = silk_SUB32( energy, silk_RSHIFT32( silk_SMULBB( ptr1[ L - j ], ptr1[ L - j ] ), rshifts_local ) ); - energy = silk_ADD32( energy, silk_RSHIFT32( silk_SMULBB( ptr1[ -j ], ptr1[ -j ] ), rshifts_local ) ); + energy = silk_SUB32( energy, silk_RSHIFT32( silk_SMULBB( ptr1[ L - j ], ptr1[ L - j ] ), *rshifts ) ); + energy = silk_ADD32( energy, silk_RSHIFT32( silk_SMULBB( ptr1[ -j ], ptr1[ -j ] ), *rshifts ) ); matrix_ptr( XX, j, j, order ) = energy; + silk_assert( energy >= 0 ); } ptr2 = &x[ order - 2 ]; /* First sample of column 1 of X */ /* Calculate the remaining elements of the correlation matrix */ - if( rshifts_local > 0 ) { + if( *rshifts > 0 ) { /* Right shifting used */ for( lag = 1; lag < order; lag++ ) { /* Inner product of column 0 and column lag: X[:,0]'*X[:,lag] */ energy = 0; for( i = 0; i < L; i++ ) { - energy += silk_RSHIFT32( silk_SMULBB( ptr1[ i ], ptr2[i] ), rshifts_local ); + energy += silk_RSHIFT32( silk_SMULBB( ptr1[ i ], ptr2[i] ), *rshifts ); } /* Calculate remaining off diagonal: X[:,j]'*X[:,j + lag] */ matrix_ptr( XX, lag, 0, order ) = energy; matrix_ptr( XX, 0, lag, order ) = energy; for( j = 1; j < ( order - lag ); j++ ) { - energy = silk_SUB32( energy, silk_RSHIFT32( silk_SMULBB( ptr1[ L - j ], ptr2[ L - j ] ), rshifts_local ) ); - energy = silk_ADD32( energy, silk_RSHIFT32( silk_SMULBB( ptr1[ -j ], ptr2[ -j ] ), rshifts_local ) ); + energy = silk_SUB32( energy, silk_RSHIFT32( silk_SMULBB( ptr1[ L - j ], ptr2[ L - j ] ), *rshifts ) ); + energy = silk_ADD32( energy, silk_RSHIFT32( silk_SMULBB( ptr1[ -j ], ptr2[ -j ] ), *rshifts ) ); matrix_ptr( XX, lag + j, j, order ) = energy; matrix_ptr( XX, j, lag + j, order ) = energy; } @@ -153,6 +146,5 @@ void silk_corrMatrix_FIX( ptr2--;/* Update pointer to first sample of next column (lag) in X */ } } - *rshifts = rshifts_local; } diff --git a/thirdparty/opus/silk/fixed/encode_frame_FIX.c b/thirdparty/opus/silk/fixed/encode_frame_FIX.c index 5ef44b03fc..a02bf87dbb 100644 --- a/thirdparty/opus/silk/fixed/encode_frame_FIX.c +++ b/thirdparty/opus/silk/fixed/encode_frame_FIX.c @@ -29,6 +29,7 @@ POSSIBILITY OF SUCH DAMAGE. #include "config.h" #endif +#include <stdlib.h> #include "main_FIX.h" #include "stack_alloc.h" #include "tuning_parameters.h" @@ -37,26 +38,33 @@ POSSIBILITY OF SUCH DAMAGE. static OPUS_INLINE void silk_LBRR_encode_FIX( silk_encoder_state_FIX *psEnc, /* I/O Pointer to Silk FIX encoder state */ silk_encoder_control_FIX *psEncCtrl, /* I/O Pointer to Silk FIX encoder control struct */ - const opus_int32 xfw_Q3[], /* I Input signal */ + const opus_int16 x16[], /* I Input signal */ opus_int condCoding /* I The type of conditional coding used so far for this frame */ ); void silk_encode_do_VAD_FIX( - silk_encoder_state_FIX *psEnc /* I/O Pointer to Silk FIX encoder state */ + silk_encoder_state_FIX *psEnc, /* I/O Pointer to Silk FIX encoder state */ + opus_int activity /* I Decision of Opus voice activity detector */ ) { + const opus_int activity_threshold = SILK_FIX_CONST( SPEECH_ACTIVITY_DTX_THRES, 8 ); + /****************************/ /* Voice Activity Detection */ /****************************/ silk_VAD_GetSA_Q8( &psEnc->sCmn, psEnc->sCmn.inputBuf + 1, psEnc->sCmn.arch ); + /* If Opus VAD is inactive and Silk VAD is active: lower Silk VAD to just under the threshold */ + if( activity == VAD_NO_ACTIVITY && psEnc->sCmn.speech_activity_Q8 >= activity_threshold ) { + psEnc->sCmn.speech_activity_Q8 = activity_threshold - 1; + } /**************************************************/ /* Convert speech activity into VAD and DTX flags */ /**************************************************/ - if( psEnc->sCmn.speech_activity_Q8 < SILK_FIX_CONST( SPEECH_ACTIVITY_DTX_THRES, 8 ) ) { + if( psEnc->sCmn.speech_activity_Q8 < activity_threshold ) { psEnc->sCmn.indices.signalType = TYPE_NO_VOICE_ACTIVITY; psEnc->sCmn.noSpeechCounter++; - if( psEnc->sCmn.noSpeechCounter < NB_SPEECH_FRAMES_BEFORE_DTX ) { + if( psEnc->sCmn.noSpeechCounter <= NB_SPEECH_FRAMES_BEFORE_DTX ) { psEnc->sCmn.inDTX = 0; } else if( psEnc->sCmn.noSpeechCounter > MAX_CONSECUTIVE_DTX + NB_SPEECH_FRAMES_BEFORE_DTX ) { psEnc->sCmn.noSpeechCounter = NB_SPEECH_FRAMES_BEFORE_DTX; @@ -94,6 +102,9 @@ opus_int silk_encode_frame_FIX( opus_int16 ec_prevLagIndex_copy; opus_int ec_prevSignalType_copy; opus_int8 LastGainIndex_copy2; + opus_int gain_lock[ MAX_NB_SUBFR ] = {0}; + opus_int16 best_gain_mult[ MAX_NB_SUBFR ]; + opus_int best_sum[ MAX_NB_SUBFR ]; SAVE_STACK; /* This is totally unnecessary but many compilers (including gcc) are too dumb to realise it */ @@ -118,7 +129,6 @@ opus_int silk_encode_frame_FIX( silk_memcpy( x_frame + LA_SHAPE_MS * psEnc->sCmn.fs_kHz, psEnc->sCmn.inputBuf + 1, psEnc->sCmn.frame_length * sizeof( opus_int16 ) ); if( !psEnc->sCmn.prefillFlag ) { - VARDECL( opus_int32, xfw_Q3 ); VARDECL( opus_int16, res_pitch ); VARDECL( opus_uint8, ec_buf_copy ); opus_int16 *res_pitch_frame; @@ -132,7 +142,7 @@ opus_int silk_encode_frame_FIX( /*****************************************/ /* Find pitch lags, initial LPC analysis */ /*****************************************/ - silk_find_pitch_lags_FIX( psEnc, &sEncCtrl, res_pitch, x_frame, psEnc->sCmn.arch ); + silk_find_pitch_lags_FIX( psEnc, &sEncCtrl, res_pitch, x_frame - psEnc->sCmn.ltp_mem_length, psEnc->sCmn.arch ); /************************/ /* Noise shape analysis */ @@ -142,23 +152,17 @@ opus_int silk_encode_frame_FIX( /***************************************************/ /* Find linear prediction coefficients (LPC + LTP) */ /***************************************************/ - silk_find_pred_coefs_FIX( psEnc, &sEncCtrl, res_pitch, x_frame, condCoding ); + silk_find_pred_coefs_FIX( psEnc, &sEncCtrl, res_pitch_frame, x_frame, condCoding ); /****************************************/ /* Process gains */ /****************************************/ silk_process_gains_FIX( psEnc, &sEncCtrl, condCoding ); - /*****************************************/ - /* Prefiltering for noise shaper */ - /*****************************************/ - ALLOC( xfw_Q3, psEnc->sCmn.frame_length, opus_int32 ); - silk_prefilter_FIX( psEnc, &sEncCtrl, xfw_Q3, x_frame ); - /****************************************/ /* Low Bitrate Redundant Encoding */ /****************************************/ - silk_LBRR_encode_FIX( psEnc, &sEncCtrl, xfw_Q3, condCoding ); + silk_LBRR_encode_FIX( psEnc, &sEncCtrl, x_frame, condCoding ); /* Loop over quantizer and entropy coding to control bitrate */ maxIter = 6; @@ -194,17 +198,21 @@ opus_int silk_encode_frame_FIX( /* Noise shaping quantization */ /*****************************************/ if( psEnc->sCmn.nStatesDelayedDecision > 1 || psEnc->sCmn.warping_Q16 > 0 ) { - silk_NSQ_del_dec( &psEnc->sCmn, &psEnc->sCmn.sNSQ, &psEnc->sCmn.indices, xfw_Q3, psEnc->sCmn.pulses, - sEncCtrl.PredCoef_Q12[ 0 ], sEncCtrl.LTPCoef_Q14, sEncCtrl.AR2_Q13, sEncCtrl.HarmShapeGain_Q14, + silk_NSQ_del_dec( &psEnc->sCmn, &psEnc->sCmn.sNSQ, &psEnc->sCmn.indices, x_frame, psEnc->sCmn.pulses, + sEncCtrl.PredCoef_Q12[ 0 ], sEncCtrl.LTPCoef_Q14, sEncCtrl.AR_Q13, sEncCtrl.HarmShapeGain_Q14, sEncCtrl.Tilt_Q14, sEncCtrl.LF_shp_Q14, sEncCtrl.Gains_Q16, sEncCtrl.pitchL, sEncCtrl.Lambda_Q10, sEncCtrl.LTP_scale_Q14, psEnc->sCmn.arch ); } else { - silk_NSQ( &psEnc->sCmn, &psEnc->sCmn.sNSQ, &psEnc->sCmn.indices, xfw_Q3, psEnc->sCmn.pulses, - sEncCtrl.PredCoef_Q12[ 0 ], sEncCtrl.LTPCoef_Q14, sEncCtrl.AR2_Q13, sEncCtrl.HarmShapeGain_Q14, + silk_NSQ( &psEnc->sCmn, &psEnc->sCmn.sNSQ, &psEnc->sCmn.indices, x_frame, psEnc->sCmn.pulses, + sEncCtrl.PredCoef_Q12[ 0 ], sEncCtrl.LTPCoef_Q14, sEncCtrl.AR_Q13, sEncCtrl.HarmShapeGain_Q14, sEncCtrl.Tilt_Q14, sEncCtrl.LF_shp_Q14, sEncCtrl.Gains_Q16, sEncCtrl.pitchL, sEncCtrl.Lambda_Q10, sEncCtrl.LTP_scale_Q14, psEnc->sCmn.arch); } + if ( iter == maxIter && !found_lower ) { + silk_memcpy( &sRangeEnc_copy2, psRangeEnc, sizeof( ec_enc ) ); + } + /****************************************/ /* Encode Parameters */ /****************************************/ @@ -218,6 +226,33 @@ opus_int silk_encode_frame_FIX( nBits = ec_tell( psRangeEnc ); + /* If we still bust after the last iteration, do some damage control. */ + if ( iter == maxIter && !found_lower && nBits > maxBits ) { + silk_memcpy( psRangeEnc, &sRangeEnc_copy2, sizeof( ec_enc ) ); + + /* Keep gains the same as the last frame. */ + psEnc->sShape.LastGainIndex = sEncCtrl.lastGainIndexPrev; + for ( i = 0; i < psEnc->sCmn.nb_subfr; i++ ) { + psEnc->sCmn.indices.GainsIndices[ i ] = 4; + } + if (condCoding != CODE_CONDITIONALLY) { + psEnc->sCmn.indices.GainsIndices[ 0 ] = sEncCtrl.lastGainIndexPrev; + } + psEnc->sCmn.ec_prevLagIndex = ec_prevLagIndex_copy; + psEnc->sCmn.ec_prevSignalType = ec_prevSignalType_copy; + /* Clear all pulses. */ + for ( i = 0; i < psEnc->sCmn.frame_length; i++ ) { + psEnc->sCmn.pulses[ i ] = 0; + } + + silk_encode_indices( &psEnc->sCmn, psRangeEnc, psEnc->sCmn.nFramesEncoded, 0, condCoding ); + + silk_encode_pulses( psRangeEnc, psEnc->sCmn.indices.signalType, psEnc->sCmn.indices.quantOffsetType, + psEnc->sCmn.pulses, psEnc->sCmn.frame_length ); + + nBits = ec_tell( psRangeEnc ); + } + if( useCBR == 0 && iter == 0 && nBits <= maxBits ) { break; } @@ -227,7 +262,7 @@ opus_int silk_encode_frame_FIX( if( found_lower && ( gainsID == gainsID_lower || nBits > maxBits ) ) { /* Restore output state from earlier iteration that did meet the bitrate budget */ silk_memcpy( psRangeEnc, &sRangeEnc_copy2, sizeof( ec_enc ) ); - silk_assert( sRangeEnc_copy2.offs <= 1275 ); + celt_assert( sRangeEnc_copy2.offs <= 1275 ); silk_memcpy( psRangeEnc->buf, ec_buf_copy, sRangeEnc_copy2.offs ); silk_memcpy( &psEnc->sCmn.sNSQ, &sNSQ_copy2, sizeof( silk_nsq_state ) ); psEnc->sShape.LastGainIndex = LastGainIndex_copy2; @@ -255,7 +290,7 @@ opus_int silk_encode_frame_FIX( gainsID_lower = gainsID; /* Copy part of the output state */ silk_memcpy( &sRangeEnc_copy2, psRangeEnc, sizeof( ec_enc ) ); - silk_assert( psRangeEnc->offs <= 1275 ); + celt_assert( psRangeEnc->offs <= 1275 ); silk_memcpy( ec_buf_copy, psRangeEnc->buf, psRangeEnc->offs ); silk_memcpy( &sNSQ_copy2, &psEnc->sCmn.sNSQ, sizeof( silk_nsq_state ) ); LastGainIndex_copy2 = psEnc->sShape.LastGainIndex; @@ -265,15 +300,35 @@ opus_int silk_encode_frame_FIX( break; } + if ( !found_lower && nBits > maxBits ) { + int j; + for ( i = 0; i < psEnc->sCmn.nb_subfr; i++ ) { + int sum=0; + for ( j = i*psEnc->sCmn.subfr_length; j < (i+1)*psEnc->sCmn.subfr_length; j++ ) { + sum += abs( psEnc->sCmn.pulses[j] ); + } + if ( iter == 0 || (sum < best_sum[i] && !gain_lock[i]) ) { + best_sum[i] = sum; + best_gain_mult[i] = gainMult_Q8; + } else { + gain_lock[i] = 1; + } + } + } if( ( found_lower & found_upper ) == 0 ) { /* Adjust gain according to high-rate rate/distortion curve */ - opus_int32 gain_factor_Q16; - gain_factor_Q16 = silk_log2lin( silk_LSHIFT( nBits - maxBits, 7 ) / psEnc->sCmn.frame_length + SILK_FIX_CONST( 16, 7 ) ); - gain_factor_Q16 = silk_min_32( gain_factor_Q16, SILK_FIX_CONST( 2, 16 ) ); if( nBits > maxBits ) { - gain_factor_Q16 = silk_max_32( gain_factor_Q16, SILK_FIX_CONST( 1.3, 16 ) ); + if (gainMult_Q8 < 16384) { + gainMult_Q8 *= 2; + } else { + gainMult_Q8 = 32767; + } + } else { + opus_int32 gain_factor_Q16; + gain_factor_Q16 = silk_log2lin( silk_LSHIFT( nBits - maxBits, 7 ) / psEnc->sCmn.frame_length + SILK_FIX_CONST( 16, 7 ) ); + gainMult_Q8 = silk_SMULWB( gain_factor_Q16, gainMult_Q8 ); } - gainMult_Q8 = silk_SMULWB( gain_factor_Q16, gainMult_Q8 ); + } else { /* Adjust gain by interpolating */ gainMult_Q8 = gainMult_lower + silk_DIV32_16( silk_MUL( gainMult_upper - gainMult_lower, maxBits - nBits_lower ), nBits_upper - nBits_lower ); @@ -287,7 +342,13 @@ opus_int silk_encode_frame_FIX( } for( i = 0; i < psEnc->sCmn.nb_subfr; i++ ) { - sEncCtrl.Gains_Q16[ i ] = silk_LSHIFT_SAT32( silk_SMULWB( sEncCtrl.GainsUnq_Q16[ i ], gainMult_Q8 ), 8 ); + opus_int16 tmp; + if ( gain_lock[i] ) { + tmp = best_gain_mult[i]; + } else { + tmp = gainMult_Q8; + } + sEncCtrl.Gains_Q16[ i ] = silk_LSHIFT_SAT32( silk_SMULWB( sEncCtrl.GainsUnq_Q16[ i ], tmp ), 8 ); } /* Quantize gains */ @@ -331,7 +392,7 @@ opus_int silk_encode_frame_FIX( static OPUS_INLINE void silk_LBRR_encode_FIX( silk_encoder_state_FIX *psEnc, /* I/O Pointer to Silk FIX encoder state */ silk_encoder_control_FIX *psEncCtrl, /* I/O Pointer to Silk FIX encoder control struct */ - const opus_int32 xfw_Q3[], /* I Input signal */ + const opus_int16 x16[], /* I Input signal */ opus_int condCoding /* I The type of conditional coding used so far for this frame */ ) { @@ -370,14 +431,14 @@ static OPUS_INLINE void silk_LBRR_encode_FIX( /* Noise shaping quantization */ /*****************************************/ if( psEnc->sCmn.nStatesDelayedDecision > 1 || psEnc->sCmn.warping_Q16 > 0 ) { - silk_NSQ_del_dec( &psEnc->sCmn, &sNSQ_LBRR, psIndices_LBRR, xfw_Q3, + silk_NSQ_del_dec( &psEnc->sCmn, &sNSQ_LBRR, psIndices_LBRR, x16, psEnc->sCmn.pulses_LBRR[ psEnc->sCmn.nFramesEncoded ], psEncCtrl->PredCoef_Q12[ 0 ], psEncCtrl->LTPCoef_Q14, - psEncCtrl->AR2_Q13, psEncCtrl->HarmShapeGain_Q14, psEncCtrl->Tilt_Q14, psEncCtrl->LF_shp_Q14, + psEncCtrl->AR_Q13, psEncCtrl->HarmShapeGain_Q14, psEncCtrl->Tilt_Q14, psEncCtrl->LF_shp_Q14, psEncCtrl->Gains_Q16, psEncCtrl->pitchL, psEncCtrl->Lambda_Q10, psEncCtrl->LTP_scale_Q14, psEnc->sCmn.arch ); } else { - silk_NSQ( &psEnc->sCmn, &sNSQ_LBRR, psIndices_LBRR, xfw_Q3, + silk_NSQ( &psEnc->sCmn, &sNSQ_LBRR, psIndices_LBRR, x16, psEnc->sCmn.pulses_LBRR[ psEnc->sCmn.nFramesEncoded ], psEncCtrl->PredCoef_Q12[ 0 ], psEncCtrl->LTPCoef_Q14, - psEncCtrl->AR2_Q13, psEncCtrl->HarmShapeGain_Q14, psEncCtrl->Tilt_Q14, psEncCtrl->LF_shp_Q14, + psEncCtrl->AR_Q13, psEncCtrl->HarmShapeGain_Q14, psEncCtrl->Tilt_Q14, psEncCtrl->LF_shp_Q14, psEncCtrl->Gains_Q16, psEncCtrl->pitchL, psEncCtrl->Lambda_Q10, psEncCtrl->LTP_scale_Q14, psEnc->sCmn.arch ); } diff --git a/thirdparty/opus/silk/fixed/find_LPC_FIX.c b/thirdparty/opus/silk/fixed/find_LPC_FIX.c index e11cdc86e6..c762a0f2a2 100644 --- a/thirdparty/opus/silk/fixed/find_LPC_FIX.c +++ b/thirdparty/opus/silk/fixed/find_LPC_FIX.c @@ -92,7 +92,7 @@ void silk_find_LPC_FIX( silk_interpolate( NLSF0_Q15, psEncC->prev_NLSFq_Q15, NLSF_Q15, k, psEncC->predictLPCOrder ); /* Convert to LPC for residual energy evaluation */ - silk_NLSF2A( a_tmp_Q12, NLSF0_Q15, psEncC->predictLPCOrder ); + silk_NLSF2A( a_tmp_Q12, NLSF0_Q15, psEncC->predictLPCOrder, psEncC->arch ); /* Calculate residual energy with NLSF interpolation */ silk_LPC_analysis_filter( LPC_res, x, a_tmp_Q12, 2 * subfr_length, psEncC->predictLPCOrder, psEncC->arch ); @@ -146,6 +146,6 @@ void silk_find_LPC_FIX( silk_A2NLSF( NLSF_Q15, a_Q16, psEncC->predictLPCOrder ); } - silk_assert( psEncC->indices.NLSFInterpCoef_Q2 == 4 || ( psEncC->useInterpolatedNLSFs && !psEncC->first_frame_after_reset && psEncC->nb_subfr == MAX_NB_SUBFR ) ); + celt_assert( psEncC->indices.NLSFInterpCoef_Q2 == 4 || ( psEncC->useInterpolatedNLSFs && !psEncC->first_frame_after_reset && psEncC->nb_subfr == MAX_NB_SUBFR ) ); RESTORE_STACK; } diff --git a/thirdparty/opus/silk/fixed/find_LTP_FIX.c b/thirdparty/opus/silk/fixed/find_LTP_FIX.c index 1314a28137..62d4afb250 100644 --- a/thirdparty/opus/silk/fixed/find_LTP_FIX.c +++ b/thirdparty/opus/silk/fixed/find_LTP_FIX.c @@ -32,214 +32,68 @@ POSSIBILITY OF SUCH DAMAGE. #include "main_FIX.h" #include "tuning_parameters.h" -/* Head room for correlations */ -#define LTP_CORRS_HEAD_ROOM 2 - -void silk_fit_LTP( - opus_int32 LTP_coefs_Q16[ LTP_ORDER ], - opus_int16 LTP_coefs_Q14[ LTP_ORDER ] -); - void silk_find_LTP_FIX( - opus_int16 b_Q14[ MAX_NB_SUBFR * LTP_ORDER ], /* O LTP coefs */ - opus_int32 WLTP[ MAX_NB_SUBFR * LTP_ORDER * LTP_ORDER ], /* O Weight for LTP quantization */ - opus_int *LTPredCodGain_Q7, /* O LTP coding gain */ - const opus_int16 r_lpc[], /* I residual signal after LPC signal + state for first 10 ms */ + opus_int32 XXLTP_Q17[ MAX_NB_SUBFR * LTP_ORDER * LTP_ORDER ], /* O Correlation matrix */ + opus_int32 xXLTP_Q17[ MAX_NB_SUBFR * LTP_ORDER ], /* O Correlation vector */ + const opus_int16 r_ptr[], /* I Residual signal after LPC */ const opus_int lag[ MAX_NB_SUBFR ], /* I LTP lags */ - const opus_int32 Wght_Q15[ MAX_NB_SUBFR ], /* I weights */ - const opus_int subfr_length, /* I subframe length */ - const opus_int nb_subfr, /* I number of subframes */ - const opus_int mem_offset, /* I number of samples in LTP memory */ - opus_int corr_rshifts[ MAX_NB_SUBFR ], /* O right shifts applied to correlations */ + const opus_int subfr_length, /* I Subframe length */ + const opus_int nb_subfr, /* I Number of subframes */ int arch /* I Run-time architecture */ ) { - opus_int i, k, lshift; - const opus_int16 *r_ptr, *lag_ptr; - opus_int16 *b_Q14_ptr; - - opus_int32 regu; - opus_int32 *WLTP_ptr; - opus_int32 b_Q16[ LTP_ORDER ], delta_b_Q14[ LTP_ORDER ], d_Q14[ MAX_NB_SUBFR ], nrg[ MAX_NB_SUBFR ], g_Q26; - opus_int32 w[ MAX_NB_SUBFR ], WLTP_max, max_abs_d_Q14, max_w_bits; - - opus_int32 temp32, denom32; - opus_int extra_shifts; - opus_int rr_shifts, maxRshifts, maxRshifts_wxtra, LZs; - opus_int32 LPC_res_nrg, LPC_LTP_res_nrg, div_Q16; - opus_int32 Rr[ LTP_ORDER ], rr[ MAX_NB_SUBFR ]; - opus_int32 wd, m_Q12; - - b_Q14_ptr = b_Q14; - WLTP_ptr = WLTP; - r_ptr = &r_lpc[ mem_offset ]; + opus_int i, k, extra_shifts; + opus_int xx_shifts, xX_shifts, XX_shifts; + const opus_int16 *lag_ptr; + opus_int32 *XXLTP_Q17_ptr, *xXLTP_Q17_ptr; + opus_int32 xx, nrg, temp; + + xXLTP_Q17_ptr = xXLTP_Q17; + XXLTP_Q17_ptr = XXLTP_Q17; for( k = 0; k < nb_subfr; k++ ) { lag_ptr = r_ptr - ( lag[ k ] + LTP_ORDER / 2 ); - silk_sum_sqr_shift( &rr[ k ], &rr_shifts, r_ptr, subfr_length ); /* rr[ k ] in Q( -rr_shifts ) */ - - /* Assure headroom */ - LZs = silk_CLZ32( rr[k] ); - if( LZs < LTP_CORRS_HEAD_ROOM ) { - rr[ k ] = silk_RSHIFT_ROUND( rr[ k ], LTP_CORRS_HEAD_ROOM - LZs ); - rr_shifts += ( LTP_CORRS_HEAD_ROOM - LZs ); - } - corr_rshifts[ k ] = rr_shifts; - silk_corrMatrix_FIX( lag_ptr, subfr_length, LTP_ORDER, LTP_CORRS_HEAD_ROOM, WLTP_ptr, &corr_rshifts[ k ], arch ); /* WLTP_fix_ptr in Q( -corr_rshifts[ k ] ) */ - - /* The correlation vector always has lower max abs value than rr and/or RR so head room is assured */ - silk_corrVector_FIX( lag_ptr, r_ptr, subfr_length, LTP_ORDER, Rr, corr_rshifts[ k ], arch ); /* Rr_fix_ptr in Q( -corr_rshifts[ k ] ) */ - if( corr_rshifts[ k ] > rr_shifts ) { - rr[ k ] = silk_RSHIFT( rr[ k ], corr_rshifts[ k ] - rr_shifts ); /* rr[ k ] in Q( -corr_rshifts[ k ] ) */ + silk_sum_sqr_shift( &xx, &xx_shifts, r_ptr, subfr_length + LTP_ORDER ); /* xx in Q( -xx_shifts ) */ + silk_corrMatrix_FIX( lag_ptr, subfr_length, LTP_ORDER, XXLTP_Q17_ptr, &nrg, &XX_shifts, arch ); /* XXLTP_Q17_ptr and nrg in Q( -XX_shifts ) */ + extra_shifts = xx_shifts - XX_shifts; + if( extra_shifts > 0 ) { + /* Shift XX */ + xX_shifts = xx_shifts; + for( i = 0; i < LTP_ORDER * LTP_ORDER; i++ ) { + XXLTP_Q17_ptr[ i ] = silk_RSHIFT32( XXLTP_Q17_ptr[ i ], extra_shifts ); /* Q( -xX_shifts ) */ + } + nrg = silk_RSHIFT32( nrg, extra_shifts ); /* Q( -xX_shifts ) */ + } else if( extra_shifts < 0 ) { + /* Shift xx */ + xX_shifts = XX_shifts; + xx = silk_RSHIFT32( xx, -extra_shifts ); /* Q( -xX_shifts ) */ + } else { + xX_shifts = xx_shifts; } - silk_assert( rr[ k ] >= 0 ); - - regu = 1; - regu = silk_SMLAWB( regu, rr[ k ], SILK_FIX_CONST( LTP_DAMPING/3, 16 ) ); - regu = silk_SMLAWB( regu, matrix_ptr( WLTP_ptr, 0, 0, LTP_ORDER ), SILK_FIX_CONST( LTP_DAMPING/3, 16 ) ); - regu = silk_SMLAWB( regu, matrix_ptr( WLTP_ptr, LTP_ORDER-1, LTP_ORDER-1, LTP_ORDER ), SILK_FIX_CONST( LTP_DAMPING/3, 16 ) ); - silk_regularize_correlations_FIX( WLTP_ptr, &rr[k], regu, LTP_ORDER ); - - silk_solve_LDL_FIX( WLTP_ptr, LTP_ORDER, Rr, b_Q16 ); /* WLTP_fix_ptr and Rr_fix_ptr both in Q(-corr_rshifts[k]) */ - - /* Limit and store in Q14 */ - silk_fit_LTP( b_Q16, b_Q14_ptr ); - - /* Calculate residual energy */ - nrg[ k ] = silk_residual_energy16_covar_FIX( b_Q14_ptr, WLTP_ptr, Rr, rr[ k ], LTP_ORDER, 14 ); /* nrg_fix in Q( -corr_rshifts[ k ] ) */ - - /* temp = Wght[ k ] / ( nrg[ k ] * Wght[ k ] + 0.01f * subfr_length ); */ - extra_shifts = silk_min_int( corr_rshifts[ k ], LTP_CORRS_HEAD_ROOM ); - denom32 = silk_LSHIFT_SAT32( silk_SMULWB( nrg[ k ], Wght_Q15[ k ] ), 1 + extra_shifts ) + /* Q( -corr_rshifts[ k ] + extra_shifts ) */ - silk_RSHIFT( silk_SMULWB( (opus_int32)subfr_length, 655 ), corr_rshifts[ k ] - extra_shifts ); /* Q( -corr_rshifts[ k ] + extra_shifts ) */ - denom32 = silk_max( denom32, 1 ); - silk_assert( ((opus_int64)Wght_Q15[ k ] << 16 ) < silk_int32_MAX ); /* Wght always < 0.5 in Q0 */ - temp32 = silk_DIV32( silk_LSHIFT( (opus_int32)Wght_Q15[ k ], 16 ), denom32 ); /* Q( 15 + 16 + corr_rshifts[k] - extra_shifts ) */ - temp32 = silk_RSHIFT( temp32, 31 + corr_rshifts[ k ] - extra_shifts - 26 ); /* Q26 */ + silk_corrVector_FIX( lag_ptr, r_ptr, subfr_length, LTP_ORDER, xXLTP_Q17_ptr, xX_shifts, arch ); /* xXLTP_Q17_ptr in Q( -xX_shifts ) */ - /* Limit temp such that the below scaling never wraps around */ - WLTP_max = 0; + /* At this point all correlations are in Q(-xX_shifts) */ + temp = silk_SMLAWB( 1, nrg, SILK_FIX_CONST( LTP_CORR_INV_MAX, 16 ) ); + temp = silk_max( temp, xx ); +TIC(div) +#if 0 for( i = 0; i < LTP_ORDER * LTP_ORDER; i++ ) { - WLTP_max = silk_max( WLTP_ptr[ i ], WLTP_max ); + XXLTP_Q17_ptr[ i ] = silk_DIV32_varQ( XXLTP_Q17_ptr[ i ], temp, 17 ); } - lshift = silk_CLZ32( WLTP_max ) - 1 - 3; /* keep 3 bits free for vq_nearest_neighbor_fix */ - silk_assert( 26 - 18 + lshift >= 0 ); - if( 26 - 18 + lshift < 31 ) { - temp32 = silk_min_32( temp32, silk_LSHIFT( (opus_int32)1, 26 - 18 + lshift ) ); - } - - silk_scale_vector32_Q26_lshift_18( WLTP_ptr, temp32, LTP_ORDER * LTP_ORDER ); /* WLTP_ptr in Q( 18 - corr_rshifts[ k ] ) */ - - w[ k ] = matrix_ptr( WLTP_ptr, LTP_ORDER/2, LTP_ORDER/2, LTP_ORDER ); /* w in Q( 18 - corr_rshifts[ k ] ) */ - silk_assert( w[k] >= 0 ); - - r_ptr += subfr_length; - b_Q14_ptr += LTP_ORDER; - WLTP_ptr += LTP_ORDER * LTP_ORDER; - } - - maxRshifts = 0; - for( k = 0; k < nb_subfr; k++ ) { - maxRshifts = silk_max_int( corr_rshifts[ k ], maxRshifts ); - } - - /* Compute LTP coding gain */ - if( LTPredCodGain_Q7 != NULL ) { - LPC_LTP_res_nrg = 0; - LPC_res_nrg = 0; - silk_assert( LTP_CORRS_HEAD_ROOM >= 2 ); /* Check that no overflow will happen when adding */ - for( k = 0; k < nb_subfr; k++ ) { - LPC_res_nrg = silk_ADD32( LPC_res_nrg, silk_RSHIFT( silk_ADD32( silk_SMULWB( rr[ k ], Wght_Q15[ k ] ), 1 ), 1 + ( maxRshifts - corr_rshifts[ k ] ) ) ); /* Q( -maxRshifts ) */ - LPC_LTP_res_nrg = silk_ADD32( LPC_LTP_res_nrg, silk_RSHIFT( silk_ADD32( silk_SMULWB( nrg[ k ], Wght_Q15[ k ] ), 1 ), 1 + ( maxRshifts - corr_rshifts[ k ] ) ) ); /* Q( -maxRshifts ) */ - } - LPC_LTP_res_nrg = silk_max( LPC_LTP_res_nrg, 1 ); /* avoid division by zero */ - - div_Q16 = silk_DIV32_varQ( LPC_res_nrg, LPC_LTP_res_nrg, 16 ); - *LTPredCodGain_Q7 = ( opus_int )silk_SMULBB( 3, silk_lin2log( div_Q16 ) - ( 16 << 7 ) ); - - silk_assert( *LTPredCodGain_Q7 == ( opus_int )silk_SAT16( silk_MUL( 3, silk_lin2log( div_Q16 ) - ( 16 << 7 ) ) ) ); - } - - /* smoothing */ - /* d = sum( B, 1 ); */ - b_Q14_ptr = b_Q14; - for( k = 0; k < nb_subfr; k++ ) { - d_Q14[ k ] = 0; for( i = 0; i < LTP_ORDER; i++ ) { - d_Q14[ k ] += b_Q14_ptr[ i ]; - } - b_Q14_ptr += LTP_ORDER; - } - - /* m = ( w * d' ) / ( sum( w ) + 1e-3 ); */ - - /* Find maximum absolute value of d_Q14 and the bits used by w in Q0 */ - max_abs_d_Q14 = 0; - max_w_bits = 0; - for( k = 0; k < nb_subfr; k++ ) { - max_abs_d_Q14 = silk_max_32( max_abs_d_Q14, silk_abs( d_Q14[ k ] ) ); - /* w[ k ] is in Q( 18 - corr_rshifts[ k ] ) */ - /* Find bits needed in Q( 18 - maxRshifts ) */ - max_w_bits = silk_max_32( max_w_bits, 32 - silk_CLZ32( w[ k ] ) + corr_rshifts[ k ] - maxRshifts ); - } - - /* max_abs_d_Q14 = (5 << 15); worst case, i.e. LTP_ORDER * -silk_int16_MIN */ - silk_assert( max_abs_d_Q14 <= ( 5 << 15 ) ); - - /* How many bits is needed for w*d' in Q( 18 - maxRshifts ) in the worst case, of all d_Q14's being equal to max_abs_d_Q14 */ - extra_shifts = max_w_bits + 32 - silk_CLZ32( max_abs_d_Q14 ) - 14; - - /* Subtract what we got available; bits in output var plus maxRshifts */ - extra_shifts -= ( 32 - 1 - 2 + maxRshifts ); /* Keep sign bit free as well as 2 bits for accumulation */ - extra_shifts = silk_max_int( extra_shifts, 0 ); - - maxRshifts_wxtra = maxRshifts + extra_shifts; - - temp32 = silk_RSHIFT( 262, maxRshifts + extra_shifts ) + 1; /* 1e-3f in Q( 18 - (maxRshifts + extra_shifts) ) */ - wd = 0; - for( k = 0; k < nb_subfr; k++ ) { - /* w has at least 2 bits of headroom so no overflow should happen */ - temp32 = silk_ADD32( temp32, silk_RSHIFT( w[ k ], maxRshifts_wxtra - corr_rshifts[ k ] ) ); /* Q( 18 - maxRshifts_wxtra ) */ - wd = silk_ADD32( wd, silk_LSHIFT( silk_SMULWW( silk_RSHIFT( w[ k ], maxRshifts_wxtra - corr_rshifts[ k ] ), d_Q14[ k ] ), 2 ) ); /* Q( 18 - maxRshifts_wxtra ) */ - } - m_Q12 = silk_DIV32_varQ( wd, temp32, 12 ); - - b_Q14_ptr = b_Q14; - for( k = 0; k < nb_subfr; k++ ) { - /* w_fix[ k ] from Q( 18 - corr_rshifts[ k ] ) to Q( 16 ) */ - if( 2 - corr_rshifts[k] > 0 ) { - temp32 = silk_RSHIFT( w[ k ], 2 - corr_rshifts[ k ] ); - } else { - temp32 = silk_LSHIFT_SAT32( w[ k ], corr_rshifts[ k ] - 2 ); + xXLTP_Q17_ptr[ i ] = silk_DIV32_varQ( xXLTP_Q17_ptr[ i ], temp, 17 ); } - - g_Q26 = silk_MUL( - silk_DIV32( - SILK_FIX_CONST( LTP_SMOOTHING, 26 ), - silk_RSHIFT( SILK_FIX_CONST( LTP_SMOOTHING, 26 ), 10 ) + temp32 ), /* Q10 */ - silk_LSHIFT_SAT32( silk_SUB_SAT32( (opus_int32)m_Q12, silk_RSHIFT( d_Q14[ k ], 2 ) ), 4 ) ); /* Q16 */ - - temp32 = 0; - for( i = 0; i < LTP_ORDER; i++ ) { - delta_b_Q14[ i ] = silk_max_16( b_Q14_ptr[ i ], 1638 ); /* 1638_Q14 = 0.1_Q0 */ - temp32 += delta_b_Q14[ i ]; /* Q14 */ +#else + for( i = 0; i < LTP_ORDER * LTP_ORDER; i++ ) { + XXLTP_Q17_ptr[ i ] = (opus_int32)( silk_LSHIFT64( (opus_int64)XXLTP_Q17_ptr[ i ], 17 ) / temp ); } - temp32 = silk_DIV32( g_Q26, temp32 ); /* Q14 -> Q12 */ for( i = 0; i < LTP_ORDER; i++ ) { - b_Q14_ptr[ i ] = silk_LIMIT_32( (opus_int32)b_Q14_ptr[ i ] + silk_SMULWB( silk_LSHIFT_SAT32( temp32, 4 ), delta_b_Q14[ i ] ), -16000, 28000 ); + xXLTP_Q17_ptr[ i ] = (opus_int32)( silk_LSHIFT64( (opus_int64)xXLTP_Q17_ptr[ i ], 17 ) / temp ); } - b_Q14_ptr += LTP_ORDER; - } -} - -void silk_fit_LTP( - opus_int32 LTP_coefs_Q16[ LTP_ORDER ], - opus_int16 LTP_coefs_Q14[ LTP_ORDER ] -) -{ - opus_int i; - - for( i = 0; i < LTP_ORDER; i++ ) { - LTP_coefs_Q14[ i ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( LTP_coefs_Q16[ i ], 2 ) ); +#endif +TOC(div) + r_ptr += subfr_length; + XXLTP_Q17_ptr += LTP_ORDER * LTP_ORDER; + xXLTP_Q17_ptr += LTP_ORDER; } } diff --git a/thirdparty/opus/silk/fixed/find_pitch_lags_FIX.c b/thirdparty/opus/silk/fixed/find_pitch_lags_FIX.c index b8440a8247..6c3379f2bb 100644 --- a/thirdparty/opus/silk/fixed/find_pitch_lags_FIX.c +++ b/thirdparty/opus/silk/fixed/find_pitch_lags_FIX.c @@ -44,7 +44,7 @@ void silk_find_pitch_lags_FIX( { opus_int buf_len, i, scale; opus_int32 thrhld_Q13, res_nrg; - const opus_int16 *x_buf, *x_buf_ptr; + const opus_int16 *x_ptr; VARDECL( opus_int16, Wsig ); opus_int16 *Wsig_ptr; opus_int32 auto_corr[ MAX_FIND_PITCH_LPC_ORDER + 1 ]; @@ -59,9 +59,7 @@ void silk_find_pitch_lags_FIX( buf_len = psEnc->sCmn.la_pitch + psEnc->sCmn.frame_length + psEnc->sCmn.ltp_mem_length; /* Safety check */ - silk_assert( buf_len >= psEnc->sCmn.pitch_LPC_win_length ); - - x_buf = x - psEnc->sCmn.ltp_mem_length; + celt_assert( buf_len >= psEnc->sCmn.pitch_LPC_win_length ); /*************************************/ /* Estimate LPC AR coefficients */ @@ -72,19 +70,19 @@ void silk_find_pitch_lags_FIX( ALLOC( Wsig, psEnc->sCmn.pitch_LPC_win_length, opus_int16 ); /* First LA_LTP samples */ - x_buf_ptr = x_buf + buf_len - psEnc->sCmn.pitch_LPC_win_length; + x_ptr = x + buf_len - psEnc->sCmn.pitch_LPC_win_length; Wsig_ptr = Wsig; - silk_apply_sine_window( Wsig_ptr, x_buf_ptr, 1, psEnc->sCmn.la_pitch ); + silk_apply_sine_window( Wsig_ptr, x_ptr, 1, psEnc->sCmn.la_pitch ); /* Middle un - windowed samples */ Wsig_ptr += psEnc->sCmn.la_pitch; - x_buf_ptr += psEnc->sCmn.la_pitch; - silk_memcpy( Wsig_ptr, x_buf_ptr, ( psEnc->sCmn.pitch_LPC_win_length - silk_LSHIFT( psEnc->sCmn.la_pitch, 1 ) ) * sizeof( opus_int16 ) ); + x_ptr += psEnc->sCmn.la_pitch; + silk_memcpy( Wsig_ptr, x_ptr, ( psEnc->sCmn.pitch_LPC_win_length - silk_LSHIFT( psEnc->sCmn.la_pitch, 1 ) ) * sizeof( opus_int16 ) ); /* Last LA_LTP samples */ Wsig_ptr += psEnc->sCmn.pitch_LPC_win_length - silk_LSHIFT( psEnc->sCmn.la_pitch, 1 ); - x_buf_ptr += psEnc->sCmn.pitch_LPC_win_length - silk_LSHIFT( psEnc->sCmn.la_pitch, 1 ); - silk_apply_sine_window( Wsig_ptr, x_buf_ptr, 2, psEnc->sCmn.la_pitch ); + x_ptr += psEnc->sCmn.pitch_LPC_win_length - silk_LSHIFT( psEnc->sCmn.la_pitch, 1 ); + silk_apply_sine_window( Wsig_ptr, x_ptr, 2, psEnc->sCmn.la_pitch ); /* Calculate autocorrelation sequence */ silk_autocorr( auto_corr, &scale, Wsig, psEnc->sCmn.pitch_LPC_win_length, psEnc->sCmn.pitchEstimationLPCOrder + 1, arch ); @@ -112,7 +110,7 @@ void silk_find_pitch_lags_FIX( /*****************************************/ /* LPC analysis filtering */ /*****************************************/ - silk_LPC_analysis_filter( res, x_buf, A_Q12, buf_len, psEnc->sCmn.pitchEstimationLPCOrder, psEnc->sCmn.arch ); + silk_LPC_analysis_filter( res, x, A_Q12, buf_len, psEnc->sCmn.pitchEstimationLPCOrder, psEnc->sCmn.arch ); if( psEnc->sCmn.indices.signalType != TYPE_NO_VOICE_ACTIVITY && psEnc->sCmn.first_frame_after_reset == 0 ) { /* Threshold for pitch estimator */ diff --git a/thirdparty/opus/silk/fixed/find_pred_coefs_FIX.c b/thirdparty/opus/silk/fixed/find_pred_coefs_FIX.c index d308e9cf5f..606d863347 100644 --- a/thirdparty/opus/silk/fixed/find_pred_coefs_FIX.c +++ b/thirdparty/opus/silk/fixed/find_pred_coefs_FIX.c @@ -41,13 +41,12 @@ void silk_find_pred_coefs_FIX( ) { opus_int i; - opus_int32 invGains_Q16[ MAX_NB_SUBFR ], local_gains[ MAX_NB_SUBFR ], Wght_Q15[ MAX_NB_SUBFR ]; + opus_int32 invGains_Q16[ MAX_NB_SUBFR ], local_gains[ MAX_NB_SUBFR ]; opus_int16 NLSF_Q15[ MAX_LPC_ORDER ]; const opus_int16 *x_ptr; opus_int16 *x_pre_ptr; VARDECL( opus_int16, LPC_in_pre ); - opus_int32 tmp, min_gain_Q16, minInvGain_Q30; - opus_int LTP_corrs_rshift[ MAX_NB_SUBFR ]; + opus_int32 min_gain_Q16, minInvGain_Q30; SAVE_STACK; /* weighting for weighted least squares */ @@ -61,13 +60,11 @@ void silk_find_pred_coefs_FIX( /* Invert and normalize gains, and ensure that maximum invGains_Q16 is within range of a 16 bit int */ invGains_Q16[ i ] = silk_DIV32_varQ( min_gain_Q16, psEncCtrl->Gains_Q16[ i ], 16 - 2 ); - /* Ensure Wght_Q15 a minimum value 1 */ - invGains_Q16[ i ] = silk_max( invGains_Q16[ i ], 363 ); + /* Limit inverse */ + invGains_Q16[ i ] = silk_max( invGains_Q16[ i ], 100 ); /* Square the inverted gains */ silk_assert( invGains_Q16[ i ] == silk_SAT16( invGains_Q16[ i ] ) ); - tmp = silk_SMULWB( invGains_Q16[ i ], invGains_Q16[ i ] ); - Wght_Q15[ i ] = silk_RSHIFT( tmp, 1 ); /* Invert the inverted and normalized gains */ local_gains[ i ] = silk_DIV32( ( (opus_int32)1 << 16 ), invGains_Q16[ i ] ); @@ -77,24 +74,24 @@ void silk_find_pred_coefs_FIX( psEnc->sCmn.nb_subfr * psEnc->sCmn.predictLPCOrder + psEnc->sCmn.frame_length, opus_int16 ); if( psEnc->sCmn.indices.signalType == TYPE_VOICED ) { - VARDECL( opus_int32, WLTP ); + VARDECL( opus_int32, xXLTP_Q17 ); + VARDECL( opus_int32, XXLTP_Q17 ); /**********/ /* VOICED */ /**********/ - silk_assert( psEnc->sCmn.ltp_mem_length - psEnc->sCmn.predictLPCOrder >= psEncCtrl->pitchL[ 0 ] + LTP_ORDER / 2 ); + celt_assert( psEnc->sCmn.ltp_mem_length - psEnc->sCmn.predictLPCOrder >= psEncCtrl->pitchL[ 0 ] + LTP_ORDER / 2 ); - ALLOC( WLTP, psEnc->sCmn.nb_subfr * LTP_ORDER * LTP_ORDER, opus_int32 ); + ALLOC( xXLTP_Q17, psEnc->sCmn.nb_subfr * LTP_ORDER, opus_int32 ); + ALLOC( XXLTP_Q17, psEnc->sCmn.nb_subfr * LTP_ORDER * LTP_ORDER, opus_int32 ); /* LTP analysis */ - silk_find_LTP_FIX( psEncCtrl->LTPCoef_Q14, WLTP, &psEncCtrl->LTPredCodGain_Q7, - res_pitch, psEncCtrl->pitchL, Wght_Q15, psEnc->sCmn.subfr_length, - psEnc->sCmn.nb_subfr, psEnc->sCmn.ltp_mem_length, LTP_corrs_rshift, psEnc->sCmn.arch ); + silk_find_LTP_FIX( XXLTP_Q17, xXLTP_Q17, res_pitch, + psEncCtrl->pitchL, psEnc->sCmn.subfr_length, psEnc->sCmn.nb_subfr, psEnc->sCmn.arch ); /* Quantize LTP gain parameters */ silk_quant_LTP_gains( psEncCtrl->LTPCoef_Q14, psEnc->sCmn.indices.LTPIndex, &psEnc->sCmn.indices.PERIndex, - &psEnc->sCmn.sum_log_gain_Q7, WLTP, psEnc->sCmn.mu_LTP_Q9, psEnc->sCmn.LTPQuantLowComplexity, psEnc->sCmn.nb_subfr, - psEnc->sCmn.arch); + &psEnc->sCmn.sum_log_gain_Q7, &psEncCtrl->LTPredCodGain_Q7, XXLTP_Q17, xXLTP_Q17, psEnc->sCmn.subfr_length, psEnc->sCmn.nb_subfr, psEnc->sCmn.arch ); /* Control LTP scaling */ silk_LTP_scale_ctrl_FIX( psEnc, psEncCtrl, condCoding ); diff --git a/thirdparty/opus/silk/fixed/k2a_FIX.c b/thirdparty/opus/silk/fixed/k2a_FIX.c index 5fee599bcb..549f6eadaa 100644 --- a/thirdparty/opus/silk/fixed/k2a_FIX.c +++ b/thirdparty/opus/silk/fixed/k2a_FIX.c @@ -39,14 +39,15 @@ void silk_k2a( ) { opus_int k, n; - opus_int32 Atmp[ SILK_MAX_ORDER_LPC ]; + opus_int32 rc, tmp1, tmp2; for( k = 0; k < order; k++ ) { - for( n = 0; n < k; n++ ) { - Atmp[ n ] = A_Q24[ n ]; - } - for( n = 0; n < k; n++ ) { - A_Q24[ n ] = silk_SMLAWB( A_Q24[ n ], silk_LSHIFT( Atmp[ k - n - 1 ], 1 ), rc_Q15[ k ] ); + rc = rc_Q15[ k ]; + for( n = 0; n < (k + 1) >> 1; n++ ) { + tmp1 = A_Q24[ n ]; + tmp2 = A_Q24[ k - n - 1 ]; + A_Q24[ n ] = silk_SMLAWB( tmp1, silk_LSHIFT( tmp2, 1 ), rc ); + A_Q24[ k - n - 1 ] = silk_SMLAWB( tmp2, silk_LSHIFT( tmp1, 1 ), rc ); } A_Q24[ k ] = -silk_LSHIFT( (opus_int32)rc_Q15[ k ], 9 ); } diff --git a/thirdparty/opus/silk/fixed/k2a_Q16_FIX.c b/thirdparty/opus/silk/fixed/k2a_Q16_FIX.c index 3b03987544..1595aa6212 100644 --- a/thirdparty/opus/silk/fixed/k2a_Q16_FIX.c +++ b/thirdparty/opus/silk/fixed/k2a_Q16_FIX.c @@ -39,15 +39,16 @@ void silk_k2a_Q16( ) { opus_int k, n; - opus_int32 Atmp[ SILK_MAX_ORDER_LPC ]; + opus_int32 rc, tmp1, tmp2; for( k = 0; k < order; k++ ) { - for( n = 0; n < k; n++ ) { - Atmp[ n ] = A_Q24[ n ]; + rc = rc_Q16[ k ]; + for( n = 0; n < (k + 1) >> 1; n++ ) { + tmp1 = A_Q24[ n ]; + tmp2 = A_Q24[ k - n - 1 ]; + A_Q24[ n ] = silk_SMLAWW( tmp1, tmp2, rc ); + A_Q24[ k - n - 1 ] = silk_SMLAWW( tmp2, tmp1, rc ); } - for( n = 0; n < k; n++ ) { - A_Q24[ n ] = silk_SMLAWW( A_Q24[ n ], Atmp[ k - n - 1 ], rc_Q16[ k ] ); - } - A_Q24[ k ] = -silk_LSHIFT( rc_Q16[ k ], 8 ); + A_Q24[ k ] = -silk_LSHIFT( rc, 8 ); } } diff --git a/thirdparty/opus/silk/fixed/main_FIX.h b/thirdparty/opus/silk/fixed/main_FIX.h index 375b5eb32e..6d2112e511 100644 --- a/thirdparty/opus/silk/fixed/main_FIX.h +++ b/thirdparty/opus/silk/fixed/main_FIX.h @@ -36,6 +36,11 @@ POSSIBILITY OF SUCH DAMAGE. #include "debug.h" #include "entenc.h" +#if ((defined(OPUS_ARM_ASM) && defined(FIXED_POINT)) \ + || defined(OPUS_ARM_MAY_HAVE_NEON_INTR)) +#include "fixed/arm/warped_autocorrelation_FIX_arm.h" +#endif + #ifndef FORCE_CPP_BUILD #ifdef __cplusplus extern "C" @@ -47,6 +52,9 @@ extern "C" #define silk_encode_do_VAD_Fxx silk_encode_do_VAD_FIX #define silk_encode_frame_Fxx silk_encode_frame_FIX +#define QC 10 +#define QS 13 + /*********************/ /* Encoder Functions */ /*********************/ @@ -58,7 +66,8 @@ void silk_HP_variable_cutoff( /* Encoder main function */ void silk_encode_do_VAD_FIX( - silk_encoder_state_FIX *psEnc /* I/O Pointer to Silk FIX encoder state */ + silk_encoder_state_FIX *psEnc, /* I/O Pointer to Silk FIX encoder state */ + opus_int activity /* I Decision of Opus voice activity detector */ ); /* Encoder main function */ @@ -81,33 +90,11 @@ opus_int silk_init_encoder( opus_int silk_control_encoder( silk_encoder_state_Fxx *psEnc, /* I/O Pointer to Silk encoder state */ silk_EncControlStruct *encControl, /* I Control structure */ - const opus_int32 TargetRate_bps, /* I Target max bitrate (bps) */ const opus_int allow_bw_switch, /* I Flag to allow switching audio bandwidth */ const opus_int channelNb, /* I Channel number */ const opus_int force_fs_kHz ); -/****************/ -/* Prefiltering */ -/****************/ -void silk_prefilter_FIX( - silk_encoder_state_FIX *psEnc, /* I/O Encoder state */ - const silk_encoder_control_FIX *psEncCtrl, /* I Encoder control */ - opus_int32 xw_Q10[], /* O Weighted signal */ - const opus_int16 x[] /* I Speech signal */ -); - -void silk_warped_LPC_analysis_filter_FIX_c( - opus_int32 state[], /* I/O State [order + 1] */ - opus_int32 res_Q2[], /* O Residual signal [length] */ - const opus_int16 coef_Q13[], /* I Coefficients [order] */ - const opus_int16 input[], /* I Input signal [length] */ - const opus_int16 lambda_Q16, /* I Warping factor */ - const opus_int length, /* I Length of input signal */ - const opus_int order /* I Filter order (even) */ -); - - /**************************/ /* Noise shaping analysis */ /**************************/ @@ -121,7 +108,7 @@ void silk_noise_shape_analysis_FIX( ); /* Autocorrelations for a warped frequency axis */ -void silk_warped_autocorrelation_FIX( +void silk_warped_autocorrelation_FIX_c( opus_int32 *corr, /* O Result [order + 1] */ opus_int *scale, /* O Scaling of the correlation vector */ const opus_int16 *input, /* I Input data to correlate */ @@ -130,6 +117,11 @@ void silk_warped_autocorrelation_FIX( const opus_int order /* I Correlation order (even) */ ); +#if !defined(OVERRIDE_silk_warped_autocorrelation_FIX) +#define silk_warped_autocorrelation_FIX(corr, scale, input, warping_Q16, length, order, arch) \ + ((void)(arch), silk_warped_autocorrelation_FIX_c(corr, scale, input, warping_Q16, length, order)) +#endif + /* Calculation of LTP state scaling */ void silk_LTP_scale_ctrl_FIX( silk_encoder_state_FIX *psEnc, /* I/O encoder state */ @@ -168,16 +160,12 @@ void silk_find_LPC_FIX( /* LTP analysis */ void silk_find_LTP_FIX( - opus_int16 b_Q14[ MAX_NB_SUBFR * LTP_ORDER ], /* O LTP coefs */ - opus_int32 WLTP[ MAX_NB_SUBFR * LTP_ORDER * LTP_ORDER ], /* O Weight for LTP quantization */ - opus_int *LTPredCodGain_Q7, /* O LTP coding gain */ - const opus_int16 r_lpc[], /* I residual signal after LPC signal + state for first 10 ms */ + opus_int32 XXLTP_Q17[ MAX_NB_SUBFR * LTP_ORDER * LTP_ORDER ], /* O Correlation matrix */ + opus_int32 xXLTP_Q17[ MAX_NB_SUBFR * LTP_ORDER ], /* O Correlation vector */ + const opus_int16 r_lpc[], /* I Residual signal after LPC */ const opus_int lag[ MAX_NB_SUBFR ], /* I LTP lags */ - const opus_int32 Wght_Q15[ MAX_NB_SUBFR ], /* I weights */ - const opus_int subfr_length, /* I subframe length */ - const opus_int nb_subfr, /* I number of subframes */ - const opus_int mem_offset, /* I number of samples in LTP memory */ - opus_int corr_rshifts[ MAX_NB_SUBFR ], /* O right shifts applied to correlations */ + const opus_int subfr_length, /* I Subframe length */ + const opus_int nb_subfr, /* I Number of subframes */ int arch /* I Run-time architecture */ ); @@ -231,9 +219,9 @@ void silk_corrMatrix_FIX( const opus_int16 *x, /* I x vector [L + order - 1] used to form data matrix X */ const opus_int L, /* I Length of vectors */ const opus_int order, /* I Max lag for correlation */ - const opus_int head_room, /* I Desired headroom */ opus_int32 *XX, /* O Pointer to X'*X correlation matrix [ order x order ] */ - opus_int *rshifts, /* I/O Right shifts of correlations */ + opus_int32 *nrg, /* O Energy of x vector */ + opus_int *rshifts, /* O Right shifts of correlations */ int arch /* I Run-time architecture */ ); @@ -248,22 +236,6 @@ void silk_corrVector_FIX( int arch /* I Run-time architecture */ ); -/* Add noise to matrix diagonal */ -void silk_regularize_correlations_FIX( - opus_int32 *XX, /* I/O Correlation matrices */ - opus_int32 *xx, /* I/O Correlation values */ - opus_int32 noise, /* I Noise to add */ - opus_int D /* I Dimension of XX */ -); - -/* Solves Ax = b, assuming A is symmetric */ -void silk_solve_LDL_FIX( - opus_int32 *A, /* I Pointer to symetric square matrix A */ - opus_int M, /* I Size of matrix */ - const opus_int32 *b, /* I Pointer to b vector */ - opus_int32 *x_Q16 /* O Pointer to x solution vector */ -); - #ifndef FORCE_CPP_BUILD #ifdef __cplusplus } diff --git a/thirdparty/opus/silk/fixed/mips/noise_shape_analysis_FIX_mipsr1.h b/thirdparty/opus/silk/fixed/mips/noise_shape_analysis_FIX_mipsr1.h index c30481e437..3999b5bd09 100644 --- a/thirdparty/opus/silk/fixed/mips/noise_shape_analysis_FIX_mipsr1.h +++ b/thirdparty/opus/silk/fixed/mips/noise_shape_analysis_FIX_mipsr1.h @@ -169,7 +169,7 @@ void silk_noise_shape_analysis_FIX( if( psEnc->sCmn.warping_Q16 > 0 ) { /* Calculate warped auto correlation */ - silk_warped_autocorrelation_FIX( auto_corr, &scale, x_windowed, warping_Q16, psEnc->sCmn.shapeWinLength, psEnc->sCmn.shapingLPCOrder ); + silk_warped_autocorrelation_FIX( auto_corr, &scale, x_windowed, warping_Q16, psEnc->sCmn.shapeWinLength, psEnc->sCmn.shapingLPCOrder, arch ); } else { /* Calculate regular auto correlation */ silk_autocorr( auto_corr, &scale, x_windowed, psEnc->sCmn.shapeWinLength, psEnc->sCmn.shapingLPCOrder + 1, arch ); @@ -224,8 +224,8 @@ void silk_noise_shape_analysis_FIX( silk_bwexpander_32( AR1_Q24, psEnc->sCmn.shapingLPCOrder, BWExp1_Q16 ); /* Ratio of prediction gains, in energy domain */ - pre_nrg_Q30 = silk_LPC_inverse_pred_gain_Q24( AR2_Q24, psEnc->sCmn.shapingLPCOrder ); - nrg = silk_LPC_inverse_pred_gain_Q24( AR1_Q24, psEnc->sCmn.shapingLPCOrder ); + pre_nrg_Q30 = silk_LPC_inverse_pred_gain_Q24( AR2_Q24, psEnc->sCmn.shapingLPCOrder, arch ); + nrg = silk_LPC_inverse_pred_gain_Q24( AR1_Q24, psEnc->sCmn.shapingLPCOrder, arch ); /*psEncCtrl->GainsPre[ k ] = 1.0f - 0.7f * ( 1.0f - pre_nrg / nrg ) = 0.3f + 0.7f * pre_nrg / nrg;*/ pre_nrg_Q30 = silk_LSHIFT32( silk_SMULWB( pre_nrg_Q30, SILK_FIX_CONST( 0.7, 15 ) ), 1 ); diff --git a/thirdparty/opus/silk/fixed/mips/prefilter_FIX_mipsr1.h b/thirdparty/opus/silk/fixed/mips/prefilter_FIX_mipsr1.h deleted file mode 100644 index 21b256885f..0000000000 --- a/thirdparty/opus/silk/fixed/mips/prefilter_FIX_mipsr1.h +++ /dev/null @@ -1,184 +0,0 @@ -/*********************************************************************** -Copyright (c) 2006-2011, Skype Limited. All rights reserved. -Redistribution and use in source and binary forms, with or without -modification, are permitted provided that the following conditions -are met: -- Redistributions of source code must retain the above copyright notice, -this list of conditions and the following disclaimer. -- Redistributions in binary form must reproduce the above copyright -notice, this list of conditions and the following disclaimer in the -documentation and/or other materials provided with the distribution. -- Neither the name of Internet Society, IETF or IETF Trust, nor the -names of specific contributors, may be used to endorse or promote -products derived from this software without specific prior written -permission. -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE -LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE -POSSIBILITY OF SUCH DAMAGE. -***********************************************************************/ -#ifndef __PREFILTER_FIX_MIPSR1_H__ -#define __PREFILTER_FIX_MIPSR1_H__ - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#include "main_FIX.h" -#include "stack_alloc.h" -#include "tuning_parameters.h" - -#define OVERRIDE_silk_warped_LPC_analysis_filter_FIX -void silk_warped_LPC_analysis_filter_FIX( - opus_int32 state[], /* I/O State [order + 1] */ - opus_int32 res_Q2[], /* O Residual signal [length] */ - const opus_int16 coef_Q13[], /* I Coefficients [order] */ - const opus_int16 input[], /* I Input signal [length] */ - const opus_int16 lambda_Q16, /* I Warping factor */ - const opus_int length, /* I Length of input signal */ - const opus_int order, /* I Filter order (even) */ - int arch -) -{ - opus_int n, i; - opus_int32 acc_Q11, acc_Q22, tmp1, tmp2, tmp3, tmp4; - opus_int32 state_cur, state_next; - - (void)arch; - - /* Order must be even */ - /* Length must be even */ - - silk_assert( ( order & 1 ) == 0 ); - silk_assert( ( length & 1 ) == 0 ); - - for( n = 0; n < length; n+=2 ) { - /* Output of lowpass section */ - tmp2 = silk_SMLAWB( state[ 0 ], state[ 1 ], lambda_Q16 ); - state_cur = silk_LSHIFT( input[ n ], 14 ); - /* Output of allpass section */ - tmp1 = silk_SMLAWB( state[ 1 ], state[ 2 ] - tmp2, lambda_Q16 ); - state_next = tmp2; - acc_Q11 = silk_RSHIFT( order, 1 ); - acc_Q11 = silk_SMLAWB( acc_Q11, tmp2, coef_Q13[ 0 ] ); - - - /* Output of lowpass section */ - tmp4 = silk_SMLAWB( state_cur, state_next, lambda_Q16 ); - state[ 0 ] = silk_LSHIFT( input[ n+1 ], 14 ); - /* Output of allpass section */ - tmp3 = silk_SMLAWB( state_next, tmp1 - tmp4, lambda_Q16 ); - state[ 1 ] = tmp4; - acc_Q22 = silk_RSHIFT( order, 1 ); - acc_Q22 = silk_SMLAWB( acc_Q22, tmp4, coef_Q13[ 0 ] ); - - /* Loop over allpass sections */ - for( i = 2; i < order; i += 2 ) { - /* Output of allpass section */ - tmp2 = silk_SMLAWB( state[ i ], state[ i + 1 ] - tmp1, lambda_Q16 ); - state_cur = tmp1; - acc_Q11 = silk_SMLAWB( acc_Q11, tmp1, coef_Q13[ i - 1 ] ); - /* Output of allpass section */ - tmp1 = silk_SMLAWB( state[ i + 1 ], state[ i + 2 ] - tmp2, lambda_Q16 ); - state_next = tmp2; - acc_Q11 = silk_SMLAWB( acc_Q11, tmp2, coef_Q13[ i ] ); - - - /* Output of allpass section */ - tmp4 = silk_SMLAWB( state_cur, state_next - tmp3, lambda_Q16 ); - state[ i ] = tmp3; - acc_Q22 = silk_SMLAWB( acc_Q22, tmp3, coef_Q13[ i - 1 ] ); - /* Output of allpass section */ - tmp3 = silk_SMLAWB( state_next, tmp1 - tmp4, lambda_Q16 ); - state[ i + 1 ] = tmp4; - acc_Q22 = silk_SMLAWB( acc_Q22, tmp4, coef_Q13[ i ] ); - } - acc_Q11 = silk_SMLAWB( acc_Q11, tmp1, coef_Q13[ order - 1 ] ); - res_Q2[ n ] = silk_LSHIFT( (opus_int32)input[ n ], 2 ) - silk_RSHIFT_ROUND( acc_Q11, 9 ); - - state[ order ] = tmp3; - acc_Q22 = silk_SMLAWB( acc_Q22, tmp3, coef_Q13[ order - 1 ] ); - res_Q2[ n+1 ] = silk_LSHIFT( (opus_int32)input[ n+1 ], 2 ) - silk_RSHIFT_ROUND( acc_Q22, 9 ); - } -} - - - -/* Prefilter for finding Quantizer input signal */ -#define OVERRIDE_silk_prefilt_FIX -static inline void silk_prefilt_FIX( - silk_prefilter_state_FIX *P, /* I/O state */ - opus_int32 st_res_Q12[], /* I short term residual signal */ - opus_int32 xw_Q3[], /* O prefiltered signal */ - opus_int32 HarmShapeFIRPacked_Q12, /* I Harmonic shaping coeficients */ - opus_int Tilt_Q14, /* I Tilt shaping coeficient */ - opus_int32 LF_shp_Q14, /* I Low-frequancy shaping coeficients */ - opus_int lag, /* I Lag for harmonic shaping */ - opus_int length /* I Length of signals */ -) -{ - opus_int i, idx, LTP_shp_buf_idx; - opus_int32 n_LTP_Q12, n_Tilt_Q10, n_LF_Q10; - opus_int32 sLF_MA_shp_Q12, sLF_AR_shp_Q12; - opus_int16 *LTP_shp_buf; - - /* To speed up use temp variables instead of using the struct */ - LTP_shp_buf = P->sLTP_shp; - LTP_shp_buf_idx = P->sLTP_shp_buf_idx; - sLF_AR_shp_Q12 = P->sLF_AR_shp_Q12; - sLF_MA_shp_Q12 = P->sLF_MA_shp_Q12; - - if( lag > 0 ) { - for( i = 0; i < length; i++ ) { - /* unrolled loop */ - silk_assert( HARM_SHAPE_FIR_TAPS == 3 ); - idx = lag + LTP_shp_buf_idx; - n_LTP_Q12 = silk_SMULBB( LTP_shp_buf[ ( idx - HARM_SHAPE_FIR_TAPS / 2 - 1) & LTP_MASK ], HarmShapeFIRPacked_Q12 ); - n_LTP_Q12 = silk_SMLABT( n_LTP_Q12, LTP_shp_buf[ ( idx - HARM_SHAPE_FIR_TAPS / 2 ) & LTP_MASK ], HarmShapeFIRPacked_Q12 ); - n_LTP_Q12 = silk_SMLABB( n_LTP_Q12, LTP_shp_buf[ ( idx - HARM_SHAPE_FIR_TAPS / 2 + 1) & LTP_MASK ], HarmShapeFIRPacked_Q12 ); - - n_Tilt_Q10 = silk_SMULWB( sLF_AR_shp_Q12, Tilt_Q14 ); - n_LF_Q10 = silk_SMLAWB( silk_SMULWT( sLF_AR_shp_Q12, LF_shp_Q14 ), sLF_MA_shp_Q12, LF_shp_Q14 ); - - sLF_AR_shp_Q12 = silk_SUB32( st_res_Q12[ i ], silk_LSHIFT( n_Tilt_Q10, 2 ) ); - sLF_MA_shp_Q12 = silk_SUB32( sLF_AR_shp_Q12, silk_LSHIFT( n_LF_Q10, 2 ) ); - - LTP_shp_buf_idx = ( LTP_shp_buf_idx - 1 ) & LTP_MASK; - LTP_shp_buf[ LTP_shp_buf_idx ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( sLF_MA_shp_Q12, 12 ) ); - - xw_Q3[i] = silk_RSHIFT_ROUND( silk_SUB32( sLF_MA_shp_Q12, n_LTP_Q12 ), 9 ); - } - } - else - { - for( i = 0; i < length; i++ ) { - - n_LTP_Q12 = 0; - - n_Tilt_Q10 = silk_SMULWB( sLF_AR_shp_Q12, Tilt_Q14 ); - n_LF_Q10 = silk_SMLAWB( silk_SMULWT( sLF_AR_shp_Q12, LF_shp_Q14 ), sLF_MA_shp_Q12, LF_shp_Q14 ); - - sLF_AR_shp_Q12 = silk_SUB32( st_res_Q12[ i ], silk_LSHIFT( n_Tilt_Q10, 2 ) ); - sLF_MA_shp_Q12 = silk_SUB32( sLF_AR_shp_Q12, silk_LSHIFT( n_LF_Q10, 2 ) ); - - LTP_shp_buf_idx = ( LTP_shp_buf_idx - 1 ) & LTP_MASK; - LTP_shp_buf[ LTP_shp_buf_idx ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( sLF_MA_shp_Q12, 12 ) ); - - xw_Q3[i] = silk_RSHIFT_ROUND( sLF_MA_shp_Q12, 9 ); - } - } - - /* Copy temp variable back to state */ - P->sLF_AR_shp_Q12 = sLF_AR_shp_Q12; - P->sLF_MA_shp_Q12 = sLF_MA_shp_Q12; - P->sLTP_shp_buf_idx = LTP_shp_buf_idx; -} - -#endif /* __PREFILTER_FIX_MIPSR1_H__ */ diff --git a/thirdparty/opus/silk/fixed/mips/warped_autocorrelation_FIX_mipsr1.h b/thirdparty/opus/silk/fixed/mips/warped_autocorrelation_FIX_mipsr1.h index e803ef0fce..66eb2ed26d 100644 --- a/thirdparty/opus/silk/fixed/mips/warped_autocorrelation_FIX_mipsr1.h +++ b/thirdparty/opus/silk/fixed/mips/warped_autocorrelation_FIX_mipsr1.h @@ -41,8 +41,8 @@ POSSIBILITY OF SUCH DAMAGE. #define QS 14 /* Autocorrelations for a warped frequency axis */ -#define OVERRIDE_silk_warped_autocorrelation_FIX -void silk_warped_autocorrelation_FIX( +#define OVERRIDE_silk_warped_autocorrelation_FIX_c +void silk_warped_autocorrelation_FIX_c( opus_int32 *corr, /* O Result [order + 1] */ opus_int *scale, /* O Scaling of the correlation vector */ const opus_int16 *input, /* I Input data to correlate */ diff --git a/thirdparty/opus/silk/fixed/noise_shape_analysis_FIX.c b/thirdparty/opus/silk/fixed/noise_shape_analysis_FIX.c index 22a89f75ae..85fea0bf09 100644 --- a/thirdparty/opus/silk/fixed/noise_shape_analysis_FIX.c +++ b/thirdparty/opus/silk/fixed/noise_shape_analysis_FIX.c @@ -57,88 +57,79 @@ static OPUS_INLINE opus_int32 warped_gain( /* gain in Q16*/ /* Convert warped filter coefficients to monic pseudo-warped coefficients and limit maximum */ /* amplitude of monic warped coefficients by using bandwidth expansion on the true coefficients */ static OPUS_INLINE void limit_warped_coefs( - opus_int32 *coefs_syn_Q24, - opus_int32 *coefs_ana_Q24, + opus_int32 *coefs_Q24, opus_int lambda_Q16, opus_int32 limit_Q24, opus_int order ) { opus_int i, iter, ind = 0; - opus_int32 tmp, maxabs_Q24, chirp_Q16, gain_syn_Q16, gain_ana_Q16; + opus_int32 tmp, maxabs_Q24, chirp_Q16, gain_Q16; opus_int32 nom_Q16, den_Q24; + opus_int32 limit_Q20, maxabs_Q20; /* Convert to monic coefficients */ lambda_Q16 = -lambda_Q16; for( i = order - 1; i > 0; i-- ) { - coefs_syn_Q24[ i - 1 ] = silk_SMLAWB( coefs_syn_Q24[ i - 1 ], coefs_syn_Q24[ i ], lambda_Q16 ); - coefs_ana_Q24[ i - 1 ] = silk_SMLAWB( coefs_ana_Q24[ i - 1 ], coefs_ana_Q24[ i ], lambda_Q16 ); + coefs_Q24[ i - 1 ] = silk_SMLAWB( coefs_Q24[ i - 1 ], coefs_Q24[ i ], lambda_Q16 ); } lambda_Q16 = -lambda_Q16; - nom_Q16 = silk_SMLAWB( SILK_FIX_CONST( 1.0, 16 ), -(opus_int32)lambda_Q16, lambda_Q16 ); - den_Q24 = silk_SMLAWB( SILK_FIX_CONST( 1.0, 24 ), coefs_syn_Q24[ 0 ], lambda_Q16 ); - gain_syn_Q16 = silk_DIV32_varQ( nom_Q16, den_Q24, 24 ); - den_Q24 = silk_SMLAWB( SILK_FIX_CONST( 1.0, 24 ), coefs_ana_Q24[ 0 ], lambda_Q16 ); - gain_ana_Q16 = silk_DIV32_varQ( nom_Q16, den_Q24, 24 ); + nom_Q16 = silk_SMLAWB( SILK_FIX_CONST( 1.0, 16 ), -(opus_int32)lambda_Q16, lambda_Q16 ); + den_Q24 = silk_SMLAWB( SILK_FIX_CONST( 1.0, 24 ), coefs_Q24[ 0 ], lambda_Q16 ); + gain_Q16 = silk_DIV32_varQ( nom_Q16, den_Q24, 24 ); for( i = 0; i < order; i++ ) { - coefs_syn_Q24[ i ] = silk_SMULWW( gain_syn_Q16, coefs_syn_Q24[ i ] ); - coefs_ana_Q24[ i ] = silk_SMULWW( gain_ana_Q16, coefs_ana_Q24[ i ] ); + coefs_Q24[ i ] = silk_SMULWW( gain_Q16, coefs_Q24[ i ] ); } - + limit_Q20 = silk_RSHIFT(limit_Q24, 4); for( iter = 0; iter < 10; iter++ ) { /* Find maximum absolute value */ maxabs_Q24 = -1; for( i = 0; i < order; i++ ) { - tmp = silk_max( silk_abs_int32( coefs_syn_Q24[ i ] ), silk_abs_int32( coefs_ana_Q24[ i ] ) ); + tmp = silk_abs_int32( coefs_Q24[ i ] ); if( tmp > maxabs_Q24 ) { maxabs_Q24 = tmp; ind = i; } } - if( maxabs_Q24 <= limit_Q24 ) { + /* Use Q20 to avoid any overflow when multiplying by (ind + 1) later. */ + maxabs_Q20 = silk_RSHIFT(maxabs_Q24, 4); + if( maxabs_Q20 <= limit_Q20 ) { /* Coefficients are within range - done */ return; } /* Convert back to true warped coefficients */ for( i = 1; i < order; i++ ) { - coefs_syn_Q24[ i - 1 ] = silk_SMLAWB( coefs_syn_Q24[ i - 1 ], coefs_syn_Q24[ i ], lambda_Q16 ); - coefs_ana_Q24[ i - 1 ] = silk_SMLAWB( coefs_ana_Q24[ i - 1 ], coefs_ana_Q24[ i ], lambda_Q16 ); + coefs_Q24[ i - 1 ] = silk_SMLAWB( coefs_Q24[ i - 1 ], coefs_Q24[ i ], lambda_Q16 ); } - gain_syn_Q16 = silk_INVERSE32_varQ( gain_syn_Q16, 32 ); - gain_ana_Q16 = silk_INVERSE32_varQ( gain_ana_Q16, 32 ); + gain_Q16 = silk_INVERSE32_varQ( gain_Q16, 32 ); for( i = 0; i < order; i++ ) { - coefs_syn_Q24[ i ] = silk_SMULWW( gain_syn_Q16, coefs_syn_Q24[ i ] ); - coefs_ana_Q24[ i ] = silk_SMULWW( gain_ana_Q16, coefs_ana_Q24[ i ] ); + coefs_Q24[ i ] = silk_SMULWW( gain_Q16, coefs_Q24[ i ] ); } /* Apply bandwidth expansion */ chirp_Q16 = SILK_FIX_CONST( 0.99, 16 ) - silk_DIV32_varQ( - silk_SMULWB( maxabs_Q24 - limit_Q24, silk_SMLABB( SILK_FIX_CONST( 0.8, 10 ), SILK_FIX_CONST( 0.1, 10 ), iter ) ), - silk_MUL( maxabs_Q24, ind + 1 ), 22 ); - silk_bwexpander_32( coefs_syn_Q24, order, chirp_Q16 ); - silk_bwexpander_32( coefs_ana_Q24, order, chirp_Q16 ); + silk_SMULWB( maxabs_Q20 - limit_Q20, silk_SMLABB( SILK_FIX_CONST( 0.8, 10 ), SILK_FIX_CONST( 0.1, 10 ), iter ) ), + silk_MUL( maxabs_Q20, ind + 1 ), 22 ); + silk_bwexpander_32( coefs_Q24, order, chirp_Q16 ); /* Convert to monic warped coefficients */ lambda_Q16 = -lambda_Q16; for( i = order - 1; i > 0; i-- ) { - coefs_syn_Q24[ i - 1 ] = silk_SMLAWB( coefs_syn_Q24[ i - 1 ], coefs_syn_Q24[ i ], lambda_Q16 ); - coefs_ana_Q24[ i - 1 ] = silk_SMLAWB( coefs_ana_Q24[ i - 1 ], coefs_ana_Q24[ i ], lambda_Q16 ); + coefs_Q24[ i - 1 ] = silk_SMLAWB( coefs_Q24[ i - 1 ], coefs_Q24[ i ], lambda_Q16 ); } lambda_Q16 = -lambda_Q16; nom_Q16 = silk_SMLAWB( SILK_FIX_CONST( 1.0, 16 ), -(opus_int32)lambda_Q16, lambda_Q16 ); - den_Q24 = silk_SMLAWB( SILK_FIX_CONST( 1.0, 24 ), coefs_syn_Q24[ 0 ], lambda_Q16 ); - gain_syn_Q16 = silk_DIV32_varQ( nom_Q16, den_Q24, 24 ); - den_Q24 = silk_SMLAWB( SILK_FIX_CONST( 1.0, 24 ), coefs_ana_Q24[ 0 ], lambda_Q16 ); - gain_ana_Q16 = silk_DIV32_varQ( nom_Q16, den_Q24, 24 ); + den_Q24 = silk_SMLAWB( SILK_FIX_CONST( 1.0, 24 ), coefs_Q24[ 0 ], lambda_Q16 ); + gain_Q16 = silk_DIV32_varQ( nom_Q16, den_Q24, 24 ); for( i = 0; i < order; i++ ) { - coefs_syn_Q24[ i ] = silk_SMULWW( gain_syn_Q16, coefs_syn_Q24[ i ] ); - coefs_ana_Q24[ i ] = silk_SMULWW( gain_ana_Q16, coefs_ana_Q24[ i ] ); + coefs_Q24[ i ] = silk_SMULWW( gain_Q16, coefs_Q24[ i ] ); } } silk_assert( 0 ); } -#if defined(MIPSr1_ASM) +/* Disable MIPS version until it's updated. */ +#if 0 && defined(MIPSr1_ASM) #include "mips/noise_shape_analysis_FIX_mipsr1.h" #endif @@ -155,14 +146,13 @@ void silk_noise_shape_analysis_FIX( ) { silk_shape_state_FIX *psShapeSt = &psEnc->sShape; - opus_int k, i, nSamples, Qnrg, b_Q14, warping_Q16, scale = 0; - opus_int32 SNR_adj_dB_Q7, HarmBoost_Q16, HarmShapeGain_Q16, Tilt_Q16, tmp32; - opus_int32 nrg, pre_nrg_Q30, log_energy_Q7, log_energy_prev_Q7, energy_variation_Q7; - opus_int32 delta_Q16, BWExp1_Q16, BWExp2_Q16, gain_mult_Q16, gain_add_Q16, strength_Q16, b_Q8; + opus_int k, i, nSamples, nSegs, Qnrg, b_Q14, warping_Q16, scale = 0; + opus_int32 SNR_adj_dB_Q7, HarmShapeGain_Q16, Tilt_Q16, tmp32; + opus_int32 nrg, log_energy_Q7, log_energy_prev_Q7, energy_variation_Q7; + opus_int32 BWExp_Q16, gain_mult_Q16, gain_add_Q16, strength_Q16, b_Q8; opus_int32 auto_corr[ MAX_SHAPE_LPC_ORDER + 1 ]; opus_int32 refl_coef_Q16[ MAX_SHAPE_LPC_ORDER ]; - opus_int32 AR1_Q24[ MAX_SHAPE_LPC_ORDER ]; - opus_int32 AR2_Q24[ MAX_SHAPE_LPC_ORDER ]; + opus_int32 AR_Q24[ MAX_SHAPE_LPC_ORDER ]; VARDECL( opus_int16, x_windowed ); const opus_int16 *x_ptr, *pitch_res_ptr; SAVE_STACK; @@ -209,14 +199,14 @@ void silk_noise_shape_analysis_FIX( if( psEnc->sCmn.indices.signalType == TYPE_VOICED ) { /* Initially set to 0; may be overruled in process_gains(..) */ psEnc->sCmn.indices.quantOffsetType = 0; - psEncCtrl->sparseness_Q8 = 0; } else { /* Sparseness measure, based on relative fluctuations of energy per 2 milliseconds */ nSamples = silk_LSHIFT( psEnc->sCmn.fs_kHz, 1 ); energy_variation_Q7 = 0; log_energy_prev_Q7 = 0; pitch_res_ptr = pitch_res; - for( k = 0; k < silk_SMULBB( SUB_FRAME_LENGTH_MS, psEnc->sCmn.nb_subfr ) / 2; k++ ) { + nSegs = silk_SMULBB( SUB_FRAME_LENGTH_MS, psEnc->sCmn.nb_subfr ) / 2; + for( k = 0; k < nSegs; k++ ) { silk_sum_sqr_shift( &nrg, &scale, pitch_res_ptr, nSamples ); nrg += silk_RSHIFT( nSamples, scale ); /* Q(-scale)*/ @@ -228,18 +218,12 @@ void silk_noise_shape_analysis_FIX( pitch_res_ptr += nSamples; } - psEncCtrl->sparseness_Q8 = silk_RSHIFT( silk_sigm_Q15( silk_SMULWB( energy_variation_Q7 - - SILK_FIX_CONST( 5.0, 7 ), SILK_FIX_CONST( 0.1, 16 ) ) ), 7 ); - /* Set quantization offset depending on sparseness measure */ - if( psEncCtrl->sparseness_Q8 > SILK_FIX_CONST( SPARSENESS_THRESHOLD_QNT_OFFSET, 8 ) ) { + if( energy_variation_Q7 > SILK_FIX_CONST( ENERGY_VARIATION_THRESHOLD_QNT_OFFSET, 7 ) * (nSegs-1) ) { psEnc->sCmn.indices.quantOffsetType = 0; } else { psEnc->sCmn.indices.quantOffsetType = 1; } - - /* Increase coding SNR for sparse signals */ - SNR_adj_dB_Q7 = silk_SMLAWB( SNR_adj_dB_Q7, SILK_FIX_CONST( SPARSE_SNR_INCR_dB, 15 ), psEncCtrl->sparseness_Q8 - SILK_FIX_CONST( 0.5, 8 ) ); } /*******************************/ @@ -247,14 +231,8 @@ void silk_noise_shape_analysis_FIX( /*******************************/ /* More BWE for signals with high prediction gain */ strength_Q16 = silk_SMULWB( psEncCtrl->predGain_Q16, SILK_FIX_CONST( FIND_PITCH_WHITE_NOISE_FRACTION, 16 ) ); - BWExp1_Q16 = BWExp2_Q16 = silk_DIV32_varQ( SILK_FIX_CONST( BANDWIDTH_EXPANSION, 16 ), + BWExp_Q16 = silk_DIV32_varQ( SILK_FIX_CONST( BANDWIDTH_EXPANSION, 16 ), silk_SMLAWW( SILK_FIX_CONST( 1.0, 16 ), strength_Q16, strength_Q16 ), 16 ); - delta_Q16 = silk_SMULWB( SILK_FIX_CONST( 1.0, 16 ) - silk_SMULBB( 3, psEncCtrl->coding_quality_Q14 ), - SILK_FIX_CONST( LOW_RATE_BANDWIDTH_EXPANSION_DELTA, 16 ) ); - BWExp1_Q16 = silk_SUB32( BWExp1_Q16, delta_Q16 ); - BWExp2_Q16 = silk_ADD32( BWExp2_Q16, delta_Q16 ); - /* BWExp1 will be applied after BWExp2, so make it relative */ - BWExp1_Q16 = silk_DIV32_16( silk_LSHIFT( BWExp1_Q16, 14 ), silk_RSHIFT( BWExp2_Q16, 2 ) ); if( psEnc->sCmn.warping_Q16 > 0 ) { /* Slightly more warping in analysis will move quantization noise up in frequency, where it's better masked */ @@ -284,7 +262,7 @@ void silk_noise_shape_analysis_FIX( if( psEnc->sCmn.warping_Q16 > 0 ) { /* Calculate warped auto correlation */ - silk_warped_autocorrelation_FIX( auto_corr, &scale, x_windowed, warping_Q16, psEnc->sCmn.shapeWinLength, psEnc->sCmn.shapingLPCOrder ); + silk_warped_autocorrelation_FIX( auto_corr, &scale, x_windowed, warping_Q16, psEnc->sCmn.shapeWinLength, psEnc->sCmn.shapingLPCOrder, arch ); } else { /* Calculate regular auto correlation */ silk_autocorr( auto_corr, &scale, x_windowed, psEnc->sCmn.shapeWinLength, psEnc->sCmn.shapingLPCOrder + 1, arch ); @@ -299,7 +277,7 @@ void silk_noise_shape_analysis_FIX( silk_assert( nrg >= 0 ); /* Convert reflection coefficients to prediction coefficients */ - silk_k2a_Q16( AR2_Q24, refl_coef_Q16, psEnc->sCmn.shapingLPCOrder ); + silk_k2a_Q16( AR_Q24, refl_coef_Q16, psEnc->sCmn.shapingLPCOrder ); Qnrg = -scale; /* range: -12...30*/ silk_assert( Qnrg >= -12 ); @@ -318,40 +296,34 @@ void silk_noise_shape_analysis_FIX( if( psEnc->sCmn.warping_Q16 > 0 ) { /* Adjust gain for warping */ - gain_mult_Q16 = warped_gain( AR2_Q24, warping_Q16, psEnc->sCmn.shapingLPCOrder ); - silk_assert( psEncCtrl->Gains_Q16[ k ] >= 0 ); - if ( silk_SMULWW( silk_RSHIFT_ROUND( psEncCtrl->Gains_Q16[ k ], 1 ), gain_mult_Q16 ) >= ( silk_int32_MAX >> 1 ) ) { - psEncCtrl->Gains_Q16[ k ] = silk_int32_MAX; + gain_mult_Q16 = warped_gain( AR_Q24, warping_Q16, psEnc->sCmn.shapingLPCOrder ); + silk_assert( psEncCtrl->Gains_Q16[ k ] > 0 ); + if( psEncCtrl->Gains_Q16[ k ] < SILK_FIX_CONST( 0.25, 16 ) ) { + psEncCtrl->Gains_Q16[ k ] = silk_SMULWW( psEncCtrl->Gains_Q16[ k ], gain_mult_Q16 ); } else { - psEncCtrl->Gains_Q16[ k ] = silk_SMULWW( psEncCtrl->Gains_Q16[ k ], gain_mult_Q16 ); + psEncCtrl->Gains_Q16[ k ] = silk_SMULWW( silk_RSHIFT_ROUND( psEncCtrl->Gains_Q16[ k ], 1 ), gain_mult_Q16 ); + if ( psEncCtrl->Gains_Q16[ k ] >= ( silk_int32_MAX >> 1 ) ) { + psEncCtrl->Gains_Q16[ k ] = silk_int32_MAX; + } else { + psEncCtrl->Gains_Q16[ k ] = silk_LSHIFT32( psEncCtrl->Gains_Q16[ k ], 1 ); + } } + silk_assert( psEncCtrl->Gains_Q16[ k ] > 0 ); } - /* Bandwidth expansion for synthesis filter shaping */ - silk_bwexpander_32( AR2_Q24, psEnc->sCmn.shapingLPCOrder, BWExp2_Q16 ); - - /* Compute noise shaping filter coefficients */ - silk_memcpy( AR1_Q24, AR2_Q24, psEnc->sCmn.shapingLPCOrder * sizeof( opus_int32 ) ); - - /* Bandwidth expansion for analysis filter shaping */ - silk_assert( BWExp1_Q16 <= SILK_FIX_CONST( 1.0, 16 ) ); - silk_bwexpander_32( AR1_Q24, psEnc->sCmn.shapingLPCOrder, BWExp1_Q16 ); - - /* Ratio of prediction gains, in energy domain */ - pre_nrg_Q30 = silk_LPC_inverse_pred_gain_Q24( AR2_Q24, psEnc->sCmn.shapingLPCOrder ); - nrg = silk_LPC_inverse_pred_gain_Q24( AR1_Q24, psEnc->sCmn.shapingLPCOrder ); - - /*psEncCtrl->GainsPre[ k ] = 1.0f - 0.7f * ( 1.0f - pre_nrg / nrg ) = 0.3f + 0.7f * pre_nrg / nrg;*/ - pre_nrg_Q30 = silk_LSHIFT32( silk_SMULWB( pre_nrg_Q30, SILK_FIX_CONST( 0.7, 15 ) ), 1 ); - psEncCtrl->GainsPre_Q14[ k ] = ( opus_int ) SILK_FIX_CONST( 0.3, 14 ) + silk_DIV32_varQ( pre_nrg_Q30, nrg, 14 ); + /* Bandwidth expansion */ + silk_bwexpander_32( AR_Q24, psEnc->sCmn.shapingLPCOrder, BWExp_Q16 ); - /* Convert to monic warped prediction coefficients and limit absolute values */ - limit_warped_coefs( AR2_Q24, AR1_Q24, warping_Q16, SILK_FIX_CONST( 3.999, 24 ), psEnc->sCmn.shapingLPCOrder ); + if( psEnc->sCmn.warping_Q16 > 0 ) { + /* Convert to monic warped prediction coefficients and limit absolute values */ + limit_warped_coefs( AR_Q24, warping_Q16, SILK_FIX_CONST( 3.999, 24 ), psEnc->sCmn.shapingLPCOrder ); - /* Convert from Q24 to Q13 and store in int16 */ - for( i = 0; i < psEnc->sCmn.shapingLPCOrder; i++ ) { - psEncCtrl->AR1_Q13[ k * MAX_SHAPE_LPC_ORDER + i ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( AR1_Q24[ i ], 11 ) ); - psEncCtrl->AR2_Q13[ k * MAX_SHAPE_LPC_ORDER + i ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( AR2_Q24[ i ], 11 ) ); + /* Convert from Q24 to Q13 and store in int16 */ + for( i = 0; i < psEnc->sCmn.shapingLPCOrder; i++ ) { + psEncCtrl->AR_Q13[ k * MAX_SHAPE_LPC_ORDER + i ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( AR_Q24[ i ], 11 ) ); + } + } else { + silk_LPC_fit( &psEncCtrl->AR_Q13[ k * MAX_SHAPE_LPC_ORDER ], AR_Q24, 13, 24, psEnc->sCmn.shapingLPCOrder ); } } @@ -368,11 +340,6 @@ void silk_noise_shape_analysis_FIX( psEncCtrl->Gains_Q16[ k ] = silk_ADD_POS_SAT32( psEncCtrl->Gains_Q16[ k ], gain_add_Q16 ); } - gain_mult_Q16 = SILK_FIX_CONST( 1.0, 16 ) + silk_RSHIFT_ROUND( silk_MLA( SILK_FIX_CONST( INPUT_TILT, 26 ), - psEncCtrl->coding_quality_Q14, SILK_FIX_CONST( HIGH_RATE_INPUT_TILT, 12 ) ), 10 ); - for( k = 0; k < psEnc->sCmn.nb_subfr; k++ ) { - psEncCtrl->GainsPre_Q14[ k ] = silk_SMULWB( gain_mult_Q16, psEncCtrl->GainsPre_Q14[ k ] ); - } /************************************************/ /* Control low-frequency shaping and noise tilt */ @@ -410,14 +377,6 @@ void silk_noise_shape_analysis_FIX( /****************************/ /* HARMONIC SHAPING CONTROL */ /****************************/ - /* Control boosting of harmonic frequencies */ - HarmBoost_Q16 = silk_SMULWB( silk_SMULWB( SILK_FIX_CONST( 1.0, 17 ) - silk_LSHIFT( psEncCtrl->coding_quality_Q14, 3 ), - psEnc->LTPCorr_Q15 ), SILK_FIX_CONST( LOW_RATE_HARMONIC_BOOST, 16 ) ); - - /* More harmonic boost for noisy input signals */ - HarmBoost_Q16 = silk_SMLAWB( HarmBoost_Q16, - SILK_FIX_CONST( 1.0, 16 ) - silk_LSHIFT( psEncCtrl->input_quality_Q14, 2 ), SILK_FIX_CONST( LOW_INPUT_QUALITY_HARMONIC_BOOST, 16 ) ); - if( USE_HARM_SHAPING && psEnc->sCmn.indices.signalType == TYPE_VOICED ) { /* More harmonic noise shaping for high bitrates or noisy input */ HarmShapeGain_Q16 = silk_SMLAWB( SILK_FIX_CONST( HARMONIC_SHAPING, 16 ), @@ -435,14 +394,11 @@ void silk_noise_shape_analysis_FIX( /* Smooth over subframes */ /*************************/ for( k = 0; k < MAX_NB_SUBFR; k++ ) { - psShapeSt->HarmBoost_smth_Q16 = - silk_SMLAWB( psShapeSt->HarmBoost_smth_Q16, HarmBoost_Q16 - psShapeSt->HarmBoost_smth_Q16, SILK_FIX_CONST( SUBFR_SMTH_COEF, 16 ) ); psShapeSt->HarmShapeGain_smth_Q16 = silk_SMLAWB( psShapeSt->HarmShapeGain_smth_Q16, HarmShapeGain_Q16 - psShapeSt->HarmShapeGain_smth_Q16, SILK_FIX_CONST( SUBFR_SMTH_COEF, 16 ) ); psShapeSt->Tilt_smth_Q16 = silk_SMLAWB( psShapeSt->Tilt_smth_Q16, Tilt_Q16 - psShapeSt->Tilt_smth_Q16, SILK_FIX_CONST( SUBFR_SMTH_COEF, 16 ) ); - psEncCtrl->HarmBoost_Q14[ k ] = ( opus_int )silk_RSHIFT_ROUND( psShapeSt->HarmBoost_smth_Q16, 2 ); psEncCtrl->HarmShapeGain_Q14[ k ] = ( opus_int )silk_RSHIFT_ROUND( psShapeSt->HarmShapeGain_smth_Q16, 2 ); psEncCtrl->Tilt_Q14[ k ] = ( opus_int )silk_RSHIFT_ROUND( psShapeSt->Tilt_smth_Q16, 2 ); } diff --git a/thirdparty/opus/silk/fixed/pitch_analysis_core_FIX.c b/thirdparty/opus/silk/fixed/pitch_analysis_core_FIX.c index 01bb9fc0a8..14729046d2 100644 --- a/thirdparty/opus/silk/fixed/pitch_analysis_core_FIX.c +++ b/thirdparty/opus/silk/fixed/pitch_analysis_core_FIX.c @@ -80,7 +80,7 @@ static void silk_P_Ana_calc_energy_st3( /* FIXED POINT CORE PITCH ANALYSIS FUNCTION */ /*************************************************************/ opus_int silk_pitch_analysis_core( /* O Voicing estimate: 0 voiced, 1 unvoiced */ - const opus_int16 *frame, /* I Signal of length PE_FRAME_LENGTH_MS*Fs_kHz */ + const opus_int16 *frame_unscaled, /* I Signal of length PE_FRAME_LENGTH_MS*Fs_kHz */ opus_int *pitch_out, /* O 4 pitch lag values */ opus_int16 *lagIndex, /* O Lag Index */ opus_int8 *contourIndex, /* O Pitch contour Index */ @@ -94,16 +94,17 @@ opus_int silk_pitch_analysis_core( /* O Voicing estimate: 0 int arch /* I Run-time architecture */ ) { - VARDECL( opus_int16, frame_8kHz ); + VARDECL( opus_int16, frame_8kHz_buf ); VARDECL( opus_int16, frame_4kHz ); + VARDECL( opus_int16, frame_scaled ); opus_int32 filt_state[ 6 ]; - const opus_int16 *input_frame_ptr; + const opus_int16 *frame, *frame_8kHz; opus_int i, k, d, j; VARDECL( opus_int16, C ); VARDECL( opus_int32, xcorr32 ); const opus_int16 *target_ptr, *basis_ptr; - opus_int32 cross_corr, normalizer, energy, shift, energy_basis, energy_target; - opus_int d_srch[ PE_D_SRCH_LENGTH ], Cmax, length_d_srch, length_d_comp; + opus_int32 cross_corr, normalizer, energy, energy_basis, energy_target; + opus_int d_srch[ PE_D_SRCH_LENGTH ], Cmax, length_d_srch, length_d_comp, shift; VARDECL( opus_int16, d_comp ); opus_int32 sum, threshold, lag_counter; opus_int CBimax, CBimax_new, CBimax_old, lag, start_lag, end_lag, lag_new; @@ -119,12 +120,13 @@ opus_int silk_pitch_analysis_core( /* O Voicing estimate: 0 opus_int32 delta_lag_log2_sqr_Q7, lag_log2_Q7, prevLag_log2_Q7, prev_lag_bias_Q13; const opus_int8 *Lag_CB_ptr; SAVE_STACK; + /* Check for valid sampling frequency */ - silk_assert( Fs_kHz == 8 || Fs_kHz == 12 || Fs_kHz == 16 ); + celt_assert( Fs_kHz == 8 || Fs_kHz == 12 || Fs_kHz == 16 ); /* Check for valid complexity setting */ - silk_assert( complexity >= SILK_PE_MIN_COMPLEX ); - silk_assert( complexity <= SILK_PE_MAX_COMPLEX ); + celt_assert( complexity >= SILK_PE_MIN_COMPLEX ); + celt_assert( complexity <= SILK_PE_MAX_COMPLEX ); silk_assert( search_thres1_Q16 >= 0 && search_thres1_Q16 <= (1<<16) ); silk_assert( search_thres2_Q13 >= 0 && search_thres2_Q13 <= (1<<13) ); @@ -137,17 +139,33 @@ opus_int silk_pitch_analysis_core( /* O Voicing estimate: 0 min_lag = PE_MIN_LAG_MS * Fs_kHz; max_lag = PE_MAX_LAG_MS * Fs_kHz - 1; + /* Downscale input if necessary */ + silk_sum_sqr_shift( &energy, &shift, frame_unscaled, frame_length ); + shift += 3 - silk_CLZ32( energy ); /* at least two bits headroom */ + ALLOC( frame_scaled, frame_length, opus_int16 ); + if( shift > 0 ) { + shift = silk_RSHIFT( shift + 1, 1 ); + for( i = 0; i < frame_length; i++ ) { + frame_scaled[ i ] = silk_RSHIFT( frame_unscaled[ i ], shift ); + } + frame = frame_scaled; + } else { + frame = frame_unscaled; + } + + ALLOC( frame_8kHz_buf, ( Fs_kHz == 8 ) ? 1 : frame_length_8kHz, opus_int16 ); /* Resample from input sampled at Fs_kHz to 8 kHz */ - ALLOC( frame_8kHz, frame_length_8kHz, opus_int16 ); if( Fs_kHz == 16 ) { silk_memset( filt_state, 0, 2 * sizeof( opus_int32 ) ); - silk_resampler_down2( filt_state, frame_8kHz, frame, frame_length ); + silk_resampler_down2( filt_state, frame_8kHz_buf, frame, frame_length ); + frame_8kHz = frame_8kHz_buf; } else if( Fs_kHz == 12 ) { silk_memset( filt_state, 0, 6 * sizeof( opus_int32 ) ); - silk_resampler_down2_3( filt_state, frame_8kHz, frame, frame_length ); + silk_resampler_down2_3( filt_state, frame_8kHz_buf, frame, frame_length ); + frame_8kHz = frame_8kHz_buf; } else { - silk_assert( Fs_kHz == 8 ); - silk_memcpy( frame_8kHz, frame, frame_length_8kHz * sizeof(opus_int16) ); + celt_assert( Fs_kHz == 8 ); + frame_8kHz = frame; } /* Decimate again to 4 kHz */ @@ -160,19 +178,6 @@ opus_int silk_pitch_analysis_core( /* O Voicing estimate: 0 frame_4kHz[ i ] = silk_ADD_SAT16( frame_4kHz[ i ], frame_4kHz[ i - 1 ] ); } - /******************************************************************************* - ** Scale 4 kHz signal down to prevent correlations measures from overflowing - ** find scaling as max scaling for each 8kHz(?) subframe - *******************************************************************************/ - - /* Inner product is calculated with different lengths, so scale for the worst case */ - silk_sum_sqr_shift( &energy, &shift, frame_4kHz, frame_length_4kHz ); - if( shift > 0 ) { - shift = silk_RSHIFT( shift, 1 ); - for( i = 0; i < frame_length_4kHz; i++ ) { - frame_4kHz[ i ] = silk_RSHIFT( frame_4kHz[ i ], shift ); - } - } /****************************************************************************** * FIRST STAGE, operating in 4 khz @@ -183,14 +188,14 @@ opus_int silk_pitch_analysis_core( /* O Voicing estimate: 0 target_ptr = &frame_4kHz[ silk_LSHIFT( SF_LENGTH_4KHZ, 2 ) ]; for( k = 0; k < nb_subfr >> 1; k++ ) { /* Check that we are within range of the array */ - silk_assert( target_ptr >= frame_4kHz ); - silk_assert( target_ptr + SF_LENGTH_8KHZ <= frame_4kHz + frame_length_4kHz ); + celt_assert( target_ptr >= frame_4kHz ); + celt_assert( target_ptr + SF_LENGTH_8KHZ <= frame_4kHz + frame_length_4kHz ); basis_ptr = target_ptr - MIN_LAG_4KHZ; /* Check that we are within range of the array */ - silk_assert( basis_ptr >= frame_4kHz ); - silk_assert( basis_ptr + SF_LENGTH_8KHZ <= frame_4kHz + frame_length_4kHz ); + celt_assert( basis_ptr >= frame_4kHz ); + celt_assert( basis_ptr + SF_LENGTH_8KHZ <= frame_4kHz + frame_length_4kHz ); celt_pitch_xcorr( target_ptr, target_ptr - MAX_LAG_4KHZ, xcorr32, SF_LENGTH_8KHZ, MAX_LAG_4KHZ - MIN_LAG_4KHZ + 1, arch ); @@ -244,7 +249,7 @@ opus_int silk_pitch_analysis_core( /* O Voicing estimate: 0 /* Sort */ length_d_srch = silk_ADD_LSHIFT32( 4, complexity, 1 ); - silk_assert( 3 * length_d_srch <= PE_D_SRCH_LENGTH ); + celt_assert( 3 * length_d_srch <= PE_D_SRCH_LENGTH ); silk_insertion_sort_decreasing_int16( C, d_srch, CSTRIDE_4KHZ, length_d_srch ); @@ -269,7 +274,7 @@ opus_int silk_pitch_analysis_core( /* O Voicing estimate: 0 break; } } - silk_assert( length_d_srch > 0 ); + celt_assert( length_d_srch > 0 ); ALLOC( d_comp, D_COMP_STRIDE, opus_int16 ); for( i = D_COMP_MIN; i < D_COMP_MAX; i++ ) { @@ -311,18 +316,6 @@ opus_int silk_pitch_analysis_core( /* O Voicing estimate: 0 ** SECOND STAGE, operating at 8 kHz, on lag sections with high correlation *************************************************************************************/ - /****************************************************************************** - ** Scale signal down to avoid correlations measures from overflowing - *******************************************************************************/ - /* find scaling as max scaling for each subframe */ - silk_sum_sqr_shift( &energy, &shift, frame_8kHz, frame_length_8kHz ); - if( shift > 0 ) { - shift = silk_RSHIFT( shift, 1 ); - for( i = 0; i < frame_length_8kHz; i++ ) { - frame_8kHz[ i ] = silk_RSHIFT( frame_8kHz[ i ], shift ); - } - } - /********************************************************************************* * Find energy of each subframe projected onto its history, for a range of delays *********************************************************************************/ @@ -332,8 +325,8 @@ opus_int silk_pitch_analysis_core( /* O Voicing estimate: 0 for( k = 0; k < nb_subfr; k++ ) { /* Check that we are within range of the array */ - silk_assert( target_ptr >= frame_8kHz ); - silk_assert( target_ptr + SF_LENGTH_8KHZ <= frame_8kHz + frame_length_8kHz ); + celt_assert( target_ptr >= frame_8kHz ); + celt_assert( target_ptr + SF_LENGTH_8KHZ <= frame_8kHz + frame_length_8kHz ); energy_target = silk_ADD32( silk_inner_prod_aligned( target_ptr, target_ptr, SF_LENGTH_8KHZ, arch ), 1 ); for( j = 0; j < length_d_comp; j++ ) { @@ -462,24 +455,6 @@ opus_int silk_pitch_analysis_core( /* O Voicing estimate: 0 silk_assert( *LTPCorr_Q15 >= 0 ); if( Fs_kHz > 8 ) { - VARDECL( opus_int16, scratch_mem ); - /***************************************************************************/ - /* Scale input signal down to avoid correlations measures from overflowing */ - /***************************************************************************/ - /* find scaling as max scaling for each subframe */ - silk_sum_sqr_shift( &energy, &shift, frame, frame_length ); - ALLOC( scratch_mem, shift > 0 ? frame_length : ALLOC_NONE, opus_int16 ); - if( shift > 0 ) { - /* Move signal to scratch mem because the input signal should be unchanged */ - shift = silk_RSHIFT( shift, 1 ); - for( i = 0; i < frame_length; i++ ) { - scratch_mem[ i ] = silk_RSHIFT( frame[ i ], shift ); - } - input_frame_ptr = scratch_mem; - } else { - input_frame_ptr = frame; - } - /* Search in original signal */ CBimax_old = CBimax; @@ -519,14 +494,14 @@ opus_int silk_pitch_analysis_core( /* O Voicing estimate: 0 /* Calculate the correlations and energies needed in stage 3 */ ALLOC( energies_st3, nb_subfr * nb_cbk_search, silk_pe_stage3_vals ); ALLOC( cross_corr_st3, nb_subfr * nb_cbk_search, silk_pe_stage3_vals ); - silk_P_Ana_calc_corr_st3( cross_corr_st3, input_frame_ptr, start_lag, sf_length, nb_subfr, complexity, arch ); - silk_P_Ana_calc_energy_st3( energies_st3, input_frame_ptr, start_lag, sf_length, nb_subfr, complexity, arch ); + silk_P_Ana_calc_corr_st3( cross_corr_st3, frame, start_lag, sf_length, nb_subfr, complexity, arch ); + silk_P_Ana_calc_energy_st3( energies_st3, frame, start_lag, sf_length, nb_subfr, complexity, arch ); lag_counter = 0; silk_assert( lag == silk_SAT16( lag ) ); contour_bias_Q15 = silk_DIV32_16( SILK_FIX_CONST( PE_FLATCONTOUR_BIAS, 15 ), lag ); - target_ptr = &input_frame_ptr[ PE_LTP_MEM_LENGTH_MS * Fs_kHz ]; + target_ptr = &frame[ PE_LTP_MEM_LENGTH_MS * Fs_kHz ]; energy_target = silk_ADD32( silk_inner_prod_aligned( target_ptr, target_ptr, nb_subfr * sf_length, arch ), 1 ); for( d = start_lag; d <= end_lag; d++ ) { for( j = 0; j < nb_cbk_search; j++ ) { @@ -575,7 +550,7 @@ opus_int silk_pitch_analysis_core( /* O Voicing estimate: 0 *lagIndex = (opus_int16)( lag - MIN_LAG_8KHZ ); *contourIndex = (opus_int8)CBimax; } - silk_assert( *lagIndex >= 0 ); + celt_assert( *lagIndex >= 0 ); /* return as voiced */ RESTORE_STACK; return 0; @@ -612,8 +587,8 @@ static void silk_P_Ana_calc_corr_st3( const opus_int8 *Lag_range_ptr, *Lag_CB_ptr; SAVE_STACK; - silk_assert( complexity >= SILK_PE_MIN_COMPLEX ); - silk_assert( complexity <= SILK_PE_MAX_COMPLEX ); + celt_assert( complexity >= SILK_PE_MIN_COMPLEX ); + celt_assert( complexity <= SILK_PE_MAX_COMPLEX ); if( nb_subfr == PE_MAX_NB_SUBFR ) { Lag_range_ptr = &silk_Lag_range_stage3[ complexity ][ 0 ][ 0 ]; @@ -621,7 +596,7 @@ static void silk_P_Ana_calc_corr_st3( nb_cbk_search = silk_nb_cbk_searchs_stage3[ complexity ]; cbk_size = PE_NB_CBKS_STAGE3_MAX; } else { - silk_assert( nb_subfr == PE_MAX_NB_SUBFR >> 1); + celt_assert( nb_subfr == PE_MAX_NB_SUBFR >> 1); Lag_range_ptr = &silk_Lag_range_stage3_10_ms[ 0 ][ 0 ]; Lag_CB_ptr = &silk_CB_lags_stage3_10_ms[ 0 ][ 0 ]; nb_cbk_search = PE_NB_CBKS_STAGE3_10MS; @@ -637,7 +612,7 @@ static void silk_P_Ana_calc_corr_st3( /* Calculate the correlations for each subframe */ lag_low = matrix_ptr( Lag_range_ptr, k, 0, 2 ); lag_high = matrix_ptr( Lag_range_ptr, k, 1, 2 ); - silk_assert(lag_high-lag_low+1 <= SCRATCH_SIZE); + celt_assert(lag_high-lag_low+1 <= SCRATCH_SIZE); celt_pitch_xcorr( target_ptr, target_ptr - start_lag - lag_high, xcorr32, sf_length, lag_high - lag_low + 1, arch ); for( j = lag_low; j <= lag_high; j++ ) { silk_assert( lag_counter < SCRATCH_SIZE ); @@ -684,8 +659,8 @@ static void silk_P_Ana_calc_energy_st3( const opus_int8 *Lag_range_ptr, *Lag_CB_ptr; SAVE_STACK; - silk_assert( complexity >= SILK_PE_MIN_COMPLEX ); - silk_assert( complexity <= SILK_PE_MAX_COMPLEX ); + celt_assert( complexity >= SILK_PE_MIN_COMPLEX ); + celt_assert( complexity <= SILK_PE_MAX_COMPLEX ); if( nb_subfr == PE_MAX_NB_SUBFR ) { Lag_range_ptr = &silk_Lag_range_stage3[ complexity ][ 0 ][ 0 ]; @@ -693,7 +668,7 @@ static void silk_P_Ana_calc_energy_st3( nb_cbk_search = silk_nb_cbk_searchs_stage3[ complexity ]; cbk_size = PE_NB_CBKS_STAGE3_MAX; } else { - silk_assert( nb_subfr == PE_MAX_NB_SUBFR >> 1); + celt_assert( nb_subfr == PE_MAX_NB_SUBFR >> 1); Lag_range_ptr = &silk_Lag_range_stage3_10_ms[ 0 ][ 0 ]; Lag_CB_ptr = &silk_CB_lags_stage3_10_ms[ 0 ][ 0 ]; nb_cbk_search = PE_NB_CBKS_STAGE3_10MS; diff --git a/thirdparty/opus/silk/fixed/prefilter_FIX.c b/thirdparty/opus/silk/fixed/prefilter_FIX.c deleted file mode 100644 index 6a8e35152e..0000000000 --- a/thirdparty/opus/silk/fixed/prefilter_FIX.c +++ /dev/null @@ -1,221 +0,0 @@ -/*********************************************************************** -Copyright (c) 2006-2011, Skype Limited. All rights reserved. -Redistribution and use in source and binary forms, with or without -modification, are permitted provided that the following conditions -are met: -- Redistributions of source code must retain the above copyright notice, -this list of conditions and the following disclaimer. -- Redistributions in binary form must reproduce the above copyright -notice, this list of conditions and the following disclaimer in the -documentation and/or other materials provided with the distribution. -- Neither the name of Internet Society, IETF or IETF Trust, nor the -names of specific contributors, may be used to endorse or promote -products derived from this software without specific prior written -permission. -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE -LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE -POSSIBILITY OF SUCH DAMAGE. -***********************************************************************/ - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#include "main_FIX.h" -#include "stack_alloc.h" -#include "tuning_parameters.h" - -#if defined(MIPSr1_ASM) -#include "mips/prefilter_FIX_mipsr1.h" -#endif - - -#if !defined(OVERRIDE_silk_warped_LPC_analysis_filter_FIX) -#define silk_warped_LPC_analysis_filter_FIX(state, res_Q2, coef_Q13, input, lambda_Q16, length, order, arch) \ - ((void)(arch),silk_warped_LPC_analysis_filter_FIX_c(state, res_Q2, coef_Q13, input, lambda_Q16, length, order)) -#endif - -/* Prefilter for finding Quantizer input signal */ -static OPUS_INLINE void silk_prefilt_FIX( - silk_prefilter_state_FIX *P, /* I/O state */ - opus_int32 st_res_Q12[], /* I short term residual signal */ - opus_int32 xw_Q3[], /* O prefiltered signal */ - opus_int32 HarmShapeFIRPacked_Q12, /* I Harmonic shaping coeficients */ - opus_int Tilt_Q14, /* I Tilt shaping coeficient */ - opus_int32 LF_shp_Q14, /* I Low-frequancy shaping coeficients */ - opus_int lag, /* I Lag for harmonic shaping */ - opus_int length /* I Length of signals */ -); - -void silk_warped_LPC_analysis_filter_FIX_c( - opus_int32 state[], /* I/O State [order + 1] */ - opus_int32 res_Q2[], /* O Residual signal [length] */ - const opus_int16 coef_Q13[], /* I Coefficients [order] */ - const opus_int16 input[], /* I Input signal [length] */ - const opus_int16 lambda_Q16, /* I Warping factor */ - const opus_int length, /* I Length of input signal */ - const opus_int order /* I Filter order (even) */ -) -{ - opus_int n, i; - opus_int32 acc_Q11, tmp1, tmp2; - - /* Order must be even */ - silk_assert( ( order & 1 ) == 0 ); - - for( n = 0; n < length; n++ ) { - /* Output of lowpass section */ - tmp2 = silk_SMLAWB( state[ 0 ], state[ 1 ], lambda_Q16 ); - state[ 0 ] = silk_LSHIFT( input[ n ], 14 ); - /* Output of allpass section */ - tmp1 = silk_SMLAWB( state[ 1 ], state[ 2 ] - tmp2, lambda_Q16 ); - state[ 1 ] = tmp2; - acc_Q11 = silk_RSHIFT( order, 1 ); - acc_Q11 = silk_SMLAWB( acc_Q11, tmp2, coef_Q13[ 0 ] ); - /* Loop over allpass sections */ - for( i = 2; i < order; i += 2 ) { - /* Output of allpass section */ - tmp2 = silk_SMLAWB( state[ i ], state[ i + 1 ] - tmp1, lambda_Q16 ); - state[ i ] = tmp1; - acc_Q11 = silk_SMLAWB( acc_Q11, tmp1, coef_Q13[ i - 1 ] ); - /* Output of allpass section */ - tmp1 = silk_SMLAWB( state[ i + 1 ], state[ i + 2 ] - tmp2, lambda_Q16 ); - state[ i + 1 ] = tmp2; - acc_Q11 = silk_SMLAWB( acc_Q11, tmp2, coef_Q13[ i ] ); - } - state[ order ] = tmp1; - acc_Q11 = silk_SMLAWB( acc_Q11, tmp1, coef_Q13[ order - 1 ] ); - res_Q2[ n ] = silk_LSHIFT( (opus_int32)input[ n ], 2 ) - silk_RSHIFT_ROUND( acc_Q11, 9 ); - } -} - -void silk_prefilter_FIX( - silk_encoder_state_FIX *psEnc, /* I/O Encoder state */ - const silk_encoder_control_FIX *psEncCtrl, /* I Encoder control */ - opus_int32 xw_Q3[], /* O Weighted signal */ - const opus_int16 x[] /* I Speech signal */ -) -{ - silk_prefilter_state_FIX *P = &psEnc->sPrefilt; - opus_int j, k, lag; - opus_int32 tmp_32; - const opus_int16 *AR1_shp_Q13; - const opus_int16 *px; - opus_int32 *pxw_Q3; - opus_int HarmShapeGain_Q12, Tilt_Q14; - opus_int32 HarmShapeFIRPacked_Q12, LF_shp_Q14; - VARDECL( opus_int32, x_filt_Q12 ); - VARDECL( opus_int32, st_res_Q2 ); - opus_int16 B_Q10[ 2 ]; - SAVE_STACK; - - /* Set up pointers */ - px = x; - pxw_Q3 = xw_Q3; - lag = P->lagPrev; - ALLOC( x_filt_Q12, psEnc->sCmn.subfr_length, opus_int32 ); - ALLOC( st_res_Q2, psEnc->sCmn.subfr_length, opus_int32 ); - for( k = 0; k < psEnc->sCmn.nb_subfr; k++ ) { - /* Update Variables that change per sub frame */ - if( psEnc->sCmn.indices.signalType == TYPE_VOICED ) { - lag = psEncCtrl->pitchL[ k ]; - } - - /* Noise shape parameters */ - HarmShapeGain_Q12 = silk_SMULWB( (opus_int32)psEncCtrl->HarmShapeGain_Q14[ k ], 16384 - psEncCtrl->HarmBoost_Q14[ k ] ); - silk_assert( HarmShapeGain_Q12 >= 0 ); - HarmShapeFIRPacked_Q12 = silk_RSHIFT( HarmShapeGain_Q12, 2 ); - HarmShapeFIRPacked_Q12 |= silk_LSHIFT( (opus_int32)silk_RSHIFT( HarmShapeGain_Q12, 1 ), 16 ); - Tilt_Q14 = psEncCtrl->Tilt_Q14[ k ]; - LF_shp_Q14 = psEncCtrl->LF_shp_Q14[ k ]; - AR1_shp_Q13 = &psEncCtrl->AR1_Q13[ k * MAX_SHAPE_LPC_ORDER ]; - - /* Short term FIR filtering*/ - silk_warped_LPC_analysis_filter_FIX( P->sAR_shp, st_res_Q2, AR1_shp_Q13, px, - psEnc->sCmn.warping_Q16, psEnc->sCmn.subfr_length, psEnc->sCmn.shapingLPCOrder, psEnc->sCmn.arch ); - - /* Reduce (mainly) low frequencies during harmonic emphasis */ - B_Q10[ 0 ] = silk_RSHIFT_ROUND( psEncCtrl->GainsPre_Q14[ k ], 4 ); - tmp_32 = silk_SMLABB( SILK_FIX_CONST( INPUT_TILT, 26 ), psEncCtrl->HarmBoost_Q14[ k ], HarmShapeGain_Q12 ); /* Q26 */ - tmp_32 = silk_SMLABB( tmp_32, psEncCtrl->coding_quality_Q14, SILK_FIX_CONST( HIGH_RATE_INPUT_TILT, 12 ) ); /* Q26 */ - tmp_32 = silk_SMULWB( tmp_32, -psEncCtrl->GainsPre_Q14[ k ] ); /* Q24 */ - tmp_32 = silk_RSHIFT_ROUND( tmp_32, 14 ); /* Q10 */ - B_Q10[ 1 ]= silk_SAT16( tmp_32 ); - x_filt_Q12[ 0 ] = silk_MLA( silk_MUL( st_res_Q2[ 0 ], B_Q10[ 0 ] ), P->sHarmHP_Q2, B_Q10[ 1 ] ); - for( j = 1; j < psEnc->sCmn.subfr_length; j++ ) { - x_filt_Q12[ j ] = silk_MLA( silk_MUL( st_res_Q2[ j ], B_Q10[ 0 ] ), st_res_Q2[ j - 1 ], B_Q10[ 1 ] ); - } - P->sHarmHP_Q2 = st_res_Q2[ psEnc->sCmn.subfr_length - 1 ]; - - silk_prefilt_FIX( P, x_filt_Q12, pxw_Q3, HarmShapeFIRPacked_Q12, Tilt_Q14, LF_shp_Q14, lag, psEnc->sCmn.subfr_length ); - - px += psEnc->sCmn.subfr_length; - pxw_Q3 += psEnc->sCmn.subfr_length; - } - - P->lagPrev = psEncCtrl->pitchL[ psEnc->sCmn.nb_subfr - 1 ]; - RESTORE_STACK; -} - -#ifndef OVERRIDE_silk_prefilt_FIX -/* Prefilter for finding Quantizer input signal */ -static OPUS_INLINE void silk_prefilt_FIX( - silk_prefilter_state_FIX *P, /* I/O state */ - opus_int32 st_res_Q12[], /* I short term residual signal */ - opus_int32 xw_Q3[], /* O prefiltered signal */ - opus_int32 HarmShapeFIRPacked_Q12, /* I Harmonic shaping coeficients */ - opus_int Tilt_Q14, /* I Tilt shaping coeficient */ - opus_int32 LF_shp_Q14, /* I Low-frequancy shaping coeficients */ - opus_int lag, /* I Lag for harmonic shaping */ - opus_int length /* I Length of signals */ -) -{ - opus_int i, idx, LTP_shp_buf_idx; - opus_int32 n_LTP_Q12, n_Tilt_Q10, n_LF_Q10; - opus_int32 sLF_MA_shp_Q12, sLF_AR_shp_Q12; - opus_int16 *LTP_shp_buf; - - /* To speed up use temp variables instead of using the struct */ - LTP_shp_buf = P->sLTP_shp; - LTP_shp_buf_idx = P->sLTP_shp_buf_idx; - sLF_AR_shp_Q12 = P->sLF_AR_shp_Q12; - sLF_MA_shp_Q12 = P->sLF_MA_shp_Q12; - - for( i = 0; i < length; i++ ) { - if( lag > 0 ) { - /* unrolled loop */ - silk_assert( HARM_SHAPE_FIR_TAPS == 3 ); - idx = lag + LTP_shp_buf_idx; - n_LTP_Q12 = silk_SMULBB( LTP_shp_buf[ ( idx - HARM_SHAPE_FIR_TAPS / 2 - 1) & LTP_MASK ], HarmShapeFIRPacked_Q12 ); - n_LTP_Q12 = silk_SMLABT( n_LTP_Q12, LTP_shp_buf[ ( idx - HARM_SHAPE_FIR_TAPS / 2 ) & LTP_MASK ], HarmShapeFIRPacked_Q12 ); - n_LTP_Q12 = silk_SMLABB( n_LTP_Q12, LTP_shp_buf[ ( idx - HARM_SHAPE_FIR_TAPS / 2 + 1) & LTP_MASK ], HarmShapeFIRPacked_Q12 ); - } else { - n_LTP_Q12 = 0; - } - - n_Tilt_Q10 = silk_SMULWB( sLF_AR_shp_Q12, Tilt_Q14 ); - n_LF_Q10 = silk_SMLAWB( silk_SMULWT( sLF_AR_shp_Q12, LF_shp_Q14 ), sLF_MA_shp_Q12, LF_shp_Q14 ); - - sLF_AR_shp_Q12 = silk_SUB32( st_res_Q12[ i ], silk_LSHIFT( n_Tilt_Q10, 2 ) ); - sLF_MA_shp_Q12 = silk_SUB32( sLF_AR_shp_Q12, silk_LSHIFT( n_LF_Q10, 2 ) ); - - LTP_shp_buf_idx = ( LTP_shp_buf_idx - 1 ) & LTP_MASK; - LTP_shp_buf[ LTP_shp_buf_idx ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( sLF_MA_shp_Q12, 12 ) ); - - xw_Q3[i] = silk_RSHIFT_ROUND( silk_SUB32( sLF_MA_shp_Q12, n_LTP_Q12 ), 9 ); - } - - /* Copy temp variable back to state */ - P->sLF_AR_shp_Q12 = sLF_AR_shp_Q12; - P->sLF_MA_shp_Q12 = sLF_MA_shp_Q12; - P->sLTP_shp_buf_idx = LTP_shp_buf_idx; -} -#endif /* OVERRIDE_silk_prefilt_FIX */ diff --git a/thirdparty/opus/silk/fixed/residual_energy16_FIX.c b/thirdparty/opus/silk/fixed/residual_energy16_FIX.c index ebffb2a66f..7f130f3d3d 100644 --- a/thirdparty/opus/silk/fixed/residual_energy16_FIX.c +++ b/thirdparty/opus/silk/fixed/residual_energy16_FIX.c @@ -47,10 +47,10 @@ opus_int32 silk_residual_energy16_covar_FIX( const opus_int32 *pRow; /* Safety checks */ - silk_assert( D >= 0 ); - silk_assert( D <= 16 ); - silk_assert( cQ > 0 ); - silk_assert( cQ < 16 ); + celt_assert( D >= 0 ); + celt_assert( D <= 16 ); + celt_assert( cQ > 0 ); + celt_assert( cQ < 16 ); lshifts = 16 - cQ; Qxtra = lshifts; diff --git a/thirdparty/opus/silk/fixed/residual_energy_FIX.c b/thirdparty/opus/silk/fixed/residual_energy_FIX.c index 41f74778e8..6c7cade9a0 100644 --- a/thirdparty/opus/silk/fixed/residual_energy_FIX.c +++ b/thirdparty/opus/silk/fixed/residual_energy_FIX.c @@ -58,7 +58,7 @@ void silk_residual_energy_FIX( /* Filter input to create the LPC residual for each frame half, and measure subframe energies */ ALLOC( LPC_res, ( MAX_NB_SUBFR >> 1 ) * offset, opus_int16 ); - silk_assert( ( nb_subfr >> 1 ) * ( MAX_NB_SUBFR >> 1 ) == nb_subfr ); + celt_assert( ( nb_subfr >> 1 ) * ( MAX_NB_SUBFR >> 1 ) == nb_subfr ); for( i = 0; i < nb_subfr >> 1; i++ ) { /* Calculate half frame LPC residual signal including preceding samples */ silk_LPC_analysis_filter( LPC_res, x_ptr, a_Q12[ i ], ( MAX_NB_SUBFR >> 1 ) * offset, LPC_order, arch ); diff --git a/thirdparty/opus/silk/fixed/schur64_FIX.c b/thirdparty/opus/silk/fixed/schur64_FIX.c index 764a10ef3e..4b7e19ea59 100644 --- a/thirdparty/opus/silk/fixed/schur64_FIX.c +++ b/thirdparty/opus/silk/fixed/schur64_FIX.c @@ -43,7 +43,7 @@ opus_int32 silk_schur64( /* O returns residual ene opus_int32 C[ SILK_MAX_ORDER_LPC + 1 ][ 2 ]; opus_int32 Ctmp1_Q30, Ctmp2_Q30, rc_tmp_Q31; - silk_assert( order==6||order==8||order==10||order==12||order==14||order==16 ); + celt_assert( order >= 0 && order <= SILK_MAX_ORDER_LPC ); /* Check for invalid input */ if( c[ 0 ] <= 0 ) { @@ -51,9 +51,10 @@ opus_int32 silk_schur64( /* O returns residual ene return 0; } - for( k = 0; k < order + 1; k++ ) { + k = 0; + do { C[ k ][ 0 ] = C[ k ][ 1 ] = c[ k ]; - } + } while( ++k <= order ); for( k = 0; k < order; k++ ) { /* Check that we won't be getting an unstable rc, otherwise stop here. */ diff --git a/thirdparty/opus/silk/fixed/schur_FIX.c b/thirdparty/opus/silk/fixed/schur_FIX.c index c4c0ef23b4..2840f6b1aa 100644 --- a/thirdparty/opus/silk/fixed/schur_FIX.c +++ b/thirdparty/opus/silk/fixed/schur_FIX.c @@ -43,28 +43,29 @@ opus_int32 silk_schur( /* O Returns residual ene opus_int32 C[ SILK_MAX_ORDER_LPC + 1 ][ 2 ]; opus_int32 Ctmp1, Ctmp2, rc_tmp_Q15; - silk_assert( order==6||order==8||order==10||order==12||order==14||order==16 ); + celt_assert( order >= 0 && order <= SILK_MAX_ORDER_LPC ); /* Get number of leading zeros */ lz = silk_CLZ32( c[ 0 ] ); /* Copy correlations and adjust level to Q30 */ + k = 0; if( lz < 2 ) { /* lz must be 1, so shift one to the right */ - for( k = 0; k < order + 1; k++ ) { + do { C[ k ][ 0 ] = C[ k ][ 1 ] = silk_RSHIFT( c[ k ], 1 ); - } + } while( ++k <= order ); } else if( lz > 2 ) { /* Shift to the left */ lz -= 2; - for( k = 0; k < order + 1; k++ ) { + do { C[ k ][ 0 ] = C[ k ][ 1 ] = silk_LSHIFT( c[ k ], lz ); - } + } while( ++k <= order ); } else { /* No need to shift */ - for( k = 0; k < order + 1; k++ ) { + do { C[ k ][ 0 ] = C[ k ][ 1 ] = c[ k ]; - } + } while( ++k <= order ); } for( k = 0; k < order; k++ ) { diff --git a/thirdparty/opus/silk/fixed/solve_LS_FIX.c b/thirdparty/opus/silk/fixed/solve_LS_FIX.c deleted file mode 100644 index 51d7d49d02..0000000000 --- a/thirdparty/opus/silk/fixed/solve_LS_FIX.c +++ /dev/null @@ -1,249 +0,0 @@ -/*********************************************************************** -Copyright (c) 2006-2011, Skype Limited. All rights reserved. -Redistribution and use in source and binary forms, with or without -modification, are permitted provided that the following conditions -are met: -- Redistributions of source code must retain the above copyright notice, -this list of conditions and the following disclaimer. -- Redistributions in binary form must reproduce the above copyright -notice, this list of conditions and the following disclaimer in the -documentation and/or other materials provided with the distribution. -- Neither the name of Internet Society, IETF or IETF Trust, nor the -names of specific contributors, may be used to endorse or promote -products derived from this software without specific prior written -permission. -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE -LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE -POSSIBILITY OF SUCH DAMAGE. -***********************************************************************/ - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#include "main_FIX.h" -#include "stack_alloc.h" -#include "tuning_parameters.h" - -/*****************************/ -/* Internal function headers */ -/*****************************/ - -typedef struct { - opus_int32 Q36_part; - opus_int32 Q48_part; -} inv_D_t; - -/* Factorize square matrix A into LDL form */ -static OPUS_INLINE void silk_LDL_factorize_FIX( - opus_int32 *A, /* I/O Pointer to Symetric Square Matrix */ - opus_int M, /* I Size of Matrix */ - opus_int32 *L_Q16, /* I/O Pointer to Square Upper triangular Matrix */ - inv_D_t *inv_D /* I/O Pointer to vector holding inverted diagonal elements of D */ -); - -/* Solve Lx = b, when L is lower triangular and has ones on the diagonal */ -static OPUS_INLINE void silk_LS_SolveFirst_FIX( - const opus_int32 *L_Q16, /* I Pointer to Lower Triangular Matrix */ - opus_int M, /* I Dim of Matrix equation */ - const opus_int32 *b, /* I b Vector */ - opus_int32 *x_Q16 /* O x Vector */ -); - -/* Solve L^t*x = b, where L is lower triangular with ones on the diagonal */ -static OPUS_INLINE void silk_LS_SolveLast_FIX( - const opus_int32 *L_Q16, /* I Pointer to Lower Triangular Matrix */ - const opus_int M, /* I Dim of Matrix equation */ - const opus_int32 *b, /* I b Vector */ - opus_int32 *x_Q16 /* O x Vector */ -); - -static OPUS_INLINE void silk_LS_divide_Q16_FIX( - opus_int32 T[], /* I/O Numenator vector */ - inv_D_t *inv_D, /* I 1 / D vector */ - opus_int M /* I dimension */ -); - -/* Solves Ax = b, assuming A is symmetric */ -void silk_solve_LDL_FIX( - opus_int32 *A, /* I Pointer to symetric square matrix A */ - opus_int M, /* I Size of matrix */ - const opus_int32 *b, /* I Pointer to b vector */ - opus_int32 *x_Q16 /* O Pointer to x solution vector */ -) -{ - VARDECL( opus_int32, L_Q16 ); - opus_int32 Y[ MAX_MATRIX_SIZE ]; - inv_D_t inv_D[ MAX_MATRIX_SIZE ]; - SAVE_STACK; - - silk_assert( M <= MAX_MATRIX_SIZE ); - ALLOC( L_Q16, M * M, opus_int32 ); - - /*************************************************** - Factorize A by LDL such that A = L*D*L', - where L is lower triangular with ones on diagonal - ****************************************************/ - silk_LDL_factorize_FIX( A, M, L_Q16, inv_D ); - - /**************************************************** - * substitute D*L'*x = Y. ie: - L*D*L'*x = b => L*Y = b <=> Y = inv(L)*b - ******************************************************/ - silk_LS_SolveFirst_FIX( L_Q16, M, b, Y ); - - /**************************************************** - D*L'*x = Y <=> L'*x = inv(D)*Y, because D is - diagonal just multiply with 1/d_i - ****************************************************/ - silk_LS_divide_Q16_FIX( Y, inv_D, M ); - - /**************************************************** - x = inv(L') * inv(D) * Y - *****************************************************/ - silk_LS_SolveLast_FIX( L_Q16, M, Y, x_Q16 ); - RESTORE_STACK; -} - -static OPUS_INLINE void silk_LDL_factorize_FIX( - opus_int32 *A, /* I/O Pointer to Symetric Square Matrix */ - opus_int M, /* I Size of Matrix */ - opus_int32 *L_Q16, /* I/O Pointer to Square Upper triangular Matrix */ - inv_D_t *inv_D /* I/O Pointer to vector holding inverted diagonal elements of D */ -) -{ - opus_int i, j, k, status, loop_count; - const opus_int32 *ptr1, *ptr2; - opus_int32 diag_min_value, tmp_32, err; - opus_int32 v_Q0[ MAX_MATRIX_SIZE ], D_Q0[ MAX_MATRIX_SIZE ]; - opus_int32 one_div_diag_Q36, one_div_diag_Q40, one_div_diag_Q48; - - silk_assert( M <= MAX_MATRIX_SIZE ); - - status = 1; - diag_min_value = silk_max_32( silk_SMMUL( silk_ADD_SAT32( A[ 0 ], A[ silk_SMULBB( M, M ) - 1 ] ), SILK_FIX_CONST( FIND_LTP_COND_FAC, 31 ) ), 1 << 9 ); - for( loop_count = 0; loop_count < M && status == 1; loop_count++ ) { - status = 0; - for( j = 0; j < M; j++ ) { - ptr1 = matrix_adr( L_Q16, j, 0, M ); - tmp_32 = 0; - for( i = 0; i < j; i++ ) { - v_Q0[ i ] = silk_SMULWW( D_Q0[ i ], ptr1[ i ] ); /* Q0 */ - tmp_32 = silk_SMLAWW( tmp_32, v_Q0[ i ], ptr1[ i ] ); /* Q0 */ - } - tmp_32 = silk_SUB32( matrix_ptr( A, j, j, M ), tmp_32 ); - - if( tmp_32 < diag_min_value ) { - tmp_32 = silk_SUB32( silk_SMULBB( loop_count + 1, diag_min_value ), tmp_32 ); - /* Matrix not positive semi-definite, or ill conditioned */ - for( i = 0; i < M; i++ ) { - matrix_ptr( A, i, i, M ) = silk_ADD32( matrix_ptr( A, i, i, M ), tmp_32 ); - } - status = 1; - break; - } - D_Q0[ j ] = tmp_32; /* always < max(Correlation) */ - - /* two-step division */ - one_div_diag_Q36 = silk_INVERSE32_varQ( tmp_32, 36 ); /* Q36 */ - one_div_diag_Q40 = silk_LSHIFT( one_div_diag_Q36, 4 ); /* Q40 */ - err = silk_SUB32( (opus_int32)1 << 24, silk_SMULWW( tmp_32, one_div_diag_Q40 ) ); /* Q24 */ - one_div_diag_Q48 = silk_SMULWW( err, one_div_diag_Q40 ); /* Q48 */ - - /* Save 1/Ds */ - inv_D[ j ].Q36_part = one_div_diag_Q36; - inv_D[ j ].Q48_part = one_div_diag_Q48; - - matrix_ptr( L_Q16, j, j, M ) = 65536; /* 1.0 in Q16 */ - ptr1 = matrix_adr( A, j, 0, M ); - ptr2 = matrix_adr( L_Q16, j + 1, 0, M ); - for( i = j + 1; i < M; i++ ) { - tmp_32 = 0; - for( k = 0; k < j; k++ ) { - tmp_32 = silk_SMLAWW( tmp_32, v_Q0[ k ], ptr2[ k ] ); /* Q0 */ - } - tmp_32 = silk_SUB32( ptr1[ i ], tmp_32 ); /* always < max(Correlation) */ - - /* tmp_32 / D_Q0[j] : Divide to Q16 */ - matrix_ptr( L_Q16, i, j, M ) = silk_ADD32( silk_SMMUL( tmp_32, one_div_diag_Q48 ), - silk_RSHIFT( silk_SMULWW( tmp_32, one_div_diag_Q36 ), 4 ) ); - - /* go to next column */ - ptr2 += M; - } - } - } - - silk_assert( status == 0 ); -} - -static OPUS_INLINE void silk_LS_divide_Q16_FIX( - opus_int32 T[], /* I/O Numenator vector */ - inv_D_t *inv_D, /* I 1 / D vector */ - opus_int M /* I dimension */ -) -{ - opus_int i; - opus_int32 tmp_32; - opus_int32 one_div_diag_Q36, one_div_diag_Q48; - - for( i = 0; i < M; i++ ) { - one_div_diag_Q36 = inv_D[ i ].Q36_part; - one_div_diag_Q48 = inv_D[ i ].Q48_part; - - tmp_32 = T[ i ]; - T[ i ] = silk_ADD32( silk_SMMUL( tmp_32, one_div_diag_Q48 ), silk_RSHIFT( silk_SMULWW( tmp_32, one_div_diag_Q36 ), 4 ) ); - } -} - -/* Solve Lx = b, when L is lower triangular and has ones on the diagonal */ -static OPUS_INLINE void silk_LS_SolveFirst_FIX( - const opus_int32 *L_Q16, /* I Pointer to Lower Triangular Matrix */ - opus_int M, /* I Dim of Matrix equation */ - const opus_int32 *b, /* I b Vector */ - opus_int32 *x_Q16 /* O x Vector */ -) -{ - opus_int i, j; - const opus_int32 *ptr32; - opus_int32 tmp_32; - - for( i = 0; i < M; i++ ) { - ptr32 = matrix_adr( L_Q16, i, 0, M ); - tmp_32 = 0; - for( j = 0; j < i; j++ ) { - tmp_32 = silk_SMLAWW( tmp_32, ptr32[ j ], x_Q16[ j ] ); - } - x_Q16[ i ] = silk_SUB32( b[ i ], tmp_32 ); - } -} - -/* Solve L^t*x = b, where L is lower triangular with ones on the diagonal */ -static OPUS_INLINE void silk_LS_SolveLast_FIX( - const opus_int32 *L_Q16, /* I Pointer to Lower Triangular Matrix */ - const opus_int M, /* I Dim of Matrix equation */ - const opus_int32 *b, /* I b Vector */ - opus_int32 *x_Q16 /* O x Vector */ -) -{ - opus_int i, j; - const opus_int32 *ptr32; - opus_int32 tmp_32; - - for( i = M - 1; i >= 0; i-- ) { - ptr32 = matrix_adr( L_Q16, 0, i, M ); - tmp_32 = 0; - for( j = M - 1; j > i; j-- ) { - tmp_32 = silk_SMLAWW( tmp_32, ptr32[ silk_SMULBB( j, M ) ], x_Q16[ j ] ); - } - x_Q16[ i ] = silk_SUB32( b[ i ], tmp_32 ); - } -} diff --git a/thirdparty/opus/silk/fixed/structs_FIX.h b/thirdparty/opus/silk/fixed/structs_FIX.h index 3294b25128..2774a97b24 100644 --- a/thirdparty/opus/silk/fixed/structs_FIX.h +++ b/thirdparty/opus/silk/fixed/structs_FIX.h @@ -48,30 +48,16 @@ typedef struct { } silk_shape_state_FIX; /********************************/ -/* Prefilter state */ -/********************************/ -typedef struct { - opus_int16 sLTP_shp[ LTP_BUF_LENGTH ]; - opus_int32 sAR_shp[ MAX_SHAPE_LPC_ORDER + 1 ]; - opus_int sLTP_shp_buf_idx; - opus_int32 sLF_AR_shp_Q12; - opus_int32 sLF_MA_shp_Q12; - opus_int32 sHarmHP_Q2; - opus_int32 rand_seed; - opus_int lagPrev; -} silk_prefilter_state_FIX; - -/********************************/ /* Encoder state FIX */ /********************************/ typedef struct { silk_encoder_state sCmn; /* Common struct, shared with floating-point code */ silk_shape_state_FIX sShape; /* Shape state */ - silk_prefilter_state_FIX sPrefilt; /* Prefilter State */ /* Buffer for find pitch and noise shape analysis */ silk_DWORD_ALIGN opus_int16 x_buf[ 2 * MAX_FRAME_LENGTH + LA_SHAPE_MAX ];/* Buffer for find pitch and noise shape analysis */ opus_int LTPCorr_Q15; /* Normalized correlation from pitch lag estimator */ + opus_int32 resNrgSmth; } silk_encoder_state_FIX; /************************/ @@ -87,11 +73,8 @@ typedef struct { /* Noise shaping parameters */ /* Testing */ - silk_DWORD_ALIGN opus_int16 AR1_Q13[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ]; - silk_DWORD_ALIGN opus_int16 AR2_Q13[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ]; + silk_DWORD_ALIGN opus_int16 AR_Q13[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ]; opus_int32 LF_shp_Q14[ MAX_NB_SUBFR ]; /* Packs two int16 coefficients per int32 value */ - opus_int GainsPre_Q14[ MAX_NB_SUBFR ]; - opus_int HarmBoost_Q14[ MAX_NB_SUBFR ]; opus_int Tilt_Q14[ MAX_NB_SUBFR ]; opus_int HarmShapeGain_Q14[ MAX_NB_SUBFR ]; opus_int Lambda_Q10; @@ -99,7 +82,6 @@ typedef struct { opus_int coding_quality_Q14; /* measures */ - opus_int sparseness_Q8; opus_int32 predGain_Q16; opus_int LTPredCodGain_Q7; opus_int32 ResNrg[ MAX_NB_SUBFR ]; /* Residual energy per subframe */ diff --git a/thirdparty/opus/silk/fixed/warped_autocorrelation_FIX.c b/thirdparty/opus/silk/fixed/warped_autocorrelation_FIX.c index 6ca6c1184d..5c79553bc0 100644 --- a/thirdparty/opus/silk/fixed/warped_autocorrelation_FIX.c +++ b/thirdparty/opus/silk/fixed/warped_autocorrelation_FIX.c @@ -31,17 +31,14 @@ POSSIBILITY OF SUCH DAMAGE. #include "main_FIX.h" -#define QC 10 -#define QS 14 - #if defined(MIPSr1_ASM) #include "mips/warped_autocorrelation_FIX_mipsr1.h" #endif -#ifndef OVERRIDE_silk_warped_autocorrelation_FIX /* Autocorrelations for a warped frequency axis */ -void silk_warped_autocorrelation_FIX( +#ifndef OVERRIDE_silk_warped_autocorrelation_FIX_c +void silk_warped_autocorrelation_FIX_c( opus_int32 *corr, /* O Result [order + 1] */ opus_int *scale, /* O Scaling of the correlation vector */ const opus_int16 *input, /* I Input data to correlate */ @@ -56,7 +53,7 @@ void silk_warped_autocorrelation_FIX( opus_int64 corr_QC[ MAX_SHAPE_LPC_ORDER + 1 ] = { 0 }; /* Order must be even */ - silk_assert( ( order & 1 ) == 0 ); + celt_assert( ( order & 1 ) == 0 ); silk_assert( 2 * QS - QC >= 0 ); /* Loop over samples */ @@ -92,4 +89,4 @@ void silk_warped_autocorrelation_FIX( } silk_assert( corr_QC[ 0 ] >= 0 ); /* If breaking, decrease QC*/ } -#endif /* OVERRIDE_silk_warped_autocorrelation_FIX */ +#endif /* OVERRIDE_silk_warped_autocorrelation_FIX_c */ diff --git a/thirdparty/opus/silk/fixed/x86/burg_modified_FIX_sse.c b/thirdparty/opus/silk/fixed/x86/burg_modified_FIX_sse4_1.c index 3c3583c5fc..bbb1ce0fcc 100644 --- a/thirdparty/opus/silk/fixed/x86/burg_modified_FIX_sse.c +++ b/thirdparty/opus/silk/fixed/x86/burg_modified_FIX_sse4_1.c @@ -72,7 +72,7 @@ void silk_burg_modified_sse4_1( __m128i FIRST_3210, LAST_3210, ATMP_3210, TMP1_3210, TMP2_3210, T1_3210, T2_3210, PTR_3210, SUBFR_3210, X1_3210, X2_3210; __m128i CONST1 = _mm_set1_epi32(1); - silk_assert( subfr_length * nb_subfr <= MAX_FRAME_SIZE ); + celt_assert( subfr_length * nb_subfr <= MAX_FRAME_SIZE ); /* Compute autocorrelations, added over subframes */ silk_sum_sqr_shift( &C0, &rshifts, x, nb_subfr * subfr_length ); diff --git a/thirdparty/opus/silk/fixed/x86/prefilter_FIX_sse.c b/thirdparty/opus/silk/fixed/x86/prefilter_FIX_sse.c deleted file mode 100644 index 488a603f5d..0000000000 --- a/thirdparty/opus/silk/fixed/x86/prefilter_FIX_sse.c +++ /dev/null @@ -1,160 +0,0 @@ -/* Copyright (c) 2014, Cisco Systems, INC - Written by XiangMingZhu WeiZhou MinPeng YanWang - - Redistribution and use in source and binary forms, with or without - modification, are permitted provided that the following conditions - are met: - - - Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. - - - Redistributions in binary form must reproduce the above copyright - notice, this list of conditions and the following disclaimer in the - documentation and/or other materials provided with the distribution. - - THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER - OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, - EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, - PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR - PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF - LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING - NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS - SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -*/ - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#include <xmmintrin.h> -#include <emmintrin.h> -#include <smmintrin.h> -#include "main.h" -#include "celt/x86/x86cpu.h" - -void silk_warped_LPC_analysis_filter_FIX_sse4_1( - opus_int32 state[], /* I/O State [order + 1] */ - opus_int32 res_Q2[], /* O Residual signal [length] */ - const opus_int16 coef_Q13[], /* I Coefficients [order] */ - const opus_int16 input[], /* I Input signal [length] */ - const opus_int16 lambda_Q16, /* I Warping factor */ - const opus_int length, /* I Length of input signal */ - const opus_int order /* I Filter order (even) */ -) -{ - opus_int n, i; - opus_int32 acc_Q11, tmp1, tmp2; - - /* Order must be even */ - silk_assert( ( order & 1 ) == 0 ); - - if (order == 10) - { - if (0 == lambda_Q16) - { - __m128i coef_Q13_3210, coef_Q13_7654; - __m128i coef_Q13_0123, coef_Q13_4567; - __m128i state_0123, state_4567; - __m128i xmm_product1, xmm_product2; - __m128i xmm_tempa, xmm_tempb; - - register opus_int32 sum; - register opus_int32 state_8, state_9, state_a; - register opus_int64 coef_Q13_8, coef_Q13_9; - - silk_assert( length > 0 ); - - coef_Q13_3210 = OP_CVTEPI16_EPI32_M64( &coef_Q13[ 0 ] ); - coef_Q13_7654 = OP_CVTEPI16_EPI32_M64( &coef_Q13[ 4 ] ); - - coef_Q13_0123 = _mm_shuffle_epi32( coef_Q13_3210, _MM_SHUFFLE( 0, 1, 2, 3 ) ); - coef_Q13_4567 = _mm_shuffle_epi32( coef_Q13_7654, _MM_SHUFFLE( 0, 1, 2, 3 ) ); - - coef_Q13_8 = (opus_int64) coef_Q13[ 8 ]; - coef_Q13_9 = (opus_int64) coef_Q13[ 9 ]; - - state_0123 = _mm_loadu_si128( (__m128i *)(&state[ 0 ] ) ); - state_4567 = _mm_loadu_si128( (__m128i *)(&state[ 4 ] ) ); - - state_0123 = _mm_shuffle_epi32( state_0123, _MM_SHUFFLE( 0, 1, 2, 3 ) ); - state_4567 = _mm_shuffle_epi32( state_4567, _MM_SHUFFLE( 0, 1, 2, 3 ) ); - - state_8 = state[ 8 ]; - state_9 = state[ 9 ]; - state_a = 0; - - for( n = 0; n < length; n++ ) - { - xmm_product1 = _mm_mul_epi32( coef_Q13_0123, state_0123 ); /* 64-bit multiply, only 2 pairs */ - xmm_product2 = _mm_mul_epi32( coef_Q13_4567, state_4567 ); - - xmm_tempa = _mm_shuffle_epi32( state_0123, _MM_SHUFFLE( 0, 1, 2, 3 ) ); - xmm_tempb = _mm_shuffle_epi32( state_4567, _MM_SHUFFLE( 0, 1, 2, 3 ) ); - - xmm_product1 = _mm_srli_epi64( xmm_product1, 16 ); /* >> 16, zero extending works */ - xmm_product2 = _mm_srli_epi64( xmm_product2, 16 ); - - xmm_tempa = _mm_mul_epi32( coef_Q13_3210, xmm_tempa ); - xmm_tempb = _mm_mul_epi32( coef_Q13_7654, xmm_tempb ); - - xmm_tempa = _mm_srli_epi64( xmm_tempa, 16 ); - xmm_tempb = _mm_srli_epi64( xmm_tempb, 16 ); - - xmm_tempa = _mm_add_epi32( xmm_tempa, xmm_product1 ); - xmm_tempb = _mm_add_epi32( xmm_tempb, xmm_product2 ); - xmm_tempa = _mm_add_epi32( xmm_tempa, xmm_tempb ); - - sum = (coef_Q13_8 * state_8) >> 16; - sum += (coef_Q13_9 * state_9) >> 16; - - xmm_tempa = _mm_add_epi32( xmm_tempa, _mm_shuffle_epi32( xmm_tempa, _MM_SHUFFLE( 0, 0, 0, 2 ) ) ); - sum += _mm_cvtsi128_si32( xmm_tempa); - res_Q2[ n ] = silk_LSHIFT( (opus_int32)input[ n ], 2 ) - silk_RSHIFT_ROUND( ( 5 + sum ), 9); - - /* move right */ - state_a = state_9; - state_9 = state_8; - state_8 = _mm_cvtsi128_si32( state_4567 ); - state_4567 = _mm_alignr_epi8( state_0123, state_4567, 4 ); - - state_0123 = _mm_alignr_epi8( _mm_cvtsi32_si128( silk_LSHIFT( input[ n ], 14 ) ), state_0123, 4 ); - } - - _mm_storeu_si128( (__m128i *)( &state[ 0 ] ), _mm_shuffle_epi32( state_0123, _MM_SHUFFLE( 0, 1, 2, 3 ) ) ); - _mm_storeu_si128( (__m128i *)( &state[ 4 ] ), _mm_shuffle_epi32( state_4567, _MM_SHUFFLE( 0, 1, 2, 3 ) ) ); - state[ 8 ] = state_8; - state[ 9 ] = state_9; - state[ 10 ] = state_a; - - return; - } - } - - for( n = 0; n < length; n++ ) { - /* Output of lowpass section */ - tmp2 = silk_SMLAWB( state[ 0 ], state[ 1 ], lambda_Q16 ); - state[ 0 ] = silk_LSHIFT( input[ n ], 14 ); - /* Output of allpass section */ - tmp1 = silk_SMLAWB( state[ 1 ], state[ 2 ] - tmp2, lambda_Q16 ); - state[ 1 ] = tmp2; - acc_Q11 = silk_RSHIFT( order, 1 ); - acc_Q11 = silk_SMLAWB( acc_Q11, tmp2, coef_Q13[ 0 ] ); - /* Loop over allpass sections */ - for( i = 2; i < order; i += 2 ) { - /* Output of allpass section */ - tmp2 = silk_SMLAWB( state[ i ], state[ i + 1 ] - tmp1, lambda_Q16 ); - state[ i ] = tmp1; - acc_Q11 = silk_SMLAWB( acc_Q11, tmp1, coef_Q13[ i - 1 ] ); - /* Output of allpass section */ - tmp1 = silk_SMLAWB( state[ i + 1 ], state[ i + 2 ] - tmp2, lambda_Q16 ); - state[ i + 1 ] = tmp2; - acc_Q11 = silk_SMLAWB( acc_Q11, tmp2, coef_Q13[ i ] ); - } - state[ order ] = tmp1; - acc_Q11 = silk_SMLAWB( acc_Q11, tmp1, coef_Q13[ order - 1 ] ); - res_Q2[ n ] = silk_LSHIFT( (opus_int32)input[ n ], 2 ) - silk_RSHIFT_ROUND( acc_Q11, 9 ); - } -} diff --git a/thirdparty/opus/silk/fixed/x86/vector_ops_FIX_sse.c b/thirdparty/opus/silk/fixed/x86/vector_ops_FIX_sse4_1.c index c1e90564d0..c1e90564d0 100644 --- a/thirdparty/opus/silk/fixed/x86/vector_ops_FIX_sse.c +++ b/thirdparty/opus/silk/fixed/x86/vector_ops_FIX_sse4_1.c diff --git a/thirdparty/opus/silk/float/LPC_analysis_filter_FLP.c b/thirdparty/opus/silk/float/LPC_analysis_filter_FLP.c index cae89a0a18..0e1a1fed0f 100644 --- a/thirdparty/opus/silk/float/LPC_analysis_filter_FLP.c +++ b/thirdparty/opus/silk/float/LPC_analysis_filter_FLP.c @@ -215,7 +215,7 @@ void silk_LPC_analysis_filter_FLP( const opus_int Order /* I LPC order */ ) { - silk_assert( Order <= length ); + celt_assert( Order <= length ); switch( Order ) { case 6: @@ -239,7 +239,7 @@ void silk_LPC_analysis_filter_FLP( break; default: - silk_assert( 0 ); + celt_assert( 0 ); break; } diff --git a/thirdparty/opus/silk/float/LPC_inv_pred_gain_FLP.c b/thirdparty/opus/silk/float/LPC_inv_pred_gain_FLP.c index 25178bacdd..2be2122d61 100644 --- a/thirdparty/opus/silk/float/LPC_inv_pred_gain_FLP.c +++ b/thirdparty/opus/silk/float/LPC_inv_pred_gain_FLP.c @@ -31,8 +31,7 @@ POSSIBILITY OF SUCH DAMAGE. #include "SigProc_FIX.h" #include "SigProc_FLP.h" - -#define RC_THRESHOLD 0.9999f +#include "define.h" /* compute inverse of LPC prediction gain, and */ /* test if LPC coefficients are stable (all poles within unit circle) */ @@ -43,34 +42,32 @@ silk_float silk_LPC_inverse_pred_gain_FLP( /* O return inverse prediction ga ) { opus_int k, n; - double invGain, rc, rc_mult1, rc_mult2; - silk_float Atmp[ 2 ][ SILK_MAX_ORDER_LPC ]; - silk_float *Aold, *Anew; + double invGain, rc, rc_mult1, rc_mult2, tmp1, tmp2; + silk_float Atmp[ SILK_MAX_ORDER_LPC ]; - Anew = Atmp[ order & 1 ]; - silk_memcpy( Anew, A, order * sizeof(silk_float) ); + silk_memcpy( Atmp, A, order * sizeof(silk_float) ); invGain = 1.0; for( k = order - 1; k > 0; k-- ) { - rc = -Anew[ k ]; - if( rc > RC_THRESHOLD || rc < -RC_THRESHOLD ) { + rc = -Atmp[ k ]; + rc_mult1 = 1.0f - rc * rc; + invGain *= rc_mult1; + if( invGain * MAX_PREDICTION_POWER_GAIN < 1.0f ) { return 0.0f; } - rc_mult1 = 1.0f - rc * rc; rc_mult2 = 1.0f / rc_mult1; - invGain *= rc_mult1; - /* swap pointers */ - Aold = Anew; - Anew = Atmp[ k & 1 ]; - for( n = 0; n < k; n++ ) { - Anew[ n ] = (silk_float)( ( Aold[ n ] - Aold[ k - n - 1 ] * rc ) * rc_mult2 ); + for( n = 0; n < (k + 1) >> 1; n++ ) { + tmp1 = Atmp[ n ]; + tmp2 = Atmp[ k - n - 1 ]; + Atmp[ n ] = (silk_float)( ( tmp1 - tmp2 * rc ) * rc_mult2 ); + Atmp[ k - n - 1 ] = (silk_float)( ( tmp2 - tmp1 * rc ) * rc_mult2 ); } } - rc = -Anew[ 0 ]; - if( rc > RC_THRESHOLD || rc < -RC_THRESHOLD ) { - return 0.0f; - } + rc = -Atmp[ 0 ]; rc_mult1 = 1.0f - rc * rc; invGain *= rc_mult1; + if( invGain * MAX_PREDICTION_POWER_GAIN < 1.0f ) { + return 0.0f; + } return (silk_float)invGain; } diff --git a/thirdparty/opus/silk/float/SigProc_FLP.h b/thirdparty/opus/silk/float/SigProc_FLP.h index f0cb3733be..953de8b09e 100644 --- a/thirdparty/opus/silk/float/SigProc_FLP.h +++ b/thirdparty/opus/silk/float/SigProc_FLP.h @@ -68,13 +68,6 @@ void silk_k2a_FLP( opus_int32 order /* I prediction order */ ); -/* Solve the normal equations using the Levinson-Durbin recursion */ -silk_float silk_levinsondurbin_FLP( /* O prediction error energy */ - silk_float A[], /* O prediction coefficients [order] */ - const silk_float corr[], /* I input auto-correlations [order + 1] */ - const opus_int order /* I prediction order */ -); - /* compute autocorrelation */ void silk_autocorrelation_FLP( silk_float *results, /* O result (length correlationCount) */ diff --git a/thirdparty/opus/silk/float/apply_sine_window_FLP.c b/thirdparty/opus/silk/float/apply_sine_window_FLP.c index 6aae57c0ab..e49e717991 100644 --- a/thirdparty/opus/silk/float/apply_sine_window_FLP.c +++ b/thirdparty/opus/silk/float/apply_sine_window_FLP.c @@ -45,10 +45,10 @@ void silk_apply_sine_window_FLP( opus_int k; silk_float freq, c, S0, S1; - silk_assert( win_type == 1 || win_type == 2 ); + celt_assert( win_type == 1 || win_type == 2 ); /* Length must be multiple of 4 */ - silk_assert( ( length & 3 ) == 0 ); + celt_assert( ( length & 3 ) == 0 ); freq = PI / ( length + 1 ); diff --git a/thirdparty/opus/silk/float/burg_modified_FLP.c b/thirdparty/opus/silk/float/burg_modified_FLP.c index ea5dc25a93..756b76a35b 100644 --- a/thirdparty/opus/silk/float/burg_modified_FLP.c +++ b/thirdparty/opus/silk/float/burg_modified_FLP.c @@ -52,7 +52,7 @@ silk_float silk_burg_modified_FLP( /* O returns residual energy double CAf[ SILK_MAX_ORDER_LPC + 1 ], CAb[ SILK_MAX_ORDER_LPC + 1 ]; double Af[ SILK_MAX_ORDER_LPC ]; - silk_assert( subfr_length * nb_subfr <= MAX_FRAME_SIZE ); + celt_assert( subfr_length * nb_subfr <= MAX_FRAME_SIZE ); /* Compute autocorrelations, added over subframes */ C0 = silk_energy_FLP( x, nb_subfr * subfr_length ); diff --git a/thirdparty/opus/silk/float/encode_frame_FLP.c b/thirdparty/opus/silk/float/encode_frame_FLP.c index 2092a4d9e2..b029c3f5ca 100644 --- a/thirdparty/opus/silk/float/encode_frame_FLP.c +++ b/thirdparty/opus/silk/float/encode_frame_FLP.c @@ -29,6 +29,7 @@ POSSIBILITY OF SUCH DAMAGE. #include "config.h" #endif +#include <stdlib.h> #include "main_FLP.h" #include "tuning_parameters.h" @@ -41,21 +42,28 @@ static OPUS_INLINE void silk_LBRR_encode_FLP( ); void silk_encode_do_VAD_FLP( - silk_encoder_state_FLP *psEnc /* I/O Encoder state FLP */ + silk_encoder_state_FLP *psEnc, /* I/O Encoder state FLP */ + opus_int activity /* I Decision of Opus voice activity detector */ ) { + const opus_int activity_threshold = SILK_FIX_CONST( SPEECH_ACTIVITY_DTX_THRES, 8 ); + /****************************/ /* Voice Activity Detection */ /****************************/ silk_VAD_GetSA_Q8( &psEnc->sCmn, psEnc->sCmn.inputBuf + 1, psEnc->sCmn.arch ); + /* If Opus VAD is inactive and Silk VAD is active: lower Silk VAD to just under the threshold */ + if( activity == VAD_NO_ACTIVITY && psEnc->sCmn.speech_activity_Q8 >= activity_threshold ) { + psEnc->sCmn.speech_activity_Q8 = activity_threshold - 1; + } /**************************************************/ /* Convert speech activity into VAD and DTX flags */ /**************************************************/ - if( psEnc->sCmn.speech_activity_Q8 < SILK_FIX_CONST( SPEECH_ACTIVITY_DTX_THRES, 8 ) ) { + if( psEnc->sCmn.speech_activity_Q8 < activity_threshold ) { psEnc->sCmn.indices.signalType = TYPE_NO_VOICE_ACTIVITY; psEnc->sCmn.noSpeechCounter++; - if( psEnc->sCmn.noSpeechCounter < NB_SPEECH_FRAMES_BEFORE_DTX ) { + if( psEnc->sCmn.noSpeechCounter <= NB_SPEECH_FRAMES_BEFORE_DTX ) { psEnc->sCmn.inDTX = 0; } else if( psEnc->sCmn.noSpeechCounter > MAX_CONSECUTIVE_DTX + NB_SPEECH_FRAMES_BEFORE_DTX ) { psEnc->sCmn.noSpeechCounter = NB_SPEECH_FRAMES_BEFORE_DTX; @@ -85,7 +93,6 @@ opus_int silk_encode_frame_FLP( silk_encoder_control_FLP sEncCtrl; opus_int i, iter, maxIter, found_upper, found_lower, ret = 0; silk_float *x_frame, *res_pitch_frame; - silk_float xfw[ MAX_FRAME_LENGTH ]; silk_float res_pitch[ 2 * MAX_FRAME_LENGTH + LA_PITCH_MAX ]; ec_enc sRangeEnc_copy, sRangeEnc_copy2; silk_nsq_state sNSQ_copy, sNSQ_copy2; @@ -97,6 +104,9 @@ opus_int silk_encode_frame_FLP( opus_int8 LastGainIndex_copy2; opus_int32 pGains_Q16[ MAX_NB_SUBFR ]; opus_uint8 ec_buf_copy[ 1275 ]; + opus_int gain_lock[ MAX_NB_SUBFR ] = {0}; + opus_int16 best_gain_mult[ MAX_NB_SUBFR ]; + opus_int best_sum[ MAX_NB_SUBFR ]; /* This is totally unnecessary but many compilers (including gcc) are too dumb to realise it */ LastGainIndex_copy2 = nBits_lower = nBits_upper = gainMult_lower = gainMult_upper = 0; @@ -139,22 +149,17 @@ opus_int silk_encode_frame_FLP( /***************************************************/ /* Find linear prediction coefficients (LPC + LTP) */ /***************************************************/ - silk_find_pred_coefs_FLP( psEnc, &sEncCtrl, res_pitch, x_frame, condCoding ); + silk_find_pred_coefs_FLP( psEnc, &sEncCtrl, res_pitch_frame, x_frame, condCoding ); /****************************************/ /* Process gains */ /****************************************/ silk_process_gains_FLP( psEnc, &sEncCtrl, condCoding ); - /*****************************************/ - /* Prefiltering for noise shaper */ - /*****************************************/ - silk_prefilter_FLP( psEnc, &sEncCtrl, xfw, x_frame ); - /****************************************/ /* Low Bitrate Redundant Encoding */ /****************************************/ - silk_LBRR_encode_FLP( psEnc, &sEncCtrl, xfw, condCoding ); + silk_LBRR_encode_FLP( psEnc, &sEncCtrl, x_frame, condCoding ); /* Loop over quantizer and entroy coding to control bitrate */ maxIter = 6; @@ -188,7 +193,11 @@ opus_int silk_encode_frame_FLP( /*****************************************/ /* Noise shaping quantization */ /*****************************************/ - silk_NSQ_wrapper_FLP( psEnc, &sEncCtrl, &psEnc->sCmn.indices, &psEnc->sCmn.sNSQ, psEnc->sCmn.pulses, xfw ); + silk_NSQ_wrapper_FLP( psEnc, &sEncCtrl, &psEnc->sCmn.indices, &psEnc->sCmn.sNSQ, psEnc->sCmn.pulses, x_frame ); + + if ( iter == maxIter && !found_lower ) { + silk_memcpy( &sRangeEnc_copy2, psRangeEnc, sizeof( ec_enc ) ); + } /****************************************/ /* Encode Parameters */ @@ -203,6 +212,33 @@ opus_int silk_encode_frame_FLP( nBits = ec_tell( psRangeEnc ); + /* If we still bust after the last iteration, do some damage control. */ + if ( iter == maxIter && !found_lower && nBits > maxBits ) { + silk_memcpy( psRangeEnc, &sRangeEnc_copy2, sizeof( ec_enc ) ); + + /* Keep gains the same as the last frame. */ + psEnc->sShape.LastGainIndex = sEncCtrl.lastGainIndexPrev; + for ( i = 0; i < psEnc->sCmn.nb_subfr; i++ ) { + psEnc->sCmn.indices.GainsIndices[ i ] = 4; + } + if (condCoding != CODE_CONDITIONALLY) { + psEnc->sCmn.indices.GainsIndices[ 0 ] = sEncCtrl.lastGainIndexPrev; + } + psEnc->sCmn.ec_prevLagIndex = ec_prevLagIndex_copy; + psEnc->sCmn.ec_prevSignalType = ec_prevSignalType_copy; + /* Clear all pulses. */ + for ( i = 0; i < psEnc->sCmn.frame_length; i++ ) { + psEnc->sCmn.pulses[ i ] = 0; + } + + silk_encode_indices( &psEnc->sCmn, psRangeEnc, psEnc->sCmn.nFramesEncoded, 0, condCoding ); + + silk_encode_pulses( psRangeEnc, psEnc->sCmn.indices.signalType, psEnc->sCmn.indices.quantOffsetType, + psEnc->sCmn.pulses, psEnc->sCmn.frame_length ); + + nBits = ec_tell( psRangeEnc ); + } + if( useCBR == 0 && iter == 0 && nBits <= maxBits ) { break; } @@ -212,7 +248,7 @@ opus_int silk_encode_frame_FLP( if( found_lower && ( gainsID == gainsID_lower || nBits > maxBits ) ) { /* Restore output state from earlier iteration that did meet the bitrate budget */ silk_memcpy( psRangeEnc, &sRangeEnc_copy2, sizeof( ec_enc ) ); - silk_assert( sRangeEnc_copy2.offs <= 1275 ); + celt_assert( sRangeEnc_copy2.offs <= 1275 ); silk_memcpy( psRangeEnc->buf, ec_buf_copy, sRangeEnc_copy2.offs ); silk_memcpy( &psEnc->sCmn.sNSQ, &sNSQ_copy2, sizeof( silk_nsq_state ) ); psEnc->sShape.LastGainIndex = LastGainIndex_copy2; @@ -223,7 +259,9 @@ opus_int silk_encode_frame_FLP( if( nBits > maxBits ) { if( found_lower == 0 && iter >= 2 ) { /* Adjust the quantizer's rate/distortion tradeoff and discard previous "upper" results */ - sEncCtrl.Lambda *= 1.5f; + sEncCtrl.Lambda = silk_max_float(sEncCtrl.Lambda*1.5f, 1.5f); + /* Reducing dithering can help us hit the target. */ + psEnc->sCmn.indices.quantOffsetType = 0; found_upper = 0; gainsID_upper = -1; } else { @@ -240,7 +278,7 @@ opus_int silk_encode_frame_FLP( gainsID_lower = gainsID; /* Copy part of the output state */ silk_memcpy( &sRangeEnc_copy2, psRangeEnc, sizeof( ec_enc ) ); - silk_assert( psRangeEnc->offs <= 1275 ); + celt_assert( psRangeEnc->offs <= 1275 ); silk_memcpy( ec_buf_copy, psRangeEnc->buf, psRangeEnc->offs ); silk_memcpy( &sNSQ_copy2, &psEnc->sCmn.sNSQ, sizeof( silk_nsq_state ) ); LastGainIndex_copy2 = psEnc->sShape.LastGainIndex; @@ -250,15 +288,34 @@ opus_int silk_encode_frame_FLP( break; } + if ( !found_lower && nBits > maxBits ) { + int j; + for ( i = 0; i < psEnc->sCmn.nb_subfr; i++ ) { + int sum=0; + for ( j = i*psEnc->sCmn.subfr_length; j < (i+1)*psEnc->sCmn.subfr_length; j++ ) { + sum += abs( psEnc->sCmn.pulses[j] ); + } + if ( iter == 0 || (sum < best_sum[i] && !gain_lock[i]) ) { + best_sum[i] = sum; + best_gain_mult[i] = gainMult_Q8; + } else { + gain_lock[i] = 1; + } + } + } if( ( found_lower & found_upper ) == 0 ) { /* Adjust gain according to high-rate rate/distortion curve */ - opus_int32 gain_factor_Q16; - gain_factor_Q16 = silk_log2lin( silk_LSHIFT( nBits - maxBits, 7 ) / psEnc->sCmn.frame_length + SILK_FIX_CONST( 16, 7 ) ); - gain_factor_Q16 = silk_min_32( gain_factor_Q16, SILK_FIX_CONST( 2, 16 ) ); if( nBits > maxBits ) { - gain_factor_Q16 = silk_max_32( gain_factor_Q16, SILK_FIX_CONST( 1.3, 16 ) ); + if (gainMult_Q8 < 16384) { + gainMult_Q8 *= 2; + } else { + gainMult_Q8 = 32767; + } + } else { + opus_int32 gain_factor_Q16; + gain_factor_Q16 = silk_log2lin( silk_LSHIFT( nBits - maxBits, 7 ) / psEnc->sCmn.frame_length + SILK_FIX_CONST( 16, 7 ) ); + gainMult_Q8 = silk_SMULWB( gain_factor_Q16, gainMult_Q8 ); } - gainMult_Q8 = silk_SMULWB( gain_factor_Q16, gainMult_Q8 ); } else { /* Adjust gain by interpolating */ gainMult_Q8 = gainMult_lower + ( ( gainMult_upper - gainMult_lower ) * ( maxBits - nBits_lower ) ) / ( nBits_upper - nBits_lower ); @@ -272,7 +329,13 @@ opus_int silk_encode_frame_FLP( } for( i = 0; i < psEnc->sCmn.nb_subfr; i++ ) { - pGains_Q16[ i ] = silk_LSHIFT_SAT32( silk_SMULWB( sEncCtrl.GainsUnq_Q16[ i ], gainMult_Q8 ), 8 ); + opus_int16 tmp; + if ( gain_lock[i] ) { + tmp = best_gain_mult[i]; + } else { + tmp = gainMult_Q8; + } + pGains_Q16[ i ] = silk_LSHIFT_SAT32( silk_SMULWB( sEncCtrl.GainsUnq_Q16[ i ], tmp ), 8 ); } /* Quantize gains */ diff --git a/thirdparty/opus/silk/float/energy_FLP.c b/thirdparty/opus/silk/float/energy_FLP.c index 24b8179f9e..7bc7173c9c 100644 --- a/thirdparty/opus/silk/float/energy_FLP.c +++ b/thirdparty/opus/silk/float/energy_FLP.c @@ -37,13 +37,12 @@ double silk_energy_FLP( opus_int dataSize ) { - opus_int i, dataSize4; + opus_int i; double result; /* 4x unrolled loop */ result = 0.0; - dataSize4 = dataSize & 0xFFFC; - for( i = 0; i < dataSize4; i += 4 ) { + for( i = 0; i < dataSize - 3; i += 4 ) { result += data[ i + 0 ] * (double)data[ i + 0 ] + data[ i + 1 ] * (double)data[ i + 1 ] + data[ i + 2 ] * (double)data[ i + 2 ] + diff --git a/thirdparty/opus/silk/float/find_LPC_FLP.c b/thirdparty/opus/silk/float/find_LPC_FLP.c index fcfe1c3681..fa3ffe7f8b 100644 --- a/thirdparty/opus/silk/float/find_LPC_FLP.c +++ b/thirdparty/opus/silk/float/find_LPC_FLP.c @@ -73,7 +73,7 @@ void silk_find_LPC_FLP( silk_interpolate( NLSF0_Q15, psEncC->prev_NLSFq_Q15, NLSF_Q15, k, psEncC->predictLPCOrder ); /* Convert to LPC for residual energy evaluation */ - silk_NLSF2A_FLP( a_tmp, NLSF0_Q15, psEncC->predictLPCOrder ); + silk_NLSF2A_FLP( a_tmp, NLSF0_Q15, psEncC->predictLPCOrder, psEncC->arch ); /* Calculate residual energy with LSF interpolation */ silk_LPC_analysis_filter_FLP( LPC_res, a_tmp, x, 2 * subfr_length, psEncC->predictLPCOrder ); @@ -99,6 +99,6 @@ void silk_find_LPC_FLP( silk_A2NLSF_FLP( NLSF_Q15, a, psEncC->predictLPCOrder ); } - silk_assert( psEncC->indices.NLSFInterpCoef_Q2 == 4 || + celt_assert( psEncC->indices.NLSFInterpCoef_Q2 == 4 || ( psEncC->useInterpolatedNLSFs && !psEncC->first_frame_after_reset && psEncC->nb_subfr == MAX_NB_SUBFR ) ); } diff --git a/thirdparty/opus/silk/float/find_LTP_FLP.c b/thirdparty/opus/silk/float/find_LTP_FLP.c index 7229996014..f97064930e 100644 --- a/thirdparty/opus/silk/float/find_LTP_FLP.c +++ b/thirdparty/opus/silk/float/find_LTP_FLP.c @@ -33,100 +33,32 @@ POSSIBILITY OF SUCH DAMAGE. #include "tuning_parameters.h" void silk_find_LTP_FLP( - silk_float b[ MAX_NB_SUBFR * LTP_ORDER ], /* O LTP coefs */ - silk_float WLTP[ MAX_NB_SUBFR * LTP_ORDER * LTP_ORDER ], /* O Weight for LTP quantization */ - silk_float *LTPredCodGain, /* O LTP coding gain */ - const silk_float r_lpc[], /* I LPC residual */ - const opus_int lag[ MAX_NB_SUBFR ], /* I LTP lags */ - const silk_float Wght[ MAX_NB_SUBFR ], /* I Weights */ + silk_float XX[ MAX_NB_SUBFR * LTP_ORDER * LTP_ORDER ], /* O Weight for LTP quantization */ + silk_float xX[ MAX_NB_SUBFR * LTP_ORDER ], /* O Weight for LTP quantization */ + const silk_float r_ptr[], /* I LPC residual */ + const opus_int lag[ MAX_NB_SUBFR ], /* I LTP lags */ const opus_int subfr_length, /* I Subframe length */ - const opus_int nb_subfr, /* I number of subframes */ - const opus_int mem_offset /* I Number of samples in LTP memory */ + const opus_int nb_subfr /* I number of subframes */ ) { - opus_int i, k; - silk_float *b_ptr, temp, *WLTP_ptr; - silk_float LPC_res_nrg, LPC_LTP_res_nrg; - silk_float d[ MAX_NB_SUBFR ], m, g, delta_b[ LTP_ORDER ]; - silk_float w[ MAX_NB_SUBFR ], nrg[ MAX_NB_SUBFR ], regu; - silk_float Rr[ LTP_ORDER ], rr[ MAX_NB_SUBFR ]; - const silk_float *r_ptr, *lag_ptr; + opus_int k; + silk_float *xX_ptr, *XX_ptr; + const silk_float *lag_ptr; + silk_float xx, temp; - b_ptr = b; - WLTP_ptr = WLTP; - r_ptr = &r_lpc[ mem_offset ]; + xX_ptr = xX; + XX_ptr = XX; for( k = 0; k < nb_subfr; k++ ) { lag_ptr = r_ptr - ( lag[ k ] + LTP_ORDER / 2 ); + silk_corrMatrix_FLP( lag_ptr, subfr_length, LTP_ORDER, XX_ptr ); + silk_corrVector_FLP( lag_ptr, r_ptr, subfr_length, LTP_ORDER, xX_ptr ); + xx = ( silk_float )silk_energy_FLP( r_ptr, subfr_length + LTP_ORDER ); + temp = 1.0f / silk_max( xx, LTP_CORR_INV_MAX * 0.5f * ( XX_ptr[ 0 ] + XX_ptr[ 24 ] ) + 1.0f ); + silk_scale_vector_FLP( XX_ptr, temp, LTP_ORDER * LTP_ORDER ); + silk_scale_vector_FLP( xX_ptr, temp, LTP_ORDER ); - silk_corrMatrix_FLP( lag_ptr, subfr_length, LTP_ORDER, WLTP_ptr ); - silk_corrVector_FLP( lag_ptr, r_ptr, subfr_length, LTP_ORDER, Rr ); - - rr[ k ] = ( silk_float )silk_energy_FLP( r_ptr, subfr_length ); - regu = 1.0f + rr[ k ] + - matrix_ptr( WLTP_ptr, 0, 0, LTP_ORDER ) + - matrix_ptr( WLTP_ptr, LTP_ORDER-1, LTP_ORDER-1, LTP_ORDER ); - regu *= LTP_DAMPING / 3; - silk_regularize_correlations_FLP( WLTP_ptr, &rr[ k ], regu, LTP_ORDER ); - silk_solve_LDL_FLP( WLTP_ptr, LTP_ORDER, Rr, b_ptr ); - - /* Calculate residual energy */ - nrg[ k ] = silk_residual_energy_covar_FLP( b_ptr, WLTP_ptr, Rr, rr[ k ], LTP_ORDER ); - - temp = Wght[ k ] / ( nrg[ k ] * Wght[ k ] + 0.01f * subfr_length ); - silk_scale_vector_FLP( WLTP_ptr, temp, LTP_ORDER * LTP_ORDER ); - w[ k ] = matrix_ptr( WLTP_ptr, LTP_ORDER / 2, LTP_ORDER / 2, LTP_ORDER ); - - r_ptr += subfr_length; - b_ptr += LTP_ORDER; - WLTP_ptr += LTP_ORDER * LTP_ORDER; - } - - /* Compute LTP coding gain */ - if( LTPredCodGain != NULL ) { - LPC_LTP_res_nrg = 1e-6f; - LPC_res_nrg = 0.0f; - for( k = 0; k < nb_subfr; k++ ) { - LPC_res_nrg += rr[ k ] * Wght[ k ]; - LPC_LTP_res_nrg += nrg[ k ] * Wght[ k ]; - } - - silk_assert( LPC_LTP_res_nrg > 0 ); - *LTPredCodGain = 3.0f * silk_log2( LPC_res_nrg / LPC_LTP_res_nrg ); - } - - /* Smoothing */ - /* d = sum( B, 1 ); */ - b_ptr = b; - for( k = 0; k < nb_subfr; k++ ) { - d[ k ] = 0; - for( i = 0; i < LTP_ORDER; i++ ) { - d[ k ] += b_ptr[ i ]; - } - b_ptr += LTP_ORDER; - } - /* m = ( w * d' ) / ( sum( w ) + 1e-3 ); */ - temp = 1e-3f; - for( k = 0; k < nb_subfr; k++ ) { - temp += w[ k ]; - } - m = 0; - for( k = 0; k < nb_subfr; k++ ) { - m += d[ k ] * w[ k ]; - } - m = m / temp; - - b_ptr = b; - for( k = 0; k < nb_subfr; k++ ) { - g = LTP_SMOOTHING / ( LTP_SMOOTHING + w[ k ] ) * ( m - d[ k ] ); - temp = 0; - for( i = 0; i < LTP_ORDER; i++ ) { - delta_b[ i ] = silk_max_float( b_ptr[ i ], 0.1f ); - temp += delta_b[ i ]; - } - temp = g / temp; - for( i = 0; i < LTP_ORDER; i++ ) { - b_ptr[ i ] = b_ptr[ i ] + delta_b[ i ] * temp; - } - b_ptr += LTP_ORDER; + r_ptr += subfr_length; + XX_ptr += LTP_ORDER * LTP_ORDER; + xX_ptr += LTP_ORDER; } } diff --git a/thirdparty/opus/silk/float/find_pitch_lags_FLP.c b/thirdparty/opus/silk/float/find_pitch_lags_FLP.c index f3b22d25ce..dedbcd2836 100644 --- a/thirdparty/opus/silk/float/find_pitch_lags_FLP.c +++ b/thirdparty/opus/silk/float/find_pitch_lags_FLP.c @@ -56,7 +56,7 @@ void silk_find_pitch_lags_FLP( buf_len = psEnc->sCmn.la_pitch + psEnc->sCmn.frame_length + psEnc->sCmn.ltp_mem_length; /* Safety check */ - silk_assert( buf_len >= psEnc->sCmn.pitch_LPC_win_length ); + celt_assert( buf_len >= psEnc->sCmn.pitch_LPC_win_length ); x_buf = x - psEnc->sCmn.ltp_mem_length; diff --git a/thirdparty/opus/silk/float/find_pred_coefs_FLP.c b/thirdparty/opus/silk/float/find_pred_coefs_FLP.c index 1af4fe5f1b..dcf7c5202d 100644 --- a/thirdparty/opus/silk/float/find_pred_coefs_FLP.c +++ b/thirdparty/opus/silk/float/find_pred_coefs_FLP.c @@ -41,8 +41,9 @@ void silk_find_pred_coefs_FLP( ) { opus_int i; - silk_float WLTP[ MAX_NB_SUBFR * LTP_ORDER * LTP_ORDER ]; - silk_float invGains[ MAX_NB_SUBFR ], Wght[ MAX_NB_SUBFR ]; + silk_float XXLTP[ MAX_NB_SUBFR * LTP_ORDER * LTP_ORDER ]; + silk_float xXLTP[ MAX_NB_SUBFR * LTP_ORDER ]; + silk_float invGains[ MAX_NB_SUBFR ]; opus_int16 NLSF_Q15[ MAX_LPC_ORDER ]; const silk_float *x_ptr; silk_float *x_pre_ptr, LPC_in_pre[ MAX_NB_SUBFR * MAX_LPC_ORDER + MAX_FRAME_LENGTH ]; @@ -52,23 +53,20 @@ void silk_find_pred_coefs_FLP( for( i = 0; i < psEnc->sCmn.nb_subfr; i++ ) { silk_assert( psEncCtrl->Gains[ i ] > 0.0f ); invGains[ i ] = 1.0f / psEncCtrl->Gains[ i ]; - Wght[ i ] = invGains[ i ] * invGains[ i ]; } if( psEnc->sCmn.indices.signalType == TYPE_VOICED ) { /**********/ /* VOICED */ /**********/ - silk_assert( psEnc->sCmn.ltp_mem_length - psEnc->sCmn.predictLPCOrder >= psEncCtrl->pitchL[ 0 ] + LTP_ORDER / 2 ); + celt_assert( psEnc->sCmn.ltp_mem_length - psEnc->sCmn.predictLPCOrder >= psEncCtrl->pitchL[ 0 ] + LTP_ORDER / 2 ); /* LTP analysis */ - silk_find_LTP_FLP( psEncCtrl->LTPCoef, WLTP, &psEncCtrl->LTPredCodGain, res_pitch, - psEncCtrl->pitchL, Wght, psEnc->sCmn.subfr_length, psEnc->sCmn.nb_subfr, psEnc->sCmn.ltp_mem_length ); + silk_find_LTP_FLP( XXLTP, xXLTP, res_pitch, psEncCtrl->pitchL, psEnc->sCmn.subfr_length, psEnc->sCmn.nb_subfr ); /* Quantize LTP gain parameters */ silk_quant_LTP_gains_FLP( psEncCtrl->LTPCoef, psEnc->sCmn.indices.LTPIndex, &psEnc->sCmn.indices.PERIndex, - &psEnc->sCmn.sum_log_gain_Q7, WLTP, psEnc->sCmn.mu_LTP_Q9, psEnc->sCmn.LTPQuantLowComplexity, psEnc->sCmn.nb_subfr, - psEnc->sCmn.arch ); + &psEnc->sCmn.sum_log_gain_Q7, &psEncCtrl->LTPredCodGain, XXLTP, xXLTP, psEnc->sCmn.subfr_length, psEnc->sCmn.nb_subfr, psEnc->sCmn.arch ); /* Control LTP scaling */ silk_LTP_scale_ctrl_FLP( psEnc, psEncCtrl, condCoding ); diff --git a/thirdparty/opus/silk/float/inner_product_FLP.c b/thirdparty/opus/silk/float/inner_product_FLP.c index 029c012911..cdd39d24ce 100644 --- a/thirdparty/opus/silk/float/inner_product_FLP.c +++ b/thirdparty/opus/silk/float/inner_product_FLP.c @@ -38,13 +38,12 @@ double silk_inner_product_FLP( opus_int dataSize ) { - opus_int i, dataSize4; + opus_int i; double result; /* 4x unrolled loop */ result = 0.0; - dataSize4 = dataSize & 0xFFFC; - for( i = 0; i < dataSize4; i += 4 ) { + for( i = 0; i < dataSize - 3; i += 4 ) { result += data1[ i + 0 ] * (double)data2[ i + 0 ] + data1[ i + 1 ] * (double)data2[ i + 1 ] + data1[ i + 2 ] * (double)data2[ i + 2 ] + diff --git a/thirdparty/opus/silk/float/k2a_FLP.c b/thirdparty/opus/silk/float/k2a_FLP.c index 12af4e7669..1448008dbb 100644 --- a/thirdparty/opus/silk/float/k2a_FLP.c +++ b/thirdparty/opus/silk/float/k2a_FLP.c @@ -39,15 +39,16 @@ void silk_k2a_FLP( ) { opus_int k, n; - silk_float Atmp[ SILK_MAX_ORDER_LPC ]; + silk_float rck, tmp1, tmp2; for( k = 0; k < order; k++ ) { - for( n = 0; n < k; n++ ) { - Atmp[ n ] = A[ n ]; + rck = rc[ k ]; + for( n = 0; n < (k + 1) >> 1; n++ ) { + tmp1 = A[ n ]; + tmp2 = A[ k - n - 1 ]; + A[ n ] = tmp1 + tmp2 * rck; + A[ k - n - 1 ] = tmp2 + tmp1 * rck; } - for( n = 0; n < k; n++ ) { - A[ n ] += Atmp[ k - n - 1 ] * rc[ k ]; - } - A[ k ] = -rc[ k ]; + A[ k ] = -rck; } } diff --git a/thirdparty/opus/silk/float/levinsondurbin_FLP.c b/thirdparty/opus/silk/float/levinsondurbin_FLP.c deleted file mode 100644 index f0ba606981..0000000000 --- a/thirdparty/opus/silk/float/levinsondurbin_FLP.c +++ /dev/null @@ -1,81 +0,0 @@ -/*********************************************************************** -Copyright (c) 2006-2011, Skype Limited. All rights reserved. -Redistribution and use in source and binary forms, with or without -modification, are permitted provided that the following conditions -are met: -- Redistributions of source code must retain the above copyright notice, -this list of conditions and the following disclaimer. -- Redistributions in binary form must reproduce the above copyright -notice, this list of conditions and the following disclaimer in the -documentation and/or other materials provided with the distribution. -- Neither the name of Internet Society, IETF or IETF Trust, nor the -names of specific contributors, may be used to endorse or promote -products derived from this software without specific prior written -permission. -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE -LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE -POSSIBILITY OF SUCH DAMAGE. -***********************************************************************/ - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#include "SigProc_FLP.h" - -/* Solve the normal equations using the Levinson-Durbin recursion */ -silk_float silk_levinsondurbin_FLP( /* O prediction error energy */ - silk_float A[], /* O prediction coefficients [order] */ - const silk_float corr[], /* I input auto-correlations [order + 1] */ - const opus_int order /* I prediction order */ -) -{ - opus_int i, mHalf, m; - silk_float min_nrg, nrg, t, km, Atmp1, Atmp2; - - min_nrg = 1e-12f * corr[ 0 ] + 1e-9f; - nrg = corr[ 0 ]; - nrg = silk_max_float(min_nrg, nrg); - A[ 0 ] = corr[ 1 ] / nrg; - nrg -= A[ 0 ] * corr[ 1 ]; - nrg = silk_max_float(min_nrg, nrg); - - for( m = 1; m < order; m++ ) - { - t = corr[ m + 1 ]; - for( i = 0; i < m; i++ ) { - t -= A[ i ] * corr[ m - i ]; - } - - /* reflection coefficient */ - km = t / nrg; - - /* residual energy */ - nrg -= km * t; - nrg = silk_max_float(min_nrg, nrg); - - mHalf = m >> 1; - for( i = 0; i < mHalf; i++ ) { - Atmp1 = A[ i ]; - Atmp2 = A[ m - i - 1 ]; - A[ m - i - 1 ] -= km * Atmp1; - A[ i ] -= km * Atmp2; - } - if( m & 1 ) { - A[ mHalf ] -= km * A[ mHalf ]; - } - A[ m ] = km; - } - - /* return the residual energy */ - return nrg; -} - diff --git a/thirdparty/opus/silk/float/main_FLP.h b/thirdparty/opus/silk/float/main_FLP.h index e5a75972e5..5dc0ccf4a4 100644 --- a/thirdparty/opus/silk/float/main_FLP.h +++ b/thirdparty/opus/silk/float/main_FLP.h @@ -56,7 +56,8 @@ void silk_HP_variable_cutoff( /* Encoder main function */ void silk_encode_do_VAD_FLP( - silk_encoder_state_FLP *psEnc /* I/O Encoder state FLP */ + silk_encoder_state_FLP *psEnc, /* I/O Encoder state FLP */ + opus_int activity /* I Decision of Opus voice activity detector */ ); /* Encoder main function */ @@ -79,22 +80,11 @@ opus_int silk_init_encoder( opus_int silk_control_encoder( silk_encoder_state_FLP *psEnc, /* I/O Pointer to Silk encoder state FLP */ silk_EncControlStruct *encControl, /* I Control structure */ - const opus_int32 TargetRate_bps, /* I Target max bitrate (bps) */ const opus_int allow_bw_switch, /* I Flag to allow switching audio bandwidth */ const opus_int channelNb, /* I Channel number */ const opus_int force_fs_kHz ); -/****************/ -/* Prefiltering */ -/****************/ -void silk_prefilter_FLP( - silk_encoder_state_FLP *psEnc, /* I/O Encoder state FLP */ - const silk_encoder_control_FLP *psEncCtrl, /* I Encoder control FLP */ - silk_float xw[], /* O Weighted signal */ - const silk_float x[] /* I Speech signal */ -); - /**************************/ /* Noise shaping analysis */ /**************************/ @@ -153,15 +143,12 @@ void silk_find_LPC_FLP( /* LTP analysis */ void silk_find_LTP_FLP( - silk_float b[ MAX_NB_SUBFR * LTP_ORDER ], /* O LTP coefs */ - silk_float WLTP[ MAX_NB_SUBFR * LTP_ORDER * LTP_ORDER ], /* O Weight for LTP quantization */ - silk_float *LTPredCodGain, /* O LTP coding gain */ - const silk_float r_lpc[], /* I LPC residual */ + silk_float XX[ MAX_NB_SUBFR * LTP_ORDER * LTP_ORDER ], /* O Weight for LTP quantization */ + silk_float xX[ MAX_NB_SUBFR * LTP_ORDER ], /* O Weight for LTP quantization */ + const silk_float r_ptr[], /* I LPC residual */ const opus_int lag[ MAX_NB_SUBFR ], /* I LTP lags */ - const silk_float Wght[ MAX_NB_SUBFR ], /* I Weights */ const opus_int subfr_length, /* I Subframe length */ - const opus_int nb_subfr, /* I number of subframes */ - const opus_int mem_offset /* I Number of samples in LTP memory */ + const opus_int nb_subfr /* I number of subframes */ ); void silk_LTP_analysis_filter_FLP( @@ -198,14 +185,15 @@ void silk_LPC_analysis_filter_FLP( /* LTP tap quantizer */ void silk_quant_LTP_gains_FLP( - silk_float B[ MAX_NB_SUBFR * LTP_ORDER ], /* I/O (Un-)quantized LTP gains */ + silk_float B[ MAX_NB_SUBFR * LTP_ORDER ], /* O Quantized LTP gains */ opus_int8 cbk_index[ MAX_NB_SUBFR ], /* O Codebook index */ opus_int8 *periodicity_index, /* O Periodicity index */ opus_int32 *sum_log_gain_Q7, /* I/O Cumulative max prediction gain */ - const silk_float W[ MAX_NB_SUBFR * LTP_ORDER * LTP_ORDER ], /* I Error weights */ - const opus_int mu_Q10, /* I Mu value (R/D tradeoff) */ - const opus_int lowComplexity, /* I Flag for low complexity */ - const opus_int nb_subfr, /* I number of subframes */ + silk_float *pred_gain_dB, /* O LTP prediction gain */ + const silk_float XX[ MAX_NB_SUBFR * LTP_ORDER * LTP_ORDER ], /* I Correlation matrix */ + const silk_float xX[ MAX_NB_SUBFR * LTP_ORDER ], /* I Correlation vector */ + const opus_int subfr_len, /* I Number of samples per subframe */ + const opus_int nb_subfr, /* I Number of subframes */ int arch /* I Run-time architecture */ ); @@ -245,22 +233,6 @@ void silk_corrVector_FLP( silk_float *Xt /* O X'*t correlation vector [order] */ ); -/* Add noise to matrix diagonal */ -void silk_regularize_correlations_FLP( - silk_float *XX, /* I/O Correlation matrices */ - silk_float *xx, /* I/O Correlation values */ - const silk_float noise, /* I Noise energy to add */ - const opus_int D /* I Dimension of XX */ -); - -/* Function to solve linear equation Ax = b, where A is an MxM symmetric matrix */ -void silk_solve_LDL_FLP( - silk_float *A, /* I/O Symmetric square matrix, out: reg. */ - const opus_int M, /* I Size of matrix */ - const silk_float *b, /* I Pointer to b vector */ - silk_float *x /* O Pointer to x solution vector */ -); - /* Apply sine window to signal vector. */ /* Window types: */ /* 1 -> sine window from 0 to pi/2 */ @@ -285,7 +257,8 @@ void silk_A2NLSF_FLP( void silk_NLSF2A_FLP( silk_float *pAR, /* O LPC coefficients [ LPC_order ] */ const opus_int16 *NLSF_Q15, /* I NLSF vector [ LPC_order ] */ - const opus_int LPC_order /* I LPC order */ + const opus_int LPC_order, /* I LPC order */ + int arch /* I Run-time architecture */ ); /* Limit, stabilize, and quantize NLSFs */ diff --git a/thirdparty/opus/silk/float/noise_shape_analysis_FLP.c b/thirdparty/opus/silk/float/noise_shape_analysis_FLP.c index 65f6ea5870..cb3d8a50b7 100644 --- a/thirdparty/opus/silk/float/noise_shape_analysis_FLP.c +++ b/thirdparty/opus/silk/float/noise_shape_analysis_FLP.c @@ -55,25 +55,21 @@ static OPUS_INLINE silk_float warped_gain( /* Convert warped filter coefficients to monic pseudo-warped coefficients and limit maximum */ /* amplitude of monic warped coefficients by using bandwidth expansion on the true coefficients */ static OPUS_INLINE void warped_true2monic_coefs( - silk_float *coefs_syn, - silk_float *coefs_ana, + silk_float *coefs, silk_float lambda, silk_float limit, opus_int order ) { opus_int i, iter, ind = 0; - silk_float tmp, maxabs, chirp, gain_syn, gain_ana; + silk_float tmp, maxabs, chirp, gain; /* Convert to monic coefficients */ for( i = order - 1; i > 0; i-- ) { - coefs_syn[ i - 1 ] -= lambda * coefs_syn[ i ]; - coefs_ana[ i - 1 ] -= lambda * coefs_ana[ i ]; + coefs[ i - 1 ] -= lambda * coefs[ i ]; } - gain_syn = ( 1.0f - lambda * lambda ) / ( 1.0f + lambda * coefs_syn[ 0 ] ); - gain_ana = ( 1.0f - lambda * lambda ) / ( 1.0f + lambda * coefs_ana[ 0 ] ); + gain = ( 1.0f - lambda * lambda ) / ( 1.0f + lambda * coefs[ 0 ] ); for( i = 0; i < order; i++ ) { - coefs_syn[ i ] *= gain_syn; - coefs_ana[ i ] *= gain_ana; + coefs[ i ] *= gain; } /* Limit */ @@ -81,7 +77,7 @@ static OPUS_INLINE void warped_true2monic_coefs( /* Find maximum absolute value */ maxabs = -1.0f; for( i = 0; i < order; i++ ) { - tmp = silk_max( silk_abs_float( coefs_syn[ i ] ), silk_abs_float( coefs_ana[ i ] ) ); + tmp = silk_abs_float( coefs[ i ] ); if( tmp > maxabs ) { maxabs = tmp; ind = i; @@ -94,36 +90,59 @@ static OPUS_INLINE void warped_true2monic_coefs( /* Convert back to true warped coefficients */ for( i = 1; i < order; i++ ) { - coefs_syn[ i - 1 ] += lambda * coefs_syn[ i ]; - coefs_ana[ i - 1 ] += lambda * coefs_ana[ i ]; + coefs[ i - 1 ] += lambda * coefs[ i ]; } - gain_syn = 1.0f / gain_syn; - gain_ana = 1.0f / gain_ana; + gain = 1.0f / gain; for( i = 0; i < order; i++ ) { - coefs_syn[ i ] *= gain_syn; - coefs_ana[ i ] *= gain_ana; + coefs[ i ] *= gain; } /* Apply bandwidth expansion */ chirp = 0.99f - ( 0.8f + 0.1f * iter ) * ( maxabs - limit ) / ( maxabs * ( ind + 1 ) ); - silk_bwexpander_FLP( coefs_syn, order, chirp ); - silk_bwexpander_FLP( coefs_ana, order, chirp ); + silk_bwexpander_FLP( coefs, order, chirp ); /* Convert to monic warped coefficients */ for( i = order - 1; i > 0; i-- ) { - coefs_syn[ i - 1 ] -= lambda * coefs_syn[ i ]; - coefs_ana[ i - 1 ] -= lambda * coefs_ana[ i ]; + coefs[ i - 1 ] -= lambda * coefs[ i ]; } - gain_syn = ( 1.0f - lambda * lambda ) / ( 1.0f + lambda * coefs_syn[ 0 ] ); - gain_ana = ( 1.0f - lambda * lambda ) / ( 1.0f + lambda * coefs_ana[ 0 ] ); + gain = ( 1.0f - lambda * lambda ) / ( 1.0f + lambda * coefs[ 0 ] ); for( i = 0; i < order; i++ ) { - coefs_syn[ i ] *= gain_syn; - coefs_ana[ i ] *= gain_ana; + coefs[ i ] *= gain; } } silk_assert( 0 ); } +static OPUS_INLINE void limit_coefs( + silk_float *coefs, + silk_float limit, + opus_int order +) { + opus_int i, iter, ind = 0; + silk_float tmp, maxabs, chirp; + + for( iter = 0; iter < 10; iter++ ) { + /* Find maximum absolute value */ + maxabs = -1.0f; + for( i = 0; i < order; i++ ) { + tmp = silk_abs_float( coefs[ i ] ); + if( tmp > maxabs ) { + maxabs = tmp; + ind = i; + } + } + if( maxabs <= limit ) { + /* Coefficients are within range - done */ + return; + } + + /* Apply bandwidth expansion */ + chirp = 0.99f - ( 0.8f + 0.1f * iter ) * ( maxabs - limit ) / ( maxabs * ( ind + 1 ) ); + silk_bwexpander_FLP( coefs, order, chirp ); + } + silk_assert( 0 ); +} + /* Compute noise shaping coefficients and initial gain values */ void silk_noise_shape_analysis_FLP( silk_encoder_state_FLP *psEnc, /* I/O Encoder state FLP */ @@ -133,12 +152,13 @@ void silk_noise_shape_analysis_FLP( ) { silk_shape_state_FLP *psShapeSt = &psEnc->sShape; - opus_int k, nSamples; - silk_float SNR_adj_dB, HarmBoost, HarmShapeGain, Tilt; - silk_float nrg, pre_nrg, log_energy, log_energy_prev, energy_variation; - silk_float delta, BWExp1, BWExp2, gain_mult, gain_add, strength, b, warping; + opus_int k, nSamples, nSegs; + silk_float SNR_adj_dB, HarmShapeGain, Tilt; + silk_float nrg, log_energy, log_energy_prev, energy_variation; + silk_float BWExp, gain_mult, gain_add, strength, b, warping; silk_float x_windowed[ SHAPE_LPC_WIN_MAX ]; silk_float auto_corr[ MAX_SHAPE_LPC_ORDER + 1 ]; + silk_float rc[ MAX_SHAPE_LPC_ORDER + 1 ]; const silk_float *x_ptr, *pitch_res_ptr; /* Point to start of first LPC analysis block */ @@ -176,14 +196,14 @@ void silk_noise_shape_analysis_FLP( if( psEnc->sCmn.indices.signalType == TYPE_VOICED ) { /* Initially set to 0; may be overruled in process_gains(..) */ psEnc->sCmn.indices.quantOffsetType = 0; - psEncCtrl->sparseness = 0.0f; } else { /* Sparseness measure, based on relative fluctuations of energy per 2 milliseconds */ nSamples = 2 * psEnc->sCmn.fs_kHz; energy_variation = 0.0f; log_energy_prev = 0.0f; pitch_res_ptr = pitch_res; - for( k = 0; k < silk_SMULBB( SUB_FRAME_LENGTH_MS, psEnc->sCmn.nb_subfr ) / 2; k++ ) { + nSegs = silk_SMULBB( SUB_FRAME_LENGTH_MS, psEnc->sCmn.nb_subfr ) / 2; + for( k = 0; k < nSegs; k++ ) { nrg = ( silk_float )nSamples + ( silk_float )silk_energy_FLP( pitch_res_ptr, nSamples ); log_energy = silk_log2( nrg ); if( k > 0 ) { @@ -192,17 +212,13 @@ void silk_noise_shape_analysis_FLP( log_energy_prev = log_energy; pitch_res_ptr += nSamples; } - psEncCtrl->sparseness = silk_sigmoid( 0.4f * ( energy_variation - 5.0f ) ); /* Set quantization offset depending on sparseness measure */ - if( psEncCtrl->sparseness > SPARSENESS_THRESHOLD_QNT_OFFSET ) { + if( energy_variation > ENERGY_VARIATION_THRESHOLD_QNT_OFFSET * (nSegs-1) ) { psEnc->sCmn.indices.quantOffsetType = 0; } else { psEnc->sCmn.indices.quantOffsetType = 1; } - - /* Increase coding SNR for sparse signals */ - SNR_adj_dB += SPARSE_SNR_INCR_dB * ( psEncCtrl->sparseness - 0.5f ); } /*******************************/ @@ -210,19 +226,10 @@ void silk_noise_shape_analysis_FLP( /*******************************/ /* More BWE for signals with high prediction gain */ strength = FIND_PITCH_WHITE_NOISE_FRACTION * psEncCtrl->predGain; /* between 0.0 and 1.0 */ - BWExp1 = BWExp2 = BANDWIDTH_EXPANSION / ( 1.0f + strength * strength ); - delta = LOW_RATE_BANDWIDTH_EXPANSION_DELTA * ( 1.0f - 0.75f * psEncCtrl->coding_quality ); - BWExp1 -= delta; - BWExp2 += delta; - /* BWExp1 will be applied after BWExp2, so make it relative */ - BWExp1 /= BWExp2; - - if( psEnc->sCmn.warping_Q16 > 0 ) { - /* Slightly more warping in analysis will move quantization noise up in frequency, where it's better masked */ - warping = (silk_float)psEnc->sCmn.warping_Q16 / 65536.0f + 0.01f * psEncCtrl->coding_quality; - } else { - warping = 0.0f; - } + BWExp = BANDWIDTH_EXPANSION / ( 1.0f + strength * strength ); + + /* Slightly more warping in analysis will move quantization noise up in frequency, where it's better masked */ + warping = (silk_float)psEnc->sCmn.warping_Q16 / 65536.0f + 0.01f * psEncCtrl->coding_quality; /********************************************/ /* Compute noise shaping AR coefs and gains */ @@ -252,37 +259,28 @@ void silk_noise_shape_analysis_FLP( } /* Add white noise, as a fraction of energy */ - auto_corr[ 0 ] += auto_corr[ 0 ] * SHAPE_WHITE_NOISE_FRACTION; + auto_corr[ 0 ] += auto_corr[ 0 ] * SHAPE_WHITE_NOISE_FRACTION + 1.0f; /* Convert correlations to prediction coefficients, and compute residual energy */ - nrg = silk_levinsondurbin_FLP( &psEncCtrl->AR2[ k * MAX_SHAPE_LPC_ORDER ], auto_corr, psEnc->sCmn.shapingLPCOrder ); + nrg = silk_schur_FLP( rc, auto_corr, psEnc->sCmn.shapingLPCOrder ); + silk_k2a_FLP( &psEncCtrl->AR[ k * MAX_SHAPE_LPC_ORDER ], rc, psEnc->sCmn.shapingLPCOrder ); psEncCtrl->Gains[ k ] = ( silk_float )sqrt( nrg ); if( psEnc->sCmn.warping_Q16 > 0 ) { /* Adjust gain for warping */ - psEncCtrl->Gains[ k ] *= warped_gain( &psEncCtrl->AR2[ k * MAX_SHAPE_LPC_ORDER ], warping, psEnc->sCmn.shapingLPCOrder ); + psEncCtrl->Gains[ k ] *= warped_gain( &psEncCtrl->AR[ k * MAX_SHAPE_LPC_ORDER ], warping, psEnc->sCmn.shapingLPCOrder ); } /* Bandwidth expansion for synthesis filter shaping */ - silk_bwexpander_FLP( &psEncCtrl->AR2[ k * MAX_SHAPE_LPC_ORDER ], psEnc->sCmn.shapingLPCOrder, BWExp2 ); - - /* Compute noise shaping filter coefficients */ - silk_memcpy( - &psEncCtrl->AR1[ k * MAX_SHAPE_LPC_ORDER ], - &psEncCtrl->AR2[ k * MAX_SHAPE_LPC_ORDER ], - psEnc->sCmn.shapingLPCOrder * sizeof( silk_float ) ); + silk_bwexpander_FLP( &psEncCtrl->AR[ k * MAX_SHAPE_LPC_ORDER ], psEnc->sCmn.shapingLPCOrder, BWExp ); - /* Bandwidth expansion for analysis filter shaping */ - silk_bwexpander_FLP( &psEncCtrl->AR1[ k * MAX_SHAPE_LPC_ORDER ], psEnc->sCmn.shapingLPCOrder, BWExp1 ); - - /* Ratio of prediction gains, in energy domain */ - pre_nrg = silk_LPC_inverse_pred_gain_FLP( &psEncCtrl->AR2[ k * MAX_SHAPE_LPC_ORDER ], psEnc->sCmn.shapingLPCOrder ); - nrg = silk_LPC_inverse_pred_gain_FLP( &psEncCtrl->AR1[ k * MAX_SHAPE_LPC_ORDER ], psEnc->sCmn.shapingLPCOrder ); - psEncCtrl->GainsPre[ k ] = 1.0f - 0.7f * ( 1.0f - pre_nrg / nrg ); - - /* Convert to monic warped prediction coefficients and limit absolute values */ - warped_true2monic_coefs( &psEncCtrl->AR2[ k * MAX_SHAPE_LPC_ORDER ], &psEncCtrl->AR1[ k * MAX_SHAPE_LPC_ORDER ], - warping, 3.999f, psEnc->sCmn.shapingLPCOrder ); + if( psEnc->sCmn.warping_Q16 > 0 ) { + /* Convert to monic warped prediction coefficients and limit absolute values */ + warped_true2monic_coefs( &psEncCtrl->AR[ k * MAX_SHAPE_LPC_ORDER ], warping, 3.999f, psEnc->sCmn.shapingLPCOrder ); + } else { + /* Limit absolute values */ + limit_coefs( &psEncCtrl->AR[ k * MAX_SHAPE_LPC_ORDER ], 3.999f, psEnc->sCmn.shapingLPCOrder ); + } } /*****************/ @@ -296,11 +294,6 @@ void silk_noise_shape_analysis_FLP( psEncCtrl->Gains[ k ] += gain_add; } - gain_mult = 1.0f + INPUT_TILT + psEncCtrl->coding_quality * HIGH_RATE_INPUT_TILT; - for( k = 0; k < psEnc->sCmn.nb_subfr; k++ ) { - psEncCtrl->GainsPre[ k ] *= gain_mult; - } - /************************************************/ /* Control low-frequency shaping and noise tilt */ /************************************************/ @@ -331,12 +324,6 @@ void silk_noise_shape_analysis_FLP( /****************************/ /* HARMONIC SHAPING CONTROL */ /****************************/ - /* Control boosting of harmonic frequencies */ - HarmBoost = LOW_RATE_HARMONIC_BOOST * ( 1.0f - psEncCtrl->coding_quality ) * psEnc->LTPCorr; - - /* More harmonic boost for noisy input signals */ - HarmBoost += LOW_INPUT_QUALITY_HARMONIC_BOOST * ( 1.0f - psEncCtrl->input_quality ); - if( USE_HARM_SHAPING && psEnc->sCmn.indices.signalType == TYPE_VOICED ) { /* Harmonic noise shaping */ HarmShapeGain = HARMONIC_SHAPING; @@ -355,8 +342,6 @@ void silk_noise_shape_analysis_FLP( /* Smooth over subframes */ /*************************/ for( k = 0; k < psEnc->sCmn.nb_subfr; k++ ) { - psShapeSt->HarmBoost_smth += SUBFR_SMTH_COEF * ( HarmBoost - psShapeSt->HarmBoost_smth ); - psEncCtrl->HarmBoost[ k ] = psShapeSt->HarmBoost_smth; psShapeSt->HarmShapeGain_smth += SUBFR_SMTH_COEF * ( HarmShapeGain - psShapeSt->HarmShapeGain_smth ); psEncCtrl->HarmShapeGain[ k ] = psShapeSt->HarmShapeGain_smth; psShapeSt->Tilt_smth += SUBFR_SMTH_COEF * ( Tilt - psShapeSt->Tilt_smth ); diff --git a/thirdparty/opus/silk/float/pitch_analysis_core_FLP.c b/thirdparty/opus/silk/float/pitch_analysis_core_FLP.c index d0e637a29d..f351bc3718 100644 --- a/thirdparty/opus/silk/float/pitch_analysis_core_FLP.c +++ b/thirdparty/opus/silk/float/pitch_analysis_core_FLP.c @@ -109,11 +109,11 @@ opus_int silk_pitch_analysis_core_FLP( /* O Voicing estimate: 0 voiced, const opus_int8 *Lag_CB_ptr; /* Check for valid sampling frequency */ - silk_assert( Fs_kHz == 8 || Fs_kHz == 12 || Fs_kHz == 16 ); + celt_assert( Fs_kHz == 8 || Fs_kHz == 12 || Fs_kHz == 16 ); /* Check for valid complexity setting */ - silk_assert( complexity >= SILK_PE_MIN_COMPLEX ); - silk_assert( complexity <= SILK_PE_MAX_COMPLEX ); + celt_assert( complexity >= SILK_PE_MIN_COMPLEX ); + celt_assert( complexity <= SILK_PE_MAX_COMPLEX ); silk_assert( search_thres1 >= 0.0f && search_thres1 <= 1.0f ); silk_assert( search_thres2 >= 0.0f && search_thres2 <= 1.0f ); @@ -148,7 +148,7 @@ opus_int silk_pitch_analysis_core_FLP( /* O Voicing estimate: 0 voiced, silk_resampler_down2_3( filt_state, frame_8_FIX, frame_12_FIX, frame_length ); silk_short2float_array( frame_8kHz, frame_8_FIX, frame_length_8kHz ); } else { - silk_assert( Fs_kHz == 8 ); + celt_assert( Fs_kHz == 8 ); silk_float2short_array( frame_8_FIX, frame, frame_length_8kHz ); } @@ -159,7 +159,7 @@ opus_int silk_pitch_analysis_core_FLP( /* O Voicing estimate: 0 voiced, /* Low-pass filter */ for( i = frame_length_4kHz - 1; i > 0; i-- ) { - frame_4kHz[ i ] += frame_4kHz[ i - 1 ]; + frame_4kHz[ i ] = silk_ADD_SAT16( frame_4kHz[ i ], frame_4kHz[ i - 1 ] ); } /****************************************************************************** @@ -169,14 +169,14 @@ opus_int silk_pitch_analysis_core_FLP( /* O Voicing estimate: 0 voiced, target_ptr = &frame_4kHz[ silk_LSHIFT( sf_length_4kHz, 2 ) ]; for( k = 0; k < nb_subfr >> 1; k++ ) { /* Check that we are within range of the array */ - silk_assert( target_ptr >= frame_4kHz ); - silk_assert( target_ptr + sf_length_8kHz <= frame_4kHz + frame_length_4kHz ); + celt_assert( target_ptr >= frame_4kHz ); + celt_assert( target_ptr + sf_length_8kHz <= frame_4kHz + frame_length_4kHz ); basis_ptr = target_ptr - min_lag_4kHz; /* Check that we are within range of the array */ - silk_assert( basis_ptr >= frame_4kHz ); - silk_assert( basis_ptr + sf_length_8kHz <= frame_4kHz + frame_length_4kHz ); + celt_assert( basis_ptr >= frame_4kHz ); + celt_assert( basis_ptr + sf_length_8kHz <= frame_4kHz + frame_length_4kHz ); celt_pitch_xcorr( target_ptr, target_ptr-max_lag_4kHz, xcorr, sf_length_8kHz, max_lag_4kHz - min_lag_4kHz + 1, arch ); @@ -215,7 +215,7 @@ opus_int silk_pitch_analysis_core_FLP( /* O Voicing estimate: 0 voiced, /* Sort */ length_d_srch = 4 + 2 * complexity; - silk_assert( 3 * length_d_srch <= PE_D_SRCH_LENGTH ); + celt_assert( 3 * length_d_srch <= PE_D_SRCH_LENGTH ); silk_insertion_sort_decreasing_FLP( &C[ 0 ][ min_lag_4kHz ], d_srch, max_lag_4kHz - min_lag_4kHz + 1, length_d_srch ); /* Escape if correlation is very low already here */ @@ -238,7 +238,7 @@ opus_int silk_pitch_analysis_core_FLP( /* O Voicing estimate: 0 voiced, break; } } - silk_assert( length_d_srch > 0 ); + celt_assert( length_d_srch > 0 ); for( i = min_lag_8kHz - 5; i < max_lag_8kHz + 5; i++ ) { d_comp[ i ] = 0; @@ -471,7 +471,7 @@ opus_int silk_pitch_analysis_core_FLP( /* O Voicing estimate: 0 voiced, *lagIndex = (opus_int16)( lag - min_lag_8kHz ); *contourIndex = (opus_int8)CBimax; } - silk_assert( *lagIndex >= 0 ); + celt_assert( *lagIndex >= 0 ); /* return as voiced */ return 0; } @@ -506,8 +506,8 @@ static void silk_P_Ana_calc_corr_st3( opus_val32 xcorr[ SCRATCH_SIZE ]; const opus_int8 *Lag_range_ptr, *Lag_CB_ptr; - silk_assert( complexity >= SILK_PE_MIN_COMPLEX ); - silk_assert( complexity <= SILK_PE_MAX_COMPLEX ); + celt_assert( complexity >= SILK_PE_MIN_COMPLEX ); + celt_assert( complexity <= SILK_PE_MAX_COMPLEX ); if( nb_subfr == PE_MAX_NB_SUBFR ) { Lag_range_ptr = &silk_Lag_range_stage3[ complexity ][ 0 ][ 0 ]; @@ -515,7 +515,7 @@ static void silk_P_Ana_calc_corr_st3( nb_cbk_search = silk_nb_cbk_searchs_stage3[ complexity ]; cbk_size = PE_NB_CBKS_STAGE3_MAX; } else { - silk_assert( nb_subfr == PE_MAX_NB_SUBFR >> 1); + celt_assert( nb_subfr == PE_MAX_NB_SUBFR >> 1); Lag_range_ptr = &silk_Lag_range_stage3_10_ms[ 0 ][ 0 ]; Lag_CB_ptr = &silk_CB_lags_stage3_10_ms[ 0 ][ 0 ]; nb_cbk_search = PE_NB_CBKS_STAGE3_10MS; @@ -572,8 +572,8 @@ static void silk_P_Ana_calc_energy_st3( silk_float scratch_mem[ SCRATCH_SIZE ]; const opus_int8 *Lag_range_ptr, *Lag_CB_ptr; - silk_assert( complexity >= SILK_PE_MIN_COMPLEX ); - silk_assert( complexity <= SILK_PE_MAX_COMPLEX ); + celt_assert( complexity >= SILK_PE_MIN_COMPLEX ); + celt_assert( complexity <= SILK_PE_MAX_COMPLEX ); if( nb_subfr == PE_MAX_NB_SUBFR ) { Lag_range_ptr = &silk_Lag_range_stage3[ complexity ][ 0 ][ 0 ]; @@ -581,7 +581,7 @@ static void silk_P_Ana_calc_energy_st3( nb_cbk_search = silk_nb_cbk_searchs_stage3[ complexity ]; cbk_size = PE_NB_CBKS_STAGE3_MAX; } else { - silk_assert( nb_subfr == PE_MAX_NB_SUBFR >> 1); + celt_assert( nb_subfr == PE_MAX_NB_SUBFR >> 1); Lag_range_ptr = &silk_Lag_range_stage3_10_ms[ 0 ][ 0 ]; Lag_CB_ptr = &silk_CB_lags_stage3_10_ms[ 0 ][ 0 ]; nb_cbk_search = PE_NB_CBKS_STAGE3_10MS; diff --git a/thirdparty/opus/silk/float/prefilter_FLP.c b/thirdparty/opus/silk/float/prefilter_FLP.c deleted file mode 100644 index 8bc32fb410..0000000000 --- a/thirdparty/opus/silk/float/prefilter_FLP.c +++ /dev/null @@ -1,206 +0,0 @@ -/*********************************************************************** -Copyright (c) 2006-2011, Skype Limited. All rights reserved. -Redistribution and use in source and binary forms, with or without -modification, are permitted provided that the following conditions -are met: -- Redistributions of source code must retain the above copyright notice, -this list of conditions and the following disclaimer. -- Redistributions in binary form must reproduce the above copyright -notice, this list of conditions and the following disclaimer in the -documentation and/or other materials provided with the distribution. -- Neither the name of Internet Society, IETF or IETF Trust, nor the -names of specific contributors, may be used to endorse or promote -products derived from this software without specific prior written -permission. -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE -LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE -POSSIBILITY OF SUCH DAMAGE. -***********************************************************************/ - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#include "main_FLP.h" -#include "tuning_parameters.h" - -/* -* Prefilter for finding Quantizer input signal -*/ -static OPUS_INLINE void silk_prefilt_FLP( - silk_prefilter_state_FLP *P, /* I/O state */ - silk_float st_res[], /* I */ - silk_float xw[], /* O */ - silk_float *HarmShapeFIR, /* I */ - silk_float Tilt, /* I */ - silk_float LF_MA_shp, /* I */ - silk_float LF_AR_shp, /* I */ - opus_int lag, /* I */ - opus_int length /* I */ -); - -static void silk_warped_LPC_analysis_filter_FLP( - silk_float state[], /* I/O State [order + 1] */ - silk_float res[], /* O Residual signal [length] */ - const silk_float coef[], /* I Coefficients [order] */ - const silk_float input[], /* I Input signal [length] */ - const silk_float lambda, /* I Warping factor */ - const opus_int length, /* I Length of input signal */ - const opus_int order /* I Filter order (even) */ -) -{ - opus_int n, i; - silk_float acc, tmp1, tmp2; - - /* Order must be even */ - silk_assert( ( order & 1 ) == 0 ); - - for( n = 0; n < length; n++ ) { - /* Output of lowpass section */ - tmp2 = state[ 0 ] + lambda * state[ 1 ]; - state[ 0 ] = input[ n ]; - /* Output of allpass section */ - tmp1 = state[ 1 ] + lambda * ( state[ 2 ] - tmp2 ); - state[ 1 ] = tmp2; - acc = coef[ 0 ] * tmp2; - /* Loop over allpass sections */ - for( i = 2; i < order; i += 2 ) { - /* Output of allpass section */ - tmp2 = state[ i ] + lambda * ( state[ i + 1 ] - tmp1 ); - state[ i ] = tmp1; - acc += coef[ i - 1 ] * tmp1; - /* Output of allpass section */ - tmp1 = state[ i + 1 ] + lambda * ( state[ i + 2 ] - tmp2 ); - state[ i + 1 ] = tmp2; - acc += coef[ i ] * tmp2; - } - state[ order ] = tmp1; - acc += coef[ order - 1 ] * tmp1; - res[ n ] = input[ n ] - acc; - } -} - -/* -* silk_prefilter. Main prefilter function -*/ -void silk_prefilter_FLP( - silk_encoder_state_FLP *psEnc, /* I/O Encoder state FLP */ - const silk_encoder_control_FLP *psEncCtrl, /* I Encoder control FLP */ - silk_float xw[], /* O Weighted signal */ - const silk_float x[] /* I Speech signal */ -) -{ - silk_prefilter_state_FLP *P = &psEnc->sPrefilt; - opus_int j, k, lag; - silk_float HarmShapeGain, Tilt, LF_MA_shp, LF_AR_shp; - silk_float B[ 2 ]; - const silk_float *AR1_shp; - const silk_float *px; - silk_float *pxw; - silk_float HarmShapeFIR[ 3 ]; - silk_float st_res[ MAX_SUB_FRAME_LENGTH + MAX_LPC_ORDER ]; - - /* Set up pointers */ - px = x; - pxw = xw; - lag = P->lagPrev; - for( k = 0; k < psEnc->sCmn.nb_subfr; k++ ) { - /* Update Variables that change per sub frame */ - if( psEnc->sCmn.indices.signalType == TYPE_VOICED ) { - lag = psEncCtrl->pitchL[ k ]; - } - - /* Noise shape parameters */ - HarmShapeGain = psEncCtrl->HarmShapeGain[ k ] * ( 1.0f - psEncCtrl->HarmBoost[ k ] ); - HarmShapeFIR[ 0 ] = 0.25f * HarmShapeGain; - HarmShapeFIR[ 1 ] = 32767.0f / 65536.0f * HarmShapeGain; - HarmShapeFIR[ 2 ] = 0.25f * HarmShapeGain; - Tilt = psEncCtrl->Tilt[ k ]; - LF_MA_shp = psEncCtrl->LF_MA_shp[ k ]; - LF_AR_shp = psEncCtrl->LF_AR_shp[ k ]; - AR1_shp = &psEncCtrl->AR1[ k * MAX_SHAPE_LPC_ORDER ]; - - /* Short term FIR filtering */ - silk_warped_LPC_analysis_filter_FLP( P->sAR_shp, st_res, AR1_shp, px, - (silk_float)psEnc->sCmn.warping_Q16 / 65536.0f, psEnc->sCmn.subfr_length, psEnc->sCmn.shapingLPCOrder ); - - /* Reduce (mainly) low frequencies during harmonic emphasis */ - B[ 0 ] = psEncCtrl->GainsPre[ k ]; - B[ 1 ] = -psEncCtrl->GainsPre[ k ] * - ( psEncCtrl->HarmBoost[ k ] * HarmShapeGain + INPUT_TILT + psEncCtrl->coding_quality * HIGH_RATE_INPUT_TILT ); - pxw[ 0 ] = B[ 0 ] * st_res[ 0 ] + B[ 1 ] * P->sHarmHP; - for( j = 1; j < psEnc->sCmn.subfr_length; j++ ) { - pxw[ j ] = B[ 0 ] * st_res[ j ] + B[ 1 ] * st_res[ j - 1 ]; - } - P->sHarmHP = st_res[ psEnc->sCmn.subfr_length - 1 ]; - - silk_prefilt_FLP( P, pxw, pxw, HarmShapeFIR, Tilt, LF_MA_shp, LF_AR_shp, lag, psEnc->sCmn.subfr_length ); - - px += psEnc->sCmn.subfr_length; - pxw += psEnc->sCmn.subfr_length; - } - P->lagPrev = psEncCtrl->pitchL[ psEnc->sCmn.nb_subfr - 1 ]; -} - -/* -* Prefilter for finding Quantizer input signal -*/ -static OPUS_INLINE void silk_prefilt_FLP( - silk_prefilter_state_FLP *P, /* I/O state */ - silk_float st_res[], /* I */ - silk_float xw[], /* O */ - silk_float *HarmShapeFIR, /* I */ - silk_float Tilt, /* I */ - silk_float LF_MA_shp, /* I */ - silk_float LF_AR_shp, /* I */ - opus_int lag, /* I */ - opus_int length /* I */ -) -{ - opus_int i; - opus_int idx, LTP_shp_buf_idx; - silk_float n_Tilt, n_LF, n_LTP; - silk_float sLF_AR_shp, sLF_MA_shp; - silk_float *LTP_shp_buf; - - /* To speed up use temp variables instead of using the struct */ - LTP_shp_buf = P->sLTP_shp; - LTP_shp_buf_idx = P->sLTP_shp_buf_idx; - sLF_AR_shp = P->sLF_AR_shp; - sLF_MA_shp = P->sLF_MA_shp; - - for( i = 0; i < length; i++ ) { - if( lag > 0 ) { - silk_assert( HARM_SHAPE_FIR_TAPS == 3 ); - idx = lag + LTP_shp_buf_idx; - n_LTP = LTP_shp_buf[ ( idx - HARM_SHAPE_FIR_TAPS / 2 - 1) & LTP_MASK ] * HarmShapeFIR[ 0 ]; - n_LTP += LTP_shp_buf[ ( idx - HARM_SHAPE_FIR_TAPS / 2 ) & LTP_MASK ] * HarmShapeFIR[ 1 ]; - n_LTP += LTP_shp_buf[ ( idx - HARM_SHAPE_FIR_TAPS / 2 + 1) & LTP_MASK ] * HarmShapeFIR[ 2 ]; - } else { - n_LTP = 0; - } - - n_Tilt = sLF_AR_shp * Tilt; - n_LF = sLF_AR_shp * LF_AR_shp + sLF_MA_shp * LF_MA_shp; - - sLF_AR_shp = st_res[ i ] - n_Tilt; - sLF_MA_shp = sLF_AR_shp - n_LF; - - LTP_shp_buf_idx = ( LTP_shp_buf_idx - 1 ) & LTP_MASK; - LTP_shp_buf[ LTP_shp_buf_idx ] = sLF_MA_shp; - - xw[ i ] = sLF_MA_shp - n_LTP; - } - /* Copy temp variable back to state */ - P->sLF_AR_shp = sLF_AR_shp; - P->sLF_MA_shp = sLF_MA_shp; - P->sLTP_shp_buf_idx = LTP_shp_buf_idx; -} diff --git a/thirdparty/opus/silk/float/residual_energy_FLP.c b/thirdparty/opus/silk/float/residual_energy_FLP.c index b2e03a86a4..1bd07b33a4 100644 --- a/thirdparty/opus/silk/float/residual_energy_FLP.c +++ b/thirdparty/opus/silk/float/residual_energy_FLP.c @@ -47,7 +47,7 @@ silk_float silk_residual_energy_covar_FLP( /* O silk_float tmp, nrg = 0.0f, regularization; /* Safety checks */ - silk_assert( D >= 0 ); + celt_assert( D >= 0 ); regularization = REGULARIZATION_FACTOR * ( wXX[ 0 ] + wXX[ D * D - 1 ] ); for( k = 0; k < MAX_ITERATIONS_RESIDUAL_NRG; k++ ) { diff --git a/thirdparty/opus/silk/float/schur_FLP.c b/thirdparty/opus/silk/float/schur_FLP.c index ee436f8351..8526c748d3 100644 --- a/thirdparty/opus/silk/float/schur_FLP.c +++ b/thirdparty/opus/silk/float/schur_FLP.c @@ -38,22 +38,23 @@ silk_float silk_schur_FLP( /* O returns residual energy ) { opus_int k, n; - silk_float C[ SILK_MAX_ORDER_LPC + 1 ][ 2 ]; - silk_float Ctmp1, Ctmp2, rc_tmp; + double C[ SILK_MAX_ORDER_LPC + 1 ][ 2 ]; + double Ctmp1, Ctmp2, rc_tmp; - silk_assert( order==6||order==8||order==10||order==12||order==14||order==16 ); + celt_assert( order >= 0 && order <= SILK_MAX_ORDER_LPC ); /* Copy correlations */ - for( k = 0; k < order+1; k++ ) { + k = 0; + do { C[ k ][ 0 ] = C[ k ][ 1 ] = auto_corr[ k ]; - } + } while( ++k <= order ); for( k = 0; k < order; k++ ) { /* Get reflection coefficient */ rc_tmp = -C[ k + 1 ][ 0 ] / silk_max_float( C[ 0 ][ 1 ], 1e-9f ); /* Save the output */ - refl_coef[ k ] = rc_tmp; + refl_coef[ k ] = (silk_float)rc_tmp; /* Update correlations */ for( n = 0; n < order - k; n++ ) { @@ -65,6 +66,5 @@ silk_float silk_schur_FLP( /* O returns residual energy } /* Return residual energy */ - return C[ 0 ][ 1 ]; + return (silk_float)C[ 0 ][ 1 ]; } - diff --git a/thirdparty/opus/silk/float/solve_LS_FLP.c b/thirdparty/opus/silk/float/solve_LS_FLP.c deleted file mode 100644 index 7c90d665a0..0000000000 --- a/thirdparty/opus/silk/float/solve_LS_FLP.c +++ /dev/null @@ -1,207 +0,0 @@ -/*********************************************************************** -Copyright (c) 2006-2011, Skype Limited. All rights reserved. -Redistribution and use in source and binary forms, with or without -modification, are permitted provided that the following conditions -are met: -- Redistributions of source code must retain the above copyright notice, -this list of conditions and the following disclaimer. -- Redistributions in binary form must reproduce the above copyright -notice, this list of conditions and the following disclaimer in the -documentation and/or other materials provided with the distribution. -- Neither the name of Internet Society, IETF or IETF Trust, nor the -names of specific contributors, may be used to endorse or promote -products derived from this software without specific prior written -permission. -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" -AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE -LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR -CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF -SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN -CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) -ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE -POSSIBILITY OF SUCH DAMAGE. -***********************************************************************/ - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#include "main_FLP.h" -#include "tuning_parameters.h" - -/********************************************************************** - * LDL Factorisation. Finds the upper triangular matrix L and the diagonal - * Matrix D (only the diagonal elements returned in a vector)such that - * the symmetric matric A is given by A = L*D*L'. - **********************************************************************/ -static OPUS_INLINE void silk_LDL_FLP( - silk_float *A, /* I/O Pointer to Symetric Square Matrix */ - opus_int M, /* I Size of Matrix */ - silk_float *L, /* I/O Pointer to Square Upper triangular Matrix */ - silk_float *Dinv /* I/O Pointer to vector holding the inverse diagonal elements of D */ -); - -/********************************************************************** - * Function to solve linear equation Ax = b, when A is a MxM lower - * triangular matrix, with ones on the diagonal. - **********************************************************************/ -static OPUS_INLINE void silk_SolveWithLowerTriangularWdiagOnes_FLP( - const silk_float *L, /* I Pointer to Lower Triangular Matrix */ - opus_int M, /* I Dim of Matrix equation */ - const silk_float *b, /* I b Vector */ - silk_float *x /* O x Vector */ -); - -/********************************************************************** - * Function to solve linear equation (A^T)x = b, when A is a MxM lower - * triangular, with ones on the diagonal. (ie then A^T is upper triangular) - **********************************************************************/ -static OPUS_INLINE void silk_SolveWithUpperTriangularFromLowerWdiagOnes_FLP( - const silk_float *L, /* I Pointer to Lower Triangular Matrix */ - opus_int M, /* I Dim of Matrix equation */ - const silk_float *b, /* I b Vector */ - silk_float *x /* O x Vector */ -); - -/********************************************************************** - * Function to solve linear equation Ax = b, when A is a MxM - * symmetric square matrix - using LDL factorisation - **********************************************************************/ -void silk_solve_LDL_FLP( - silk_float *A, /* I/O Symmetric square matrix, out: reg. */ - const opus_int M, /* I Size of matrix */ - const silk_float *b, /* I Pointer to b vector */ - silk_float *x /* O Pointer to x solution vector */ -) -{ - opus_int i; - silk_float L[ MAX_MATRIX_SIZE ][ MAX_MATRIX_SIZE ]; - silk_float T[ MAX_MATRIX_SIZE ]; - silk_float Dinv[ MAX_MATRIX_SIZE ]; /* inverse diagonal elements of D*/ - - silk_assert( M <= MAX_MATRIX_SIZE ); - - /*************************************************** - Factorize A by LDL such that A = L*D*(L^T), - where L is lower triangular with ones on diagonal - ****************************************************/ - silk_LDL_FLP( A, M, &L[ 0 ][ 0 ], Dinv ); - - /**************************************************** - * substitute D*(L^T) = T. ie: - L*D*(L^T)*x = b => L*T = b <=> T = inv(L)*b - ******************************************************/ - silk_SolveWithLowerTriangularWdiagOnes_FLP( &L[ 0 ][ 0 ], M, b, T ); - - /**************************************************** - D*(L^T)*x = T <=> (L^T)*x = inv(D)*T, because D is - diagonal just multiply with 1/d_i - ****************************************************/ - for( i = 0; i < M; i++ ) { - T[ i ] = T[ i ] * Dinv[ i ]; - } - /**************************************************** - x = inv(L') * inv(D) * T - *****************************************************/ - silk_SolveWithUpperTriangularFromLowerWdiagOnes_FLP( &L[ 0 ][ 0 ], M, T, x ); -} - -static OPUS_INLINE void silk_SolveWithUpperTriangularFromLowerWdiagOnes_FLP( - const silk_float *L, /* I Pointer to Lower Triangular Matrix */ - opus_int M, /* I Dim of Matrix equation */ - const silk_float *b, /* I b Vector */ - silk_float *x /* O x Vector */ -) -{ - opus_int i, j; - silk_float temp; - const silk_float *ptr1; - - for( i = M - 1; i >= 0; i-- ) { - ptr1 = matrix_adr( L, 0, i, M ); - temp = 0; - for( j = M - 1; j > i ; j-- ) { - temp += ptr1[ j * M ] * x[ j ]; - } - temp = b[ i ] - temp; - x[ i ] = temp; - } -} - -static OPUS_INLINE void silk_SolveWithLowerTriangularWdiagOnes_FLP( - const silk_float *L, /* I Pointer to Lower Triangular Matrix */ - opus_int M, /* I Dim of Matrix equation */ - const silk_float *b, /* I b Vector */ - silk_float *x /* O x Vector */ -) -{ - opus_int i, j; - silk_float temp; - const silk_float *ptr1; - - for( i = 0; i < M; i++ ) { - ptr1 = matrix_adr( L, i, 0, M ); - temp = 0; - for( j = 0; j < i; j++ ) { - temp += ptr1[ j ] * x[ j ]; - } - temp = b[ i ] - temp; - x[ i ] = temp; - } -} - -static OPUS_INLINE void silk_LDL_FLP( - silk_float *A, /* I/O Pointer to Symetric Square Matrix */ - opus_int M, /* I Size of Matrix */ - silk_float *L, /* I/O Pointer to Square Upper triangular Matrix */ - silk_float *Dinv /* I/O Pointer to vector holding the inverse diagonal elements of D */ -) -{ - opus_int i, j, k, loop_count, err = 1; - silk_float *ptr1, *ptr2; - double temp, diag_min_value; - silk_float v[ MAX_MATRIX_SIZE ], D[ MAX_MATRIX_SIZE ]; /* temp arrays*/ - - silk_assert( M <= MAX_MATRIX_SIZE ); - - diag_min_value = FIND_LTP_COND_FAC * 0.5f * ( A[ 0 ] + A[ M * M - 1 ] ); - for( loop_count = 0; loop_count < M && err == 1; loop_count++ ) { - err = 0; - for( j = 0; j < M; j++ ) { - ptr1 = matrix_adr( L, j, 0, M ); - temp = matrix_ptr( A, j, j, M ); /* element in row j column j*/ - for( i = 0; i < j; i++ ) { - v[ i ] = ptr1[ i ] * D[ i ]; - temp -= ptr1[ i ] * v[ i ]; - } - if( temp < diag_min_value ) { - /* Badly conditioned matrix: add white noise and run again */ - temp = ( loop_count + 1 ) * diag_min_value - temp; - for( i = 0; i < M; i++ ) { - matrix_ptr( A, i, i, M ) += ( silk_float )temp; - } - err = 1; - break; - } - D[ j ] = ( silk_float )temp; - Dinv[ j ] = ( silk_float )( 1.0f / temp ); - matrix_ptr( L, j, j, M ) = 1.0f; - - ptr1 = matrix_adr( A, j, 0, M ); - ptr2 = matrix_adr( L, j + 1, 0, M); - for( i = j + 1; i < M; i++ ) { - temp = 0.0; - for( k = 0; k < j; k++ ) { - temp += ptr2[ k ] * v[ k ]; - } - matrix_ptr( L, i, j, M ) = ( silk_float )( ( ptr1[ i ] - temp ) * Dinv[ j ] ); - ptr2 += M; /* go to next column*/ - } - } - } - silk_assert( err == 0 ); -} - diff --git a/thirdparty/opus/silk/float/sort_FLP.c b/thirdparty/opus/silk/float/sort_FLP.c index f08d7592c5..0e18f31950 100644 --- a/thirdparty/opus/silk/float/sort_FLP.c +++ b/thirdparty/opus/silk/float/sort_FLP.c @@ -47,9 +47,9 @@ void silk_insertion_sort_decreasing_FLP( opus_int i, j; /* Safety checks */ - silk_assert( K > 0 ); - silk_assert( L > 0 ); - silk_assert( L >= K ); + celt_assert( K > 0 ); + celt_assert( L > 0 ); + celt_assert( L >= K ); /* Write start indices in index vector */ for( i = 0; i < K; i++ ) { diff --git a/thirdparty/opus/silk/float/structs_FLP.h b/thirdparty/opus/silk/float/structs_FLP.h index 14d647ced2..3150b386e4 100644 --- a/thirdparty/opus/silk/float/structs_FLP.h +++ b/thirdparty/opus/silk/float/structs_FLP.h @@ -42,32 +42,16 @@ extern "C" /********************************/ typedef struct { opus_int8 LastGainIndex; - silk_float HarmBoost_smth; silk_float HarmShapeGain_smth; silk_float Tilt_smth; } silk_shape_state_FLP; /********************************/ -/* Prefilter state */ -/********************************/ -typedef struct { - silk_float sLTP_shp[ LTP_BUF_LENGTH ]; - silk_float sAR_shp[ MAX_SHAPE_LPC_ORDER + 1 ]; - opus_int sLTP_shp_buf_idx; - silk_float sLF_AR_shp; - silk_float sLF_MA_shp; - silk_float sHarmHP; - opus_int32 rand_seed; - opus_int lagPrev; -} silk_prefilter_state_FLP; - -/********************************/ /* Encoder state FLP */ /********************************/ typedef struct { silk_encoder_state sCmn; /* Common struct, shared with fixed-point code */ silk_shape_state_FLP sShape; /* Noise shaping state */ - silk_prefilter_state_FLP sPrefilt; /* Prefilter State */ /* Buffer for find pitch and noise shape analysis */ silk_float x_buf[ 2 * MAX_FRAME_LENGTH + LA_SHAPE_MAX ];/* Buffer for find pitch and noise shape analysis */ @@ -86,12 +70,9 @@ typedef struct { opus_int pitchL[ MAX_NB_SUBFR ]; /* Noise shaping parameters */ - silk_float AR1[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ]; - silk_float AR2[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ]; + silk_float AR[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ]; silk_float LF_MA_shp[ MAX_NB_SUBFR ]; silk_float LF_AR_shp[ MAX_NB_SUBFR ]; - silk_float GainsPre[ MAX_NB_SUBFR ]; - silk_float HarmBoost[ MAX_NB_SUBFR ]; silk_float Tilt[ MAX_NB_SUBFR ]; silk_float HarmShapeGain[ MAX_NB_SUBFR ]; silk_float Lambda; @@ -99,7 +80,6 @@ typedef struct { silk_float coding_quality; /* Measures */ - silk_float sparseness; silk_float predGain; silk_float LTPredCodGain; silk_float ResNrg[ MAX_NB_SUBFR ]; /* Residual energy per subframe */ diff --git a/thirdparty/opus/silk/float/warped_autocorrelation_FLP.c b/thirdparty/opus/silk/float/warped_autocorrelation_FLP.c index 542414f48e..09186e73d4 100644 --- a/thirdparty/opus/silk/float/warped_autocorrelation_FLP.c +++ b/thirdparty/opus/silk/float/warped_autocorrelation_FLP.c @@ -42,11 +42,11 @@ void silk_warped_autocorrelation_FLP( { opus_int n, i; double tmp1, tmp2; - double state[ MAX_SHAPE_LPC_ORDER + 1 ] = { 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 }; - double C[ MAX_SHAPE_LPC_ORDER + 1 ] = { 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 }; + double state[ MAX_SHAPE_LPC_ORDER + 1 ] = { 0 }; + double C[ MAX_SHAPE_LPC_ORDER + 1 ] = { 0 }; /* Order must be even */ - silk_assert( ( order & 1 ) == 0 ); + celt_assert( ( order & 1 ) == 0 ); /* Loop over samples */ for( n = 0; n < length; n++ ) { diff --git a/thirdparty/opus/silk/float/wrappers_FLP.c b/thirdparty/opus/silk/float/wrappers_FLP.c index 6666b8efaa..ad90b874a4 100644 --- a/thirdparty/opus/silk/float/wrappers_FLP.c +++ b/thirdparty/opus/silk/float/wrappers_FLP.c @@ -54,13 +54,14 @@ void silk_A2NLSF_FLP( void silk_NLSF2A_FLP( silk_float *pAR, /* O LPC coefficients [ LPC_order ] */ const opus_int16 *NLSF_Q15, /* I NLSF vector [ LPC_order ] */ - const opus_int LPC_order /* I LPC order */ + const opus_int LPC_order, /* I LPC order */ + int arch /* I Run-time architecture */ ) { opus_int i; opus_int16 a_fix_Q12[ MAX_LPC_ORDER ]; - silk_NLSF2A( a_fix_Q12, NLSF_Q15, LPC_order ); + silk_NLSF2A( a_fix_Q12, NLSF_Q15, LPC_order, arch ); for( i = 0; i < LPC_order; i++ ) { pAR[ i ] = ( silk_float )a_fix_Q12[ i ] * ( 1.0f / 4096.0f ); @@ -102,14 +103,14 @@ void silk_NSQ_wrapper_FLP( ) { opus_int i, j; - opus_int32 x_Q3[ MAX_FRAME_LENGTH ]; + opus_int16 x16[ MAX_FRAME_LENGTH ]; opus_int32 Gains_Q16[ MAX_NB_SUBFR ]; silk_DWORD_ALIGN opus_int16 PredCoef_Q12[ 2 ][ MAX_LPC_ORDER ]; opus_int16 LTPCoef_Q14[ LTP_ORDER * MAX_NB_SUBFR ]; opus_int LTP_scale_Q14; /* Noise shaping parameters */ - opus_int16 AR2_Q13[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ]; + opus_int16 AR_Q13[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ]; opus_int32 LF_shp_Q14[ MAX_NB_SUBFR ]; /* Packs two int16 coefficients per int32 value */ opus_int Lambda_Q10; opus_int Tilt_Q14[ MAX_NB_SUBFR ]; @@ -119,7 +120,7 @@ void silk_NSQ_wrapper_FLP( /* Noise shape parameters */ for( i = 0; i < psEnc->sCmn.nb_subfr; i++ ) { for( j = 0; j < psEnc->sCmn.shapingLPCOrder; j++ ) { - AR2_Q13[ i * MAX_SHAPE_LPC_ORDER + j ] = silk_float2int( psEncCtrl->AR2[ i * MAX_SHAPE_LPC_ORDER + j ] * 8192.0f ); + AR_Q13[ i * MAX_SHAPE_LPC_ORDER + j ] = silk_float2int( psEncCtrl->AR[ i * MAX_SHAPE_LPC_ORDER + j ] * 8192.0f ); } } @@ -155,16 +156,16 @@ void silk_NSQ_wrapper_FLP( /* Convert input to fix */ for( i = 0; i < psEnc->sCmn.frame_length; i++ ) { - x_Q3[ i ] = silk_float2int( 8.0f * x[ i ] ); + x16[ i ] = silk_float2int( x[ i ] ); } /* Call NSQ */ if( psEnc->sCmn.nStatesDelayedDecision > 1 || psEnc->sCmn.warping_Q16 > 0 ) { - silk_NSQ_del_dec( &psEnc->sCmn, psNSQ, psIndices, x_Q3, pulses, PredCoef_Q12[ 0 ], LTPCoef_Q14, - AR2_Q13, HarmShapeGain_Q14, Tilt_Q14, LF_shp_Q14, Gains_Q16, psEncCtrl->pitchL, Lambda_Q10, LTP_scale_Q14, psEnc->sCmn.arch ); + silk_NSQ_del_dec( &psEnc->sCmn, psNSQ, psIndices, x16, pulses, PredCoef_Q12[ 0 ], LTPCoef_Q14, + AR_Q13, HarmShapeGain_Q14, Tilt_Q14, LF_shp_Q14, Gains_Q16, psEncCtrl->pitchL, Lambda_Q10, LTP_scale_Q14, psEnc->sCmn.arch ); } else { - silk_NSQ( &psEnc->sCmn, psNSQ, psIndices, x_Q3, pulses, PredCoef_Q12[ 0 ], LTPCoef_Q14, - AR2_Q13, HarmShapeGain_Q14, Tilt_Q14, LF_shp_Q14, Gains_Q16, psEncCtrl->pitchL, Lambda_Q10, LTP_scale_Q14, psEnc->sCmn.arch ); + silk_NSQ( &psEnc->sCmn, psNSQ, psIndices, x16, pulses, PredCoef_Q12[ 0 ], LTPCoef_Q14, + AR_Q13, HarmShapeGain_Q14, Tilt_Q14, LF_shp_Q14, Gains_Q16, psEncCtrl->pitchL, Lambda_Q10, LTP_scale_Q14, psEnc->sCmn.arch ); } } @@ -172,31 +173,35 @@ void silk_NSQ_wrapper_FLP( /* Floating-point Silk LTP quantiation wrapper */ /***********************************************/ void silk_quant_LTP_gains_FLP( - silk_float B[ MAX_NB_SUBFR * LTP_ORDER ], /* I/O (Un-)quantized LTP gains */ + silk_float B[ MAX_NB_SUBFR * LTP_ORDER ], /* O Quantized LTP gains */ opus_int8 cbk_index[ MAX_NB_SUBFR ], /* O Codebook index */ opus_int8 *periodicity_index, /* O Periodicity index */ opus_int32 *sum_log_gain_Q7, /* I/O Cumulative max prediction gain */ - const silk_float W[ MAX_NB_SUBFR * LTP_ORDER * LTP_ORDER ], /* I Error weights */ - const opus_int mu_Q10, /* I Mu value (R/D tradeoff) */ - const opus_int lowComplexity, /* I Flag for low complexity */ - const opus_int nb_subfr, /* I number of subframes */ + silk_float *pred_gain_dB, /* O LTP prediction gain */ + const silk_float XX[ MAX_NB_SUBFR * LTP_ORDER * LTP_ORDER ], /* I Correlation matrix */ + const silk_float xX[ MAX_NB_SUBFR * LTP_ORDER ], /* I Correlation vector */ + const opus_int subfr_len, /* I Number of samples per subframe */ + const opus_int nb_subfr, /* I Number of subframes */ int arch /* I Run-time architecture */ ) { - opus_int i; + opus_int i, pred_gain_dB_Q7; opus_int16 B_Q14[ MAX_NB_SUBFR * LTP_ORDER ]; - opus_int32 W_Q18[ MAX_NB_SUBFR*LTP_ORDER*LTP_ORDER ]; + opus_int32 XX_Q17[ MAX_NB_SUBFR * LTP_ORDER * LTP_ORDER ]; + opus_int32 xX_Q17[ MAX_NB_SUBFR * LTP_ORDER ]; - for( i = 0; i < nb_subfr * LTP_ORDER; i++ ) { - B_Q14[ i ] = (opus_int16)silk_float2int( B[ i ] * 16384.0f ); - } for( i = 0; i < nb_subfr * LTP_ORDER * LTP_ORDER; i++ ) { - W_Q18[ i ] = (opus_int32)silk_float2int( W[ i ] * 262144.0f ); + XX_Q17[ i ] = (opus_int32)silk_float2int( XX[ i ] * 131072.0f ); + } + for( i = 0; i < nb_subfr * LTP_ORDER; i++ ) { + xX_Q17[ i ] = (opus_int32)silk_float2int( xX[ i ] * 131072.0f ); } - silk_quant_LTP_gains( B_Q14, cbk_index, periodicity_index, sum_log_gain_Q7, W_Q18, mu_Q10, lowComplexity, nb_subfr, arch ); + silk_quant_LTP_gains( B_Q14, cbk_index, periodicity_index, sum_log_gain_Q7, &pred_gain_dB_Q7, XX_Q17, xX_Q17, subfr_len, nb_subfr, arch ); for( i = 0; i < nb_subfr * LTP_ORDER; i++ ) { B[ i ] = (silk_float)B_Q14[ i ] * ( 1.0f / 16384.0f ); } + + *pred_gain_dB = (silk_float)pred_gain_dB_Q7 * ( 1.0f / 128.0f ); } diff --git a/thirdparty/opus/silk/gain_quant.c b/thirdparty/opus/silk/gain_quant.c index 64ccd0611b..ee65245aa3 100644 --- a/thirdparty/opus/silk/gain_quant.c +++ b/thirdparty/opus/silk/gain_quant.c @@ -76,6 +76,7 @@ void silk_gains_quant( /* Accumulate deltas */ if( ind[ k ] > double_step_size_threshold ) { *prev_ind += silk_LSHIFT( ind[ k ], 1 ) - double_step_size_threshold; + *prev_ind = silk_min_int( *prev_ind, N_LEVELS_QGAIN - 1 ); } else { *prev_ind += ind[ k ]; } diff --git a/thirdparty/opus/silk/init_decoder.c b/thirdparty/opus/silk/init_decoder.c index f887c67886..16c03dcd1c 100644 --- a/thirdparty/opus/silk/init_decoder.c +++ b/thirdparty/opus/silk/init_decoder.c @@ -44,6 +44,7 @@ opus_int silk_init_decoder( /* Used to deactivate LSF interpolation */ psDec->first_frame_after_reset = 1; psDec->prev_gain_Q16 = 65536; + psDec->arch = opus_select_arch(); /* Reset CNG state */ silk_CNG_Reset( psDec ); diff --git a/thirdparty/opus/silk/interpolate.c b/thirdparty/opus/silk/interpolate.c index 1bd8ca4d53..833c28ef8e 100644 --- a/thirdparty/opus/silk/interpolate.c +++ b/thirdparty/opus/silk/interpolate.c @@ -42,8 +42,8 @@ void silk_interpolate( { opus_int i; - silk_assert( ifact_Q2 >= 0 ); - silk_assert( ifact_Q2 <= 4 ); + celt_assert( ifact_Q2 >= 0 ); + celt_assert( ifact_Q2 <= 4 ); for( i = 0; i < d; i++ ) { xi[ i ] = (opus_int16)silk_ADD_RSHIFT( x0[ i ], silk_SMULBB( x1[ i ] - x0[ i ], ifact_Q2 ), 2 ); diff --git a/thirdparty/opus/silk/lin2log.c b/thirdparty/opus/silk/lin2log.c index d4fe515321..0d5155aa86 100644 --- a/thirdparty/opus/silk/lin2log.c +++ b/thirdparty/opus/silk/lin2log.c @@ -41,6 +41,6 @@ opus_int32 silk_lin2log( silk_CLZ_FRAC( inLin, &lz, &frac_Q7 ); /* Piece-wise parabolic approximation */ - return silk_LSHIFT( 31 - lz, 7 ) + silk_SMLAWB( frac_Q7, silk_MUL( frac_Q7, 128 - frac_Q7 ), 179 ); + return silk_ADD_LSHIFT32( silk_SMLAWB( frac_Q7, silk_MUL( frac_Q7, 128 - frac_Q7 ), 179 ), 31 - lz, 7 ); } diff --git a/thirdparty/opus/silk/macros.h b/thirdparty/opus/silk/macros.h index d3ca347520..3c67b6e5d9 100644 --- a/thirdparty/opus/silk/macros.h +++ b/thirdparty/opus/silk/macros.h @@ -36,14 +36,6 @@ POSSIBILITY OF SUCH DAMAGE. #include "opus_defines.h" #include "arch.h" -#if OPUS_GNUC_PREREQ(3, 0) -#define opus_likely(x) (__builtin_expect(!!(x), 1)) -#define opus_unlikely(x) (__builtin_expect(!!(x), 0)) -#else -#define opus_likely(x) (!!(x)) -#define opus_unlikely(x) (!!(x)) -#endif - /* This is an OPUS_INLINE header file for general platform. */ /* (a32 * (opus_int32)((opus_int16)(b32))) >> 16 output have to be 32bit int */ diff --git a/thirdparty/opus/silk/main.h b/thirdparty/opus/silk/main.h index 2f90d68f7d..1a33eed549 100644 --- a/thirdparty/opus/silk/main.h +++ b/thirdparty/opus/silk/main.h @@ -42,6 +42,10 @@ POSSIBILITY OF SUCH DAMAGE. #include "x86/main_sse.h" #endif +#if (defined(OPUS_ARM_ASM) || defined(OPUS_ARM_MAY_HAVE_NEON_INTR)) +#include "arm/NSQ_del_dec_arm.h" +#endif + /* Convert Left/Right stereo signal to adaptive Mid/Side representation */ void silk_stereo_LR_to_MS( stereo_enc_state *state, /* I/O State */ @@ -109,22 +113,22 @@ void silk_stereo_decode_mid_only( /* Encodes signs of excitation */ void silk_encode_signs( - ec_enc *psRangeEnc, /* I/O Compressor data structure */ - const opus_int8 pulses[], /* I pulse signal */ - opus_int length, /* I length of input */ - const opus_int signalType, /* I Signal type */ - const opus_int quantOffsetType, /* I Quantization offset type */ - const opus_int sum_pulses[ MAX_NB_SHELL_BLOCKS ] /* I Sum of absolute pulses per block */ + ec_enc *psRangeEnc, /* I/O Compressor data structure */ + const opus_int8 pulses[], /* I pulse signal */ + opus_int length, /* I length of input */ + const opus_int signalType, /* I Signal type */ + const opus_int quantOffsetType, /* I Quantization offset type */ + const opus_int sum_pulses[ MAX_NB_SHELL_BLOCKS ] /* I Sum of absolute pulses per block */ ); /* Decodes signs of excitation */ void silk_decode_signs( - ec_dec *psRangeDec, /* I/O Compressor data structure */ - opus_int16 pulses[], /* I/O pulse signal */ - opus_int length, /* I length of input */ - const opus_int signalType, /* I Signal type */ - const opus_int quantOffsetType, /* I Quantization offset type */ - const opus_int sum_pulses[ MAX_NB_SHELL_BLOCKS ] /* I Sum of absolute pulses per block */ + ec_dec *psRangeDec, /* I/O Compressor data structure */ + opus_int16 pulses[], /* I/O pulse signal */ + opus_int length, /* I length of input */ + const opus_int signalType, /* I Signal type */ + const opus_int quantOffsetType, /* I Quantization offset type */ + const opus_int sum_pulses[ MAX_NB_SHELL_BLOCKS ] /* I Sum of absolute pulses per block */ ); /* Check encoder control struct */ @@ -205,37 +209,37 @@ void silk_interpolate( /* LTP tap quantizer */ void silk_quant_LTP_gains( - opus_int16 B_Q14[ MAX_NB_SUBFR * LTP_ORDER ], /* I/O (un)quantized LTP gains */ + opus_int16 B_Q14[ MAX_NB_SUBFR * LTP_ORDER ], /* O Quantized LTP gains */ opus_int8 cbk_index[ MAX_NB_SUBFR ], /* O Codebook Index */ opus_int8 *periodicity_index, /* O Periodicity Index */ opus_int32 *sum_gain_dB_Q7, /* I/O Cumulative max prediction gain */ - const opus_int32 W_Q18[ MAX_NB_SUBFR*LTP_ORDER*LTP_ORDER ], /* I Error Weights in Q18 */ - opus_int mu_Q9, /* I Mu value (R/D tradeoff) */ - opus_int lowComplexity, /* I Flag for low complexity */ - const opus_int nb_subfr, /* I number of subframes */ + opus_int *pred_gain_dB_Q7, /* O LTP prediction gain */ + const opus_int32 XX_Q17[ MAX_NB_SUBFR*LTP_ORDER*LTP_ORDER ], /* I Correlation matrix in Q18 */ + const opus_int32 xX_Q17[ MAX_NB_SUBFR*LTP_ORDER ], /* I Correlation vector in Q18 */ + const opus_int subfr_len, /* I Number of samples per subframe */ + const opus_int nb_subfr, /* I Number of subframes */ int arch /* I Run-time architecture */ ); /* Entropy constrained matrix-weighted VQ, for a single input data vector */ void silk_VQ_WMat_EC_c( opus_int8 *ind, /* O index of best codebook vector */ - opus_int32 *rate_dist_Q14, /* O best weighted quant error + mu * rate */ + opus_int32 *res_nrg_Q15, /* O best residual energy */ + opus_int32 *rate_dist_Q8, /* O best total bitrate */ opus_int *gain_Q7, /* O sum of absolute LTP coefficients */ - const opus_int16 *in_Q14, /* I input vector to be quantized */ - const opus_int32 *W_Q18, /* I weighting matrix */ + const opus_int32 *XX_Q17, /* I correlation matrix */ + const opus_int32 *xX_Q17, /* I correlation vector */ const opus_int8 *cb_Q7, /* I codebook */ const opus_uint8 *cb_gain_Q7, /* I codebook effective gain */ const opus_uint8 *cl_Q5, /* I code length for each codebook vector */ - const opus_int mu_Q9, /* I tradeoff betw. weighted error and rate */ + const opus_int subfr_len, /* I number of samples per subframe */ const opus_int32 max_gain_Q7, /* I maximum sum of absolute LTP coefficients */ - opus_int L /* I number of vectors in codebook */ + const opus_int L /* I number of vectors in codebook */ ); #if !defined(OVERRIDE_silk_VQ_WMat_EC) -#define silk_VQ_WMat_EC(ind, rate_dist_Q14, gain_Q7, in_Q14, W_Q18, cb_Q7, cb_gain_Q7, cl_Q5, \ - mu_Q9, max_gain_Q7, L, arch) \ - ((void)(arch),silk_VQ_WMat_EC_c(ind, rate_dist_Q14, gain_Q7, in_Q14, W_Q18, cb_Q7, cb_gain_Q7, cl_Q5, \ - mu_Q9, max_gain_Q7, L)) +#define silk_VQ_WMat_EC(ind, res_nrg_Q15, rate_dist_Q8, gain_Q7, XX_Q17, xX_Q17, cb_Q7, cb_gain_Q7, cl_Q5, subfr_len, max_gain_Q7, L, arch) \ + ((void)(arch),silk_VQ_WMat_EC_c(ind, res_nrg_Q15, rate_dist_Q8, gain_Q7, XX_Q17, xX_Q17, cb_Q7, cb_gain_Q7, cl_Q5, subfr_len, max_gain_Q7, L)) #endif /************************************/ @@ -243,14 +247,14 @@ void silk_VQ_WMat_EC_c( /************************************/ void silk_NSQ_c( - const silk_encoder_state *psEncC, /* I/O Encoder State */ + const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ SideInfoIndices *psIndices, /* I/O Quantization Indices */ - const opus_int32 x_Q3[], /* I Prefiltered input signal */ + const opus_int16 x16[], /* I Input */ opus_int8 pulses[], /* O Quantized pulse signal */ const opus_int16 PredCoef_Q12[ 2 * MAX_LPC_ORDER ], /* I Short term prediction coefs */ const opus_int16 LTPCoef_Q14[ LTP_ORDER * MAX_NB_SUBFR ], /* I Long term prediction coefs */ - const opus_int16 AR2_Q13[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ], /* I Noise shaping coefs */ + const opus_int16 AR_Q13[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ], /* I Noise shaping coefs */ const opus_int HarmShapeGain_Q14[ MAX_NB_SUBFR ], /* I Long term shaping coefs */ const opus_int Tilt_Q14[ MAX_NB_SUBFR ], /* I Spectral tilt */ const opus_int32 LF_shp_Q14[ MAX_NB_SUBFR ], /* I Low frequency shaping coefs */ @@ -261,22 +265,22 @@ void silk_NSQ_c( ); #if !defined(OVERRIDE_silk_NSQ) -#define silk_NSQ(psEncC, NSQ, psIndices, x_Q3, pulses, PredCoef_Q12, LTPCoef_Q14, AR2_Q13, \ +#define silk_NSQ(psEncC, NSQ, psIndices, x16, pulses, PredCoef_Q12, LTPCoef_Q14, AR_Q13, \ HarmShapeGain_Q14, Tilt_Q14, LF_shp_Q14, Gains_Q16, pitchL, Lambda_Q10, LTP_scale_Q14, arch) \ - ((void)(arch),silk_NSQ_c(psEncC, NSQ, psIndices, x_Q3, pulses, PredCoef_Q12, LTPCoef_Q14, AR2_Q13, \ + ((void)(arch),silk_NSQ_c(psEncC, NSQ, psIndices, x16, pulses, PredCoef_Q12, LTPCoef_Q14, AR_Q13, \ HarmShapeGain_Q14, Tilt_Q14, LF_shp_Q14, Gains_Q16, pitchL, Lambda_Q10, LTP_scale_Q14)) #endif /* Noise shaping using delayed decision */ void silk_NSQ_del_dec_c( - const silk_encoder_state *psEncC, /* I/O Encoder State */ + const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ SideInfoIndices *psIndices, /* I/O Quantization Indices */ - const opus_int32 x_Q3[], /* I Prefiltered input signal */ + const opus_int16 x16[], /* I Input */ opus_int8 pulses[], /* O Quantized pulse signal */ const opus_int16 PredCoef_Q12[ 2 * MAX_LPC_ORDER ], /* I Short term prediction coefs */ const opus_int16 LTPCoef_Q14[ LTP_ORDER * MAX_NB_SUBFR ], /* I Long term prediction coefs */ - const opus_int16 AR2_Q13[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ], /* I Noise shaping coefs */ + const opus_int16 AR_Q13[ MAX_NB_SUBFR * MAX_SHAPE_LPC_ORDER ], /* I Noise shaping coefs */ const opus_int HarmShapeGain_Q14[ MAX_NB_SUBFR ], /* I Long term shaping coefs */ const opus_int Tilt_Q14[ MAX_NB_SUBFR ], /* I Spectral tilt */ const opus_int32 LF_shp_Q14[ MAX_NB_SUBFR ], /* I Low frequency shaping coefs */ @@ -287,9 +291,9 @@ void silk_NSQ_del_dec_c( ); #if !defined(OVERRIDE_silk_NSQ_del_dec) -#define silk_NSQ_del_dec(psEncC, NSQ, psIndices, x_Q3, pulses, PredCoef_Q12, LTPCoef_Q14, AR2_Q13, \ +#define silk_NSQ_del_dec(psEncC, NSQ, psIndices, x16, pulses, PredCoef_Q12, LTPCoef_Q14, AR_Q13, \ HarmShapeGain_Q14, Tilt_Q14, LF_shp_Q14, Gains_Q16, pitchL, Lambda_Q10, LTP_scale_Q14, arch) \ - ((void)(arch),silk_NSQ_del_dec_c(psEncC, NSQ, psIndices, x_Q3, pulses, PredCoef_Q12, LTPCoef_Q14, AR2_Q13, \ + ((void)(arch),silk_NSQ_del_dec_c(psEncC, NSQ, psIndices, x16, pulses, PredCoef_Q12, LTPCoef_Q14, AR_Q13, \ HarmShapeGain_Q14, Tilt_Q14, LF_shp_Q14, Gains_Q16, pitchL, Lambda_Q10, LTP_scale_Q14)) #endif @@ -346,6 +350,7 @@ void silk_NLSF_VQ( opus_int32 err_Q26[], /* O Quantization errors [K] */ const opus_int16 in_Q15[], /* I Input vectors to be quantized [LPC_order] */ const opus_uint8 pCB_Q8[], /* I Codebook vectors [K*LPC_order] */ + const opus_int16 pWght_Q9[], /* I Codebook weights [K*LPC_order] */ const opus_int K, /* I Number of codebook vectors */ const opus_int LPC_order /* I Number of LPCs */ ); diff --git a/thirdparty/opus/silk/mips/NSQ_del_dec_mipsr1.h b/thirdparty/opus/silk/mips/NSQ_del_dec_mipsr1.h index ad1cfe2a9b..cd70713a8f 100644 --- a/thirdparty/opus/silk/mips/NSQ_del_dec_mipsr1.h +++ b/thirdparty/opus/silk/mips/NSQ_del_dec_mipsr1.h @@ -61,7 +61,7 @@ static inline void silk_noise_shape_quantizer_del_dec( opus_int predictLPCOrder, /* I Prediction filter order */ opus_int warping_Q16, /* I */ opus_int nStatesDelayedDecision, /* I Number of states in decision tree */ - opus_int *smpl_buf_idx, /* I Index to newest samples in buffers */ + opus_int *smpl_buf_idx, /* I/O Index to newest samples in buffers */ opus_int decisionDelay, /* I */ int arch /* I */ ) @@ -323,8 +323,9 @@ static inline void silk_noise_shape_quantizer_del_dec( psSS[ 1 ].xq_Q14 = xq_Q14; } - *smpl_buf_idx = ( *smpl_buf_idx - 1 ) & DECISION_DELAY_MASK; /* Index to newest samples */ - last_smple_idx = ( *smpl_buf_idx + decisionDelay ) & DECISION_DELAY_MASK; /* Index to decisionDelay old samples */ + *smpl_buf_idx = ( *smpl_buf_idx - 1 ) % DECISION_DELAY; + if( *smpl_buf_idx < 0 ) *smpl_buf_idx += DECISION_DELAY; + last_smple_idx = ( *smpl_buf_idx + decisionDelay ) % DECISION_DELAY; /* Find winner */ RDmin_Q10 = psSampleState[ 0 ][ 0 ].RD_Q10; diff --git a/thirdparty/opus/silk/mips/sigproc_fix_mipsr1.h b/thirdparty/opus/silk/mips/sigproc_fix_mipsr1.h index 3b0a695365..51520c0a6f 100644 --- a/thirdparty/opus/silk/mips/sigproc_fix_mipsr1.h +++ b/thirdparty/opus/silk/mips/sigproc_fix_mipsr1.h @@ -28,11 +28,6 @@ POSSIBILITY OF SUCH DAMAGE. #ifndef SILK_SIGPROC_FIX_MIPSR1_H #define SILK_SIGPROC_FIX_MIPSR1_H -#ifdef __cplusplus -extern "C" -{ -#endif - #undef silk_SAT16 static inline short int silk_SAT16(int a) { diff --git a/thirdparty/opus/silk/process_NLSFs.c b/thirdparty/opus/silk/process_NLSFs.c index 0ab71f0163..d130809541 100644 --- a/thirdparty/opus/silk/process_NLSFs.c +++ b/thirdparty/opus/silk/process_NLSFs.c @@ -48,7 +48,7 @@ void silk_process_NLSFs( silk_assert( psEncC->speech_activity_Q8 >= 0 ); silk_assert( psEncC->speech_activity_Q8 <= SILK_FIX_CONST( 1.0, 8 ) ); - silk_assert( psEncC->useInterpolatedNLSFs == 1 || psEncC->indices.NLSFInterpCoef_Q2 == ( 1 << 2 ) ); + celt_assert( psEncC->useInterpolatedNLSFs == 1 || psEncC->indices.NLSFInterpCoef_Q2 == ( 1 << 2 ) ); /***********************/ /* Calculate mu values */ @@ -60,7 +60,7 @@ void silk_process_NLSFs( NLSF_mu_Q20 = silk_ADD_RSHIFT( NLSF_mu_Q20, NLSF_mu_Q20, 1 ); } - silk_assert( NLSF_mu_Q20 > 0 ); + celt_assert( NLSF_mu_Q20 > 0 ); silk_assert( NLSF_mu_Q20 <= SILK_FIX_CONST( 0.005, 20 ) ); /* Calculate NLSF weights */ @@ -89,7 +89,7 @@ void silk_process_NLSFs( NLSF_mu_Q20, psEncC->NLSF_MSVQ_Survivors, psEncC->indices.signalType ); /* Convert quantized NLSFs back to LPC coefficients */ - silk_NLSF2A( PredCoef_Q12[ 1 ], pNLSF_Q15, psEncC->predictLPCOrder ); + silk_NLSF2A( PredCoef_Q12[ 1 ], pNLSF_Q15, psEncC->predictLPCOrder, psEncC->arch ); if( doInterpolate ) { /* Calculate the interpolated, quantized LSF vector for the first half */ @@ -97,11 +97,11 @@ void silk_process_NLSFs( psEncC->indices.NLSFInterpCoef_Q2, psEncC->predictLPCOrder ); /* Convert back to LPC coefficients */ - silk_NLSF2A( PredCoef_Q12[ 0 ], pNLSF0_temp_Q15, psEncC->predictLPCOrder ); + silk_NLSF2A( PredCoef_Q12[ 0 ], pNLSF0_temp_Q15, psEncC->predictLPCOrder, psEncC->arch ); } else { /* Copy LPC coefficients for first half from second half */ - silk_assert( psEncC->predictLPCOrder <= MAX_LPC_ORDER ); + celt_assert( psEncC->predictLPCOrder <= MAX_LPC_ORDER ); silk_memcpy( PredCoef_Q12[ 0 ], PredCoef_Q12[ 1 ], psEncC->predictLPCOrder * sizeof( opus_int16 ) ); } } diff --git a/thirdparty/opus/silk/quant_LTP_gains.c b/thirdparty/opus/silk/quant_LTP_gains.c index 513a8c4468..d6b8eff8d1 100644 --- a/thirdparty/opus/silk/quant_LTP_gains.c +++ b/thirdparty/opus/silk/quant_LTP_gains.c @@ -33,14 +33,15 @@ POSSIBILITY OF SUCH DAMAGE. #include "tuning_parameters.h" void silk_quant_LTP_gains( - opus_int16 B_Q14[ MAX_NB_SUBFR * LTP_ORDER ], /* I/O (un)quantized LTP gains */ + opus_int16 B_Q14[ MAX_NB_SUBFR * LTP_ORDER ], /* O Quantized LTP gains */ opus_int8 cbk_index[ MAX_NB_SUBFR ], /* O Codebook Index */ opus_int8 *periodicity_index, /* O Periodicity Index */ opus_int32 *sum_log_gain_Q7, /* I/O Cumulative max prediction gain */ - const opus_int32 W_Q18[ MAX_NB_SUBFR*LTP_ORDER*LTP_ORDER ], /* I Error Weights in Q18 */ - opus_int mu_Q9, /* I Mu value (R/D tradeoff) */ - opus_int lowComplexity, /* I Flag for low complexity */ - const opus_int nb_subfr, /* I number of subframes */ + opus_int *pred_gain_dB_Q7, /* O LTP prediction gain */ + const opus_int32 XX_Q17[ MAX_NB_SUBFR*LTP_ORDER*LTP_ORDER ], /* I Correlation matrix in Q18 */ + const opus_int32 xX_Q17[ MAX_NB_SUBFR*LTP_ORDER ], /* I Correlation vector in Q18 */ + const opus_int subfr_len, /* I Number of samples per subframe */ + const opus_int nb_subfr, /* I Number of subframes */ int arch /* I Run-time architecture */ ) { @@ -49,16 +50,16 @@ void silk_quant_LTP_gains( const opus_uint8 *cl_ptr_Q5; const opus_int8 *cbk_ptr_Q7; const opus_uint8 *cbk_gain_ptr_Q7; - const opus_int16 *b_Q14_ptr; - const opus_int32 *W_Q18_ptr; - opus_int32 rate_dist_Q14_subfr, rate_dist_Q14, min_rate_dist_Q14; - opus_int32 sum_log_gain_tmp_Q7, best_sum_log_gain_Q7, max_gain_Q7, gain_Q7; + const opus_int32 *XX_Q17_ptr, *xX_Q17_ptr; + opus_int32 res_nrg_Q15_subfr, res_nrg_Q15, rate_dist_Q7_subfr, rate_dist_Q7, min_rate_dist_Q7; + opus_int32 sum_log_gain_tmp_Q7, best_sum_log_gain_Q7, max_gain_Q7; + opus_int gain_Q7; /***************************************************/ /* iterate over different codebooks with different */ /* rates/distortions, and choose best */ /***************************************************/ - min_rate_dist_Q14 = silk_int32_MAX; + min_rate_dist_Q7 = silk_int32_MAX; best_sum_log_gain_Q7 = 0; for( k = 0; k < 3; k++ ) { /* Safety margin for pitch gain control, to take into account factors @@ -70,53 +71,47 @@ void silk_quant_LTP_gains( cbk_gain_ptr_Q7 = silk_LTP_vq_gain_ptrs_Q7[ k ]; cbk_size = silk_LTP_vq_sizes[ k ]; - /* Set up pointer to first subframe */ - W_Q18_ptr = W_Q18; - b_Q14_ptr = B_Q14; + /* Set up pointers to first subframe */ + XX_Q17_ptr = XX_Q17; + xX_Q17_ptr = xX_Q17; - rate_dist_Q14 = 0; + res_nrg_Q15 = 0; + rate_dist_Q7 = 0; sum_log_gain_tmp_Q7 = *sum_log_gain_Q7; for( j = 0; j < nb_subfr; j++ ) { max_gain_Q7 = silk_log2lin( ( SILK_FIX_CONST( MAX_SUM_LOG_GAIN_DB / 6.0, 7 ) - sum_log_gain_tmp_Q7 ) + SILK_FIX_CONST( 7, 7 ) ) - gain_safety; - silk_VQ_WMat_EC( &temp_idx[ j ], /* O index of best codebook vector */ - &rate_dist_Q14_subfr, /* O best weighted quantization error + mu * rate */ + &res_nrg_Q15_subfr, /* O residual energy */ + &rate_dist_Q7_subfr, /* O best weighted quantization error + mu * rate */ &gain_Q7, /* O sum of absolute LTP coefficients */ - b_Q14_ptr, /* I input vector to be quantized */ - W_Q18_ptr, /* I weighting matrix */ + XX_Q17_ptr, /* I correlation matrix */ + xX_Q17_ptr, /* I correlation vector */ cbk_ptr_Q7, /* I codebook */ cbk_gain_ptr_Q7, /* I codebook effective gains */ cl_ptr_Q5, /* I code length for each codebook vector */ - mu_Q9, /* I tradeoff between weighted error and rate */ + subfr_len, /* I number of samples per subframe */ max_gain_Q7, /* I maximum sum of absolute LTP coefficients */ cbk_size, /* I number of vectors in codebook */ arch /* I Run-time architecture */ ); - rate_dist_Q14 = silk_ADD_POS_SAT32( rate_dist_Q14, rate_dist_Q14_subfr ); + res_nrg_Q15 = silk_ADD_POS_SAT32( res_nrg_Q15, res_nrg_Q15_subfr ); + rate_dist_Q7 = silk_ADD_POS_SAT32( rate_dist_Q7, rate_dist_Q7_subfr ); sum_log_gain_tmp_Q7 = silk_max(0, sum_log_gain_tmp_Q7 + silk_lin2log( gain_safety + gain_Q7 ) - SILK_FIX_CONST( 7, 7 )); - b_Q14_ptr += LTP_ORDER; - W_Q18_ptr += LTP_ORDER * LTP_ORDER; + XX_Q17_ptr += LTP_ORDER * LTP_ORDER; + xX_Q17_ptr += LTP_ORDER; } - /* Avoid never finding a codebook */ - rate_dist_Q14 = silk_min( silk_int32_MAX - 1, rate_dist_Q14 ); - - if( rate_dist_Q14 < min_rate_dist_Q14 ) { - min_rate_dist_Q14 = rate_dist_Q14; + if( rate_dist_Q7 <= min_rate_dist_Q7 ) { + min_rate_dist_Q7 = rate_dist_Q7; *periodicity_index = (opus_int8)k; silk_memcpy( cbk_index, temp_idx, nb_subfr * sizeof( opus_int8 ) ); best_sum_log_gain_Q7 = sum_log_gain_tmp_Q7; } - - /* Break early in low-complexity mode if rate distortion is below threshold */ - if( lowComplexity && ( rate_dist_Q14 < silk_LTP_gain_middle_avg_RD_Q14 ) ) { - break; - } } cbk_ptr_Q7 = silk_LTP_vq_ptrs_Q7[ *periodicity_index ]; @@ -125,5 +120,13 @@ void silk_quant_LTP_gains( B_Q14[ j * LTP_ORDER + k ] = silk_LSHIFT( cbk_ptr_Q7[ cbk_index[ j ] * LTP_ORDER + k ], 7 ); } } + + if( nb_subfr == 2 ) { + res_nrg_Q15 = silk_RSHIFT32( res_nrg_Q15, 1 ); + } else { + res_nrg_Q15 = silk_RSHIFT32( res_nrg_Q15, 2 ); + } + *sum_log_gain_Q7 = best_sum_log_gain_Q7; + *pred_gain_dB_Q7 = (opus_int)silk_SMULBB( -3, silk_lin2log( res_nrg_Q15 ) - ( 15 << 7 ) ); } diff --git a/thirdparty/opus/silk/resampler.c b/thirdparty/opus/silk/resampler.c index 374fbb3722..1f11e50891 100644 --- a/thirdparty/opus/silk/resampler.c +++ b/thirdparty/opus/silk/resampler.c @@ -91,14 +91,14 @@ opus_int silk_resampler_init( if( forEnc ) { if( ( Fs_Hz_in != 8000 && Fs_Hz_in != 12000 && Fs_Hz_in != 16000 && Fs_Hz_in != 24000 && Fs_Hz_in != 48000 ) || ( Fs_Hz_out != 8000 && Fs_Hz_out != 12000 && Fs_Hz_out != 16000 ) ) { - silk_assert( 0 ); + celt_assert( 0 ); return -1; } S->inputDelay = delay_matrix_enc[ rateID( Fs_Hz_in ) ][ rateID( Fs_Hz_out ) ]; } else { if( ( Fs_Hz_in != 8000 && Fs_Hz_in != 12000 && Fs_Hz_in != 16000 ) || ( Fs_Hz_out != 8000 && Fs_Hz_out != 12000 && Fs_Hz_out != 16000 && Fs_Hz_out != 24000 && Fs_Hz_out != 48000 ) ) { - silk_assert( 0 ); + celt_assert( 0 ); return -1; } S->inputDelay = delay_matrix_dec[ rateID( Fs_Hz_in ) ][ rateID( Fs_Hz_out ) ]; @@ -151,7 +151,7 @@ opus_int silk_resampler_init( S->Coefs = silk_Resampler_1_6_COEFS; } else { /* None available */ - silk_assert( 0 ); + celt_assert( 0 ); return -1; } } else { @@ -181,9 +181,9 @@ opus_int silk_resampler( opus_int nSamples; /* Need at least 1 ms of input data */ - silk_assert( inLen >= S->Fs_in_kHz ); + celt_assert( inLen >= S->Fs_in_kHz ); /* Delay can't exceed the 1 ms of buffering */ - silk_assert( S->inputDelay <= S->Fs_in_kHz ); + celt_assert( S->inputDelay <= S->Fs_in_kHz ); nSamples = S->Fs_in_kHz - S->inputDelay; diff --git a/thirdparty/opus/silk/resampler_down2.c b/thirdparty/opus/silk/resampler_down2.c index cec3634640..971d7bfd4a 100644 --- a/thirdparty/opus/silk/resampler_down2.c +++ b/thirdparty/opus/silk/resampler_down2.c @@ -43,8 +43,8 @@ void silk_resampler_down2( opus_int32 k, len2 = silk_RSHIFT32( inLen, 1 ); opus_int32 in32, out32, Y, X; - silk_assert( silk_resampler_down2_0 > 0 ); - silk_assert( silk_resampler_down2_1 < 0 ); + celt_assert( silk_resampler_down2_0 > 0 ); + celt_assert( silk_resampler_down2_1 < 0 ); /* Internal variables and state are in Q10 format */ for( k = 0; k < len2; k++ ) { diff --git a/thirdparty/opus/silk/resampler_private_down_FIR.c b/thirdparty/opus/silk/resampler_private_down_FIR.c index 783e42b356..3e8735a35a 100644 --- a/thirdparty/opus/silk/resampler_private_down_FIR.c +++ b/thirdparty/opus/silk/resampler_private_down_FIR.c @@ -136,7 +136,7 @@ static OPUS_INLINE opus_int16 *silk_resampler_private_down_FIR_INTERPOL( } break; default: - silk_assert( 0 ); + celt_assert( 0 ); } return out; } diff --git a/thirdparty/opus/silk/sort.c b/thirdparty/opus/silk/sort.c index 7187c9efb1..4fba16f831 100644 --- a/thirdparty/opus/silk/sort.c +++ b/thirdparty/opus/silk/sort.c @@ -48,9 +48,9 @@ void silk_insertion_sort_increasing( opus_int i, j; /* Safety checks */ - silk_assert( K > 0 ); - silk_assert( L > 0 ); - silk_assert( L >= K ); + celt_assert( K > 0 ); + celt_assert( L > 0 ); + celt_assert( L >= K ); /* Write start indices in index vector */ for( i = 0; i < K; i++ ) { @@ -96,9 +96,9 @@ void silk_insertion_sort_decreasing_int16( opus_int value; /* Safety checks */ - silk_assert( K > 0 ); - silk_assert( L > 0 ); - silk_assert( L >= K ); + celt_assert( K > 0 ); + celt_assert( L > 0 ); + celt_assert( L >= K ); /* Write start indices in index vector */ for( i = 0; i < K; i++ ) { @@ -141,7 +141,7 @@ void silk_insertion_sort_increasing_all_values_int16( opus_int i, j; /* Safety checks */ - silk_assert( L > 0 ); + celt_assert( L > 0 ); /* Sort vector elements by value, increasing order */ for( i = 1; i < L; i++ ) { diff --git a/thirdparty/opus/silk/stereo_LR_to_MS.c b/thirdparty/opus/silk/stereo_LR_to_MS.c index dda0298de2..c8226663c8 100644 --- a/thirdparty/opus/silk/stereo_LR_to_MS.c +++ b/thirdparty/opus/silk/stereo_LR_to_MS.c @@ -109,7 +109,7 @@ void silk_stereo_LR_to_MS( if( total_rate_bps < 1 ) { total_rate_bps = 1; } - min_mid_rate_bps = silk_SMLABB( 2000, fs_kHz, 900 ); + min_mid_rate_bps = silk_SMLABB( 2000, fs_kHz, 600 ); silk_assert( min_mid_rate_bps < 32767 ); /* Default bitrate distribution: 8 parts for Mid and (5+3*frac) parts for Side. so: mid_rate = ( 8 / ( 13 + 3 * frac ) ) * total_ rate */ frac_3_Q16 = silk_MUL( 3, frac_Q16 ); diff --git a/thirdparty/opus/silk/stereo_encode_pred.c b/thirdparty/opus/silk/stereo_encode_pred.c index e6dd195066..03becb6736 100644 --- a/thirdparty/opus/silk/stereo_encode_pred.c +++ b/thirdparty/opus/silk/stereo_encode_pred.c @@ -41,11 +41,11 @@ void silk_stereo_encode_pred( /* Entropy coding */ n = 5 * ix[ 0 ][ 2 ] + ix[ 1 ][ 2 ]; - silk_assert( n < 25 ); + celt_assert( n < 25 ); ec_enc_icdf( psRangeEnc, n, silk_stereo_pred_joint_iCDF, 8 ); for( n = 0; n < 2; n++ ) { - silk_assert( ix[ n ][ 0 ] < 3 ); - silk_assert( ix[ n ][ 1 ] < STEREO_QUANT_SUB_STEPS ); + celt_assert( ix[ n ][ 0 ] < 3 ); + celt_assert( ix[ n ][ 1 ] < STEREO_QUANT_SUB_STEPS ); ec_enc_icdf( psRangeEnc, ix[ n ][ 0 ], silk_uniform3_iCDF, 8 ); ec_enc_icdf( psRangeEnc, ix[ n ][ 1 ], silk_uniform5_iCDF, 8 ); } diff --git a/thirdparty/opus/silk/structs.h b/thirdparty/opus/silk/structs.h index 827829dc6f..3380c757b2 100644 --- a/thirdparty/opus/silk/structs.h +++ b/thirdparty/opus/silk/structs.h @@ -48,6 +48,7 @@ typedef struct { opus_int32 sLPC_Q14[ MAX_SUB_FRAME_LENGTH + NSQ_LPC_BUF_LENGTH ]; opus_int32 sAR2_Q14[ MAX_SHAPE_LPC_ORDER ]; opus_int32 sLF_AR_shp_Q14; + opus_int32 sDiff_shp_Q14; opus_int lagPrev; opus_int sLTP_buf_idx; opus_int sLTP_shp_buf_idx; @@ -77,6 +78,7 @@ typedef struct { opus_int32 In_LP_State[ 2 ]; /* Low pass filter state */ opus_int32 transition_frame_no; /* Counter which is mapped to a cut-off frequency */ opus_int mode; /* Operating mode, <0: switch down, >0: switch up; 0: do nothing */ + opus_int32 saved_fs_kHz; /* If non-zero, holds the last sampling rate before a bandwidth switching reset. */ } silk_LP_state; /* Structure containing NLSF codebook */ @@ -86,6 +88,7 @@ typedef struct { const opus_int16 quantStepSize_Q16; const opus_int16 invQuantStepSize_Q6; const opus_uint8 *CB1_NLSF_Q8; + const opus_int16 *CB1_Wght_Q9; const opus_uint8 *CB1_iCDF; const opus_uint8 *pred_Q8; const opus_uint8 *ec_sel; @@ -169,8 +172,6 @@ typedef struct { opus_int pitchEstimationComplexity; /* Complexity level for pitch estimator */ opus_int pitchEstimationLPCOrder; /* Whitening filter order for pitch estimator */ opus_int32 pitchEstimationThreshold_Q16; /* Threshold for pitch estimator */ - opus_int LTPQuantLowComplexity; /* Flag for low complexity LTP quantization */ - opus_int mu_LTP_Q9; /* Rate-distortion tradeoff in LTP quantization */ opus_int32 sum_log_gain_Q7; /* Cumulative max prediction gain */ opus_int NLSF_MSVQ_Survivors; /* Number of survivors in NLSF MSVQ */ opus_int first_frame_after_reset; /* Flag for deactivating NLSF interpolation, pitch prediction */ @@ -301,6 +302,7 @@ typedef struct { /* Stuff used for PLC */ opus_int lossCnt; opus_int prevSignalType; + int arch; silk_PLC_struct sPLC; diff --git a/thirdparty/opus/silk/sum_sqr_shift.c b/thirdparty/opus/silk/sum_sqr_shift.c index 129df191d8..4fd0c3d7d5 100644 --- a/thirdparty/opus/silk/sum_sqr_shift.c +++ b/thirdparty/opus/silk/sum_sqr_shift.c @@ -41,43 +41,40 @@ void silk_sum_sqr_shift( ) { opus_int i, shft; - opus_int32 nrg_tmp, nrg; + opus_uint32 nrg_tmp; + opus_int32 nrg; - nrg = 0; - shft = 0; - len--; - for( i = 0; i < len; i += 2 ) { - nrg = silk_SMLABB_ovflw( nrg, x[ i ], x[ i ] ); - nrg = silk_SMLABB_ovflw( nrg, x[ i + 1 ], x[ i + 1 ] ); - if( nrg < 0 ) { - /* Scale down */ - nrg = (opus_int32)silk_RSHIFT_uint( (opus_uint32)nrg, 2 ); - shft = 2; - i+=2; - break; - } + /* Do a first run with the maximum shift we could have. */ + shft = 31-silk_CLZ32(len); + /* Let's be conservative with rounding and start with nrg=len. */ + nrg = len; + for( i = 0; i < len - 1; i += 2 ) { + nrg_tmp = silk_SMULBB( x[ i ], x[ i ] ); + nrg_tmp = silk_SMLABB_ovflw( nrg_tmp, x[ i + 1 ], x[ i + 1 ] ); + nrg = (opus_int32)silk_ADD_RSHIFT_uint( nrg, nrg_tmp, shft ); } - for( ; i < len; i += 2 ) { + if( i < len ) { + /* One sample left to process */ + nrg_tmp = silk_SMULBB( x[ i ], x[ i ] ); + nrg = (opus_int32)silk_ADD_RSHIFT_uint( nrg, nrg_tmp, shft ); + } + silk_assert( nrg >= 0 ); + /* Make sure the result will fit in a 32-bit signed integer with two bits + of headroom. */ + shft = silk_max_32(0, shft+3 - silk_CLZ32(nrg)); + nrg = 0; + for( i = 0 ; i < len - 1; i += 2 ) { nrg_tmp = silk_SMULBB( x[ i ], x[ i ] ); nrg_tmp = silk_SMLABB_ovflw( nrg_tmp, x[ i + 1 ], x[ i + 1 ] ); - nrg = (opus_int32)silk_ADD_RSHIFT_uint( nrg, (opus_uint32)nrg_tmp, shft ); - if( nrg < 0 ) { - /* Scale down */ - nrg = (opus_int32)silk_RSHIFT_uint( (opus_uint32)nrg, 2 ); - shft += 2; - } + nrg = (opus_int32)silk_ADD_RSHIFT_uint( nrg, nrg_tmp, shft ); } - if( i == len ) { + if( i < len ) { /* One sample left to process */ nrg_tmp = silk_SMULBB( x[ i ], x[ i ] ); nrg = (opus_int32)silk_ADD_RSHIFT_uint( nrg, nrg_tmp, shft ); } - /* Make sure to have at least one extra leading zero (two leading zeros in total) */ - if( nrg & 0xC0000000 ) { - nrg = silk_RSHIFT_uint( (opus_uint32)nrg, 2 ); - shft += 2; - } + silk_assert( nrg >= 0 ); /* Output arguments */ *shift = shft; diff --git a/thirdparty/opus/silk/tables.h b/thirdparty/opus/silk/tables.h index 7fea6fda39..95230c451a 100644 --- a/thirdparty/opus/silk/tables.h +++ b/thirdparty/opus/silk/tables.h @@ -76,10 +76,8 @@ extern const opus_uint8 silk_NLSF_EXT_iCDF[ 7 ]; extern const opus_uint8 silk_LTP_per_index_iCDF[ 3 ]; /* 3 */ extern const opus_uint8 * const silk_LTP_gain_iCDF_ptrs[ NB_LTP_CBKS ]; /* 3 */ extern const opus_uint8 * const silk_LTP_gain_BITS_Q5_ptrs[ NB_LTP_CBKS ]; /* 3 */ -extern const opus_int16 silk_LTP_gain_middle_avg_RD_Q14; extern const opus_int8 * const silk_LTP_vq_ptrs_Q7[ NB_LTP_CBKS ]; /* 168 */ extern const opus_uint8 * const silk_LTP_vq_gain_ptrs_Q7[NB_LTP_CBKS]; - extern const opus_int8 silk_LTP_vq_sizes[ NB_LTP_CBKS ]; /* 3 */ extern const opus_uint8 silk_LTPscale_iCDF[ 3 ]; /* 4 */ @@ -99,12 +97,6 @@ extern const opus_uint8 silk_NLSF_interpolation_factor_iCDF[ 5 ]; extern const silk_NLSF_CB_struct silk_NLSF_CB_WB; /* 1040 */ extern const silk_NLSF_CB_struct silk_NLSF_CB_NB_MB; /* 728 */ -/* Piece-wise linear mapping from bitrate in kbps to coding quality in dB SNR */ -extern const opus_int32 silk_TargetRate_table_NB[ TARGET_RATE_TAB_SZ ]; /* 32 */ -extern const opus_int32 silk_TargetRate_table_MB[ TARGET_RATE_TAB_SZ ]; /* 32 */ -extern const opus_int32 silk_TargetRate_table_WB[ TARGET_RATE_TAB_SZ ]; /* 32 */ -extern const opus_int16 silk_SNR_table_Q1[ TARGET_RATE_TAB_SZ ]; /* 32 */ - /* Quantization offsets */ extern const opus_int16 silk_Quantization_Offsets_Q10[ 2 ][ 2 ]; /* 8 */ diff --git a/thirdparty/opus/silk/tables_LTP.c b/thirdparty/opus/silk/tables_LTP.c index 0e6a0254d5..5e12c8643e 100644 --- a/thirdparty/opus/silk/tables_LTP.c +++ b/thirdparty/opus/silk/tables_LTP.c @@ -51,8 +51,6 @@ static const opus_uint8 silk_LTP_gain_iCDF_2[32] = { 24, 20, 16, 12, 9, 5, 2, 0 }; -const opus_int16 silk_LTP_gain_middle_avg_RD_Q14 = 12304; - static const opus_uint8 silk_LTP_gain_BITS_Q5_0[8] = { 15, 131, 138, 138, 155, 155, 173, 173 }; diff --git a/thirdparty/opus/silk/tables_NLSF_CB_NB_MB.c b/thirdparty/opus/silk/tables_NLSF_CB_NB_MB.c index 8c59d207aa..195d5b95bd 100644 --- a/thirdparty/opus/silk/tables_NLSF_CB_NB_MB.c +++ b/thirdparty/opus/silk/tables_NLSF_CB_NB_MB.c @@ -74,6 +74,41 @@ static const opus_uint8 silk_NLSF_CB1_NB_MB_Q8[ 320 ] = { 64, 84, 104, 118, 156, 177, 201, 230 }; +static const opus_int16 silk_NLSF_CB1_Wght_Q9[ 320 ] = { + 2897, 2314, 2314, 2314, 2287, 2287, 2314, 2300, 2327, 2287, + 2888, 2580, 2394, 2367, 2314, 2274, 2274, 2274, 2274, 2194, + 2487, 2340, 2340, 2314, 2314, 2314, 2340, 2340, 2367, 2354, + 3216, 2766, 2340, 2340, 2314, 2274, 2221, 2207, 2261, 2194, + 2460, 2474, 2367, 2394, 2394, 2394, 2394, 2367, 2407, 2314, + 3479, 3056, 2127, 2207, 2274, 2274, 2274, 2287, 2314, 2261, + 3282, 3141, 2580, 2394, 2247, 2221, 2207, 2194, 2194, 2114, + 4096, 3845, 2221, 2620, 2620, 2407, 2314, 2394, 2367, 2074, + 3178, 3244, 2367, 2221, 2553, 2434, 2340, 2314, 2167, 2221, + 3338, 3488, 2726, 2194, 2261, 2460, 2354, 2367, 2207, 2101, + 2354, 2420, 2327, 2367, 2394, 2420, 2420, 2420, 2460, 2367, + 3779, 3629, 2434, 2527, 2367, 2274, 2274, 2300, 2207, 2048, + 3254, 3225, 2713, 2846, 2447, 2327, 2300, 2300, 2274, 2127, + 3263, 3300, 2753, 2806, 2447, 2261, 2261, 2247, 2127, 2101, + 2873, 2981, 2633, 2367, 2407, 2354, 2194, 2247, 2247, 2114, + 3225, 3197, 2633, 2580, 2274, 2181, 2247, 2221, 2221, 2141, + 3178, 3310, 2740, 2407, 2274, 2274, 2274, 2287, 2194, 2114, + 3141, 3272, 2460, 2061, 2287, 2500, 2367, 2487, 2434, 2181, + 3507, 3282, 2314, 2700, 2647, 2474, 2367, 2394, 2340, 2127, + 3423, 3535, 3038, 3056, 2300, 1950, 2221, 2274, 2274, 2274, + 3404, 3366, 2087, 2687, 2873, 2354, 2420, 2274, 2474, 2540, + 3760, 3488, 1950, 2660, 2897, 2527, 2394, 2367, 2460, 2261, + 3028, 3272, 2740, 2888, 2740, 2154, 2127, 2287, 2234, 2247, + 3695, 3657, 2025, 1969, 2660, 2700, 2580, 2500, 2327, 2367, + 3207, 3413, 2354, 2074, 2888, 2888, 2340, 2487, 2247, 2167, + 3338, 3366, 2846, 2780, 2327, 2154, 2274, 2287, 2114, 2061, + 2327, 2300, 2181, 2167, 2181, 2367, 2633, 2700, 2700, 2553, + 2407, 2434, 2221, 2261, 2221, 2221, 2340, 2420, 2607, 2700, + 3038, 3244, 2806, 2888, 2474, 2074, 2300, 2314, 2354, 2380, + 2221, 2154, 2127, 2287, 2500, 2793, 2793, 2620, 2580, 2367, + 3676, 3713, 2234, 1838, 2181, 2753, 2726, 2673, 2513, 2207, + 2793, 3160, 2726, 2553, 2846, 2513, 2181, 2394, 2221, 2181 +}; + static const opus_uint8 silk_NLSF_CB1_iCDF_NB_MB[ 64 ] = { 212, 178, 148, 129, 108, 96, 85, 82, 79, 77, 61, 59, 57, 56, 51, 49, @@ -150,6 +185,7 @@ const silk_NLSF_CB_struct silk_NLSF_CB_NB_MB = SILK_FIX_CONST( 0.18, 16 ), SILK_FIX_CONST( 1.0 / 0.18, 6 ), silk_NLSF_CB1_NB_MB_Q8, + silk_NLSF_CB1_Wght_Q9, silk_NLSF_CB1_iCDF_NB_MB, silk_NLSF_PRED_NB_MB_Q8, silk_NLSF_CB2_SELECT_NB_MB, diff --git a/thirdparty/opus/silk/tables_NLSF_CB_WB.c b/thirdparty/opus/silk/tables_NLSF_CB_WB.c index 50af87eb2e..5cc9f57bff 100644 --- a/thirdparty/opus/silk/tables_NLSF_CB_WB.c +++ b/thirdparty/opus/silk/tables_NLSF_CB_WB.c @@ -98,6 +98,41 @@ static const opus_uint8 silk_NLSF_CB1_WB_Q8[ 512 ] = { 110, 119, 129, 141, 175, 198, 218, 237 }; +static const opus_int16 silk_NLSF_CB1_WB_Wght_Q9[ 512 ] = { + 3657, 2925, 2925, 2925, 2925, 2925, 2925, 2925, 2925, 2925, 2925, 2925, 2963, 2963, 2925, 2846, + 3216, 3085, 2972, 3056, 3056, 3010, 3010, 3010, 2963, 2963, 3010, 2972, 2888, 2846, 2846, 2726, + 3920, 4014, 2981, 3207, 3207, 2934, 3056, 2846, 3122, 3244, 2925, 2846, 2620, 2553, 2780, 2925, + 3516, 3197, 3010, 3103, 3019, 2888, 2925, 2925, 2925, 2925, 2888, 2888, 2888, 2888, 2888, 2753, + 5054, 5054, 2934, 3573, 3385, 3056, 3085, 2793, 3160, 3160, 2972, 2846, 2513, 2540, 2753, 2888, + 4428, 4149, 2700, 2753, 2972, 3010, 2925, 2846, 2981, 3019, 2925, 2925, 2925, 2925, 2888, 2726, + 3620, 3019, 2972, 3056, 3056, 2873, 2806, 3056, 3216, 3047, 2981, 3291, 3291, 2981, 3310, 2991, + 5227, 5014, 2540, 3338, 3526, 3385, 3197, 3094, 3376, 2981, 2700, 2647, 2687, 2793, 2846, 2673, + 5081, 5174, 4615, 4428, 2460, 2897, 3047, 3207, 3169, 2687, 2740, 2888, 2846, 2793, 2846, 2700, + 3122, 2888, 2963, 2925, 2925, 2925, 2925, 2963, 2963, 2963, 2963, 2925, 2925, 2963, 2963, 2963, + 4202, 3207, 2981, 3103, 3010, 2888, 2888, 2925, 2972, 2873, 2916, 3019, 2972, 3010, 3197, 2873, + 3760, 3760, 3244, 3103, 2981, 2888, 2925, 2888, 2972, 2934, 2793, 2793, 2846, 2888, 2888, 2660, + 3854, 4014, 3207, 3122, 3244, 2934, 3047, 2963, 2963, 3085, 2846, 2793, 2793, 2793, 2793, 2580, + 3845, 4080, 3357, 3516, 3094, 2740, 3010, 2934, 3122, 3085, 2846, 2846, 2647, 2647, 2846, 2806, + 5147, 4894, 3225, 3845, 3441, 3169, 2897, 3413, 3451, 2700, 2580, 2673, 2740, 2846, 2806, 2753, + 4109, 3789, 3291, 3160, 2925, 2888, 2888, 2925, 2793, 2740, 2793, 2740, 2793, 2846, 2888, 2806, + 5081, 5054, 3047, 3545, 3244, 3056, 3085, 2944, 3103, 2897, 2740, 2740, 2740, 2846, 2793, 2620, + 4309, 4309, 2860, 2527, 3207, 3376, 3376, 3075, 3075, 3376, 3056, 2846, 2647, 2580, 2726, 2753, + 3056, 2916, 2806, 2888, 2740, 2687, 2897, 3103, 3150, 3150, 3216, 3169, 3056, 3010, 2963, 2846, + 4375, 3882, 2925, 2888, 2846, 2888, 2846, 2846, 2888, 2888, 2888, 2846, 2888, 2925, 2888, 2846, + 2981, 2916, 2916, 2981, 2981, 3056, 3122, 3216, 3150, 3056, 3010, 2972, 2972, 2972, 2925, 2740, + 4229, 4149, 3310, 3347, 2925, 2963, 2888, 2981, 2981, 2846, 2793, 2740, 2846, 2846, 2846, 2793, + 4080, 4014, 3103, 3010, 2925, 2925, 2925, 2888, 2925, 2925, 2846, 2846, 2846, 2793, 2888, 2780, + 4615, 4575, 3169, 3441, 3207, 2981, 2897, 3038, 3122, 2740, 2687, 2687, 2687, 2740, 2793, 2700, + 4149, 4269, 3789, 3657, 2726, 2780, 2888, 2888, 3010, 2972, 2925, 2846, 2687, 2687, 2793, 2888, + 4215, 3554, 2753, 2846, 2846, 2888, 2888, 2888, 2925, 2925, 2888, 2925, 2925, 2925, 2963, 2888, + 5174, 4921, 2261, 3432, 3789, 3479, 3347, 2846, 3310, 3479, 3150, 2897, 2460, 2487, 2753, 2925, + 3451, 3685, 3122, 3197, 3357, 3047, 3207, 3207, 2981, 3216, 3085, 2925, 2925, 2687, 2540, 2434, + 2981, 3010, 2793, 2793, 2740, 2793, 2846, 2972, 3056, 3103, 3150, 3150, 3150, 3103, 3010, 3010, + 2944, 2873, 2687, 2726, 2780, 3010, 3432, 3545, 3357, 3244, 3056, 3010, 2963, 2925, 2888, 2846, + 3019, 2944, 2897, 3010, 3010, 2972, 3019, 3103, 3056, 3056, 3010, 2888, 2846, 2925, 2925, 2888, + 3920, 3967, 3010, 3197, 3357, 3216, 3291, 3291, 3479, 3704, 3441, 2726, 2181, 2460, 2580, 2607 +}; + static const opus_uint8 silk_NLSF_CB1_iCDF_WB[ 64 ] = { 225, 204, 201, 184, 183, 175, 158, 154, 153, 135, 119, 115, 113, 110, 109, 99, @@ -188,6 +223,7 @@ const silk_NLSF_CB_struct silk_NLSF_CB_WB = SILK_FIX_CONST( 0.15, 16 ), SILK_FIX_CONST( 1.0 / 0.15, 6 ), silk_NLSF_CB1_WB_Q8, + silk_NLSF_CB1_WB_Wght_Q9, silk_NLSF_CB1_iCDF_WB, silk_NLSF_PRED_WB_Q8, silk_NLSF_CB2_SELECT_WB, diff --git a/thirdparty/opus/silk/tables_other.c b/thirdparty/opus/silk/tables_other.c index 398686bf26..e34d90777b 100644 --- a/thirdparty/opus/silk/tables_other.c +++ b/thirdparty/opus/silk/tables_other.c @@ -38,20 +38,6 @@ extern "C" { #endif -/* Piece-wise linear mapping from bitrate in kbps to coding quality in dB SNR */ -const opus_int32 silk_TargetRate_table_NB[ TARGET_RATE_TAB_SZ ] = { - 0, 8000, 9400, 11500, 13500, 17500, 25000, MAX_TARGET_RATE_BPS -}; -const opus_int32 silk_TargetRate_table_MB[ TARGET_RATE_TAB_SZ ] = { - 0, 9000, 12000, 14500, 18500, 24500, 35500, MAX_TARGET_RATE_BPS -}; -const opus_int32 silk_TargetRate_table_WB[ TARGET_RATE_TAB_SZ ] = { - 0, 10500, 14000, 17000, 21500, 28500, 42000, MAX_TARGET_RATE_BPS -}; -const opus_int16 silk_SNR_table_Q1[ TARGET_RATE_TAB_SZ ] = { - 18, 29, 38, 40, 46, 52, 62, 84 -}; - /* Tables for stereo predictor coding */ const opus_int16 silk_stereo_pred_quant_Q13[ STEREO_QUANT_TAB_SIZE ] = { -13732, -10050, -8266, -7526, -6500, -5000, -2950, -820, diff --git a/thirdparty/opus/silk/tuning_parameters.h b/thirdparty/opus/silk/tuning_parameters.h index 5b8f404235..d70275fd8f 100644 --- a/thirdparty/opus/silk/tuning_parameters.h +++ b/thirdparty/opus/silk/tuning_parameters.h @@ -53,19 +53,12 @@ extern "C" /* LPC analysis regularization */ #define FIND_LPC_COND_FAC 1e-5f -/* LTP analysis defines */ -#define FIND_LTP_COND_FAC 1e-5f -#define LTP_DAMPING 0.05f -#define LTP_SMOOTHING 0.1f - -/* LTP quantization settings */ -#define MU_LTP_QUANT_NB 0.03f -#define MU_LTP_QUANT_MB 0.025f -#define MU_LTP_QUANT_WB 0.02f - /* Max cumulative LTP gain */ #define MAX_SUM_LOG_GAIN_DB 250.0f +/* LTP analysis defines */ +#define LTP_CORR_INV_MAX 0.03f + /***********************/ /* High pass filtering */ /***********************/ @@ -103,25 +96,16 @@ extern "C" #define SPARSE_SNR_INCR_dB 2.0f /* threshold for sparseness measure above which to use lower quantization offset during unvoiced */ -#define SPARSENESS_THRESHOLD_QNT_OFFSET 0.75f +#define ENERGY_VARIATION_THRESHOLD_QNT_OFFSET 0.6f /* warping control */ #define WARPING_MULTIPLIER 0.015f /* fraction added to first autocorrelation value */ -#define SHAPE_WHITE_NOISE_FRACTION 5e-5f +#define SHAPE_WHITE_NOISE_FRACTION 3e-5f /* noise shaping filter chirp factor */ -#define BANDWIDTH_EXPANSION 0.95f - -/* difference between chirp factors for analysis and synthesis noise shaping filters at low bitrates */ -#define LOW_RATE_BANDWIDTH_EXPANSION_DELTA 0.01f - -/* extra harmonic boosting (signal shaping) at low bitrates */ -#define LOW_RATE_HARMONIC_BOOST 0.1f - -/* extra harmonic boosting (signal shaping) for noisy input signals */ -#define LOW_INPUT_QUALITY_HARMONIC_BOOST 0.1f +#define BANDWIDTH_EXPANSION 0.94f /* harmonic noise shaping */ #define HARMONIC_SHAPING 0.3f diff --git a/thirdparty/opus/silk/x86/NSQ_del_dec_sse.c b/thirdparty/opus/silk/x86/NSQ_del_dec_sse4_1.c index 21d4a8bc1e..2c75ede2dd 100644 --- a/thirdparty/opus/silk/x86/NSQ_del_dec_sse.c +++ b/thirdparty/opus/silk/x86/NSQ_del_dec_sse4_1.c @@ -107,12 +107,12 @@ static OPUS_INLINE void silk_noise_shape_quantizer_del_dec_sse4_1( opus_int predictLPCOrder, /* I Prediction filter order */ opus_int warping_Q16, /* I */ opus_int nStatesDelayedDecision, /* I Number of states in decision tree */ - opus_int *smpl_buf_idx, /* I Index to newest samples in buffers */ + opus_int *smpl_buf_idx, /* I/O Index to newest samples in buffers */ opus_int decisionDelay /* I */ ); void silk_NSQ_del_dec_sse4_1( - const silk_encoder_state *psEncC, /* I/O Encoder State */ + const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ SideInfoIndices *psIndices, /* I/O Quantization Indices */ const opus_int32 x_Q3[], /* I Prefiltered input signal */ @@ -234,7 +234,8 @@ void silk_NSQ_del_dec_sse4_1( psDD = &psDelDec[ Winner_ind ]; last_smple_idx = smpl_buf_idx + decisionDelay; for( i = 0; i < decisionDelay; i++ ) { - last_smple_idx = ( last_smple_idx - 1 ) & DECISION_DELAY_MASK; + last_smple_idx = ( last_smple_idx - 1 ) % DECISION_DELAY; + if( last_smple_idx < 0 ) last_smple_idx += DECISION_DELAY; pulses[ i - decisionDelay ] = (opus_int8)silk_RSHIFT_ROUND( psDD->Q_Q10[ last_smple_idx ], 10 ); pxq[ i - decisionDelay ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( silk_SMULWW( psDD->Xq_Q14[ last_smple_idx ], Gains_Q16[ 1 ] ), 14 ) ); @@ -246,7 +247,7 @@ void silk_NSQ_del_dec_sse4_1( /* Rewhiten with new A coefs */ start_idx = psEncC->ltp_mem_length - lag - psEncC->predictLPCOrder - LTP_ORDER / 2; - silk_assert( start_idx > 0 ); + celt_assert( start_idx > 0 ); silk_LPC_analysis_filter( &sLTP[ start_idx ], &NSQ->xq[ start_idx + k * psEncC->subfr_length ], A_Q12, psEncC->ltp_mem_length - start_idx, psEncC->predictLPCOrder, psEncC->arch ); @@ -285,7 +286,8 @@ void silk_NSQ_del_dec_sse4_1( last_smple_idx = smpl_buf_idx + decisionDelay; Gain_Q10 = silk_RSHIFT32( Gains_Q16[ psEncC->nb_subfr - 1 ], 6 ); for( i = 0; i < decisionDelay; i++ ) { - last_smple_idx = ( last_smple_idx - 1 ) & DECISION_DELAY_MASK; + last_smple_idx = ( last_smple_idx - 1 ) % DECISION_DELAY; + if( last_smple_idx < 0 ) last_smple_idx += DECISION_DELAY; pulses[ i - decisionDelay ] = (opus_int8)silk_RSHIFT_ROUND( psDD->Q_Q10[ last_smple_idx ], 10 ); pxq[ i - decisionDelay ] = (opus_int16)silk_SAT16( silk_RSHIFT_ROUND( silk_SMULWW( psDD->Xq_Q14[ last_smple_idx ], Gain_Q10 ), 8 ) ); @@ -299,7 +301,6 @@ void silk_NSQ_del_dec_sse4_1( NSQ->lagPrev = pitchL[ psEncC->nb_subfr - 1 ]; /* Save quantized speech signal */ - /* DEBUG_STORE_DATA( enc.pcm, &NSQ->xq[psEncC->ltp_mem_length], psEncC->frame_length * sizeof( opus_int16 ) ) */ silk_memmove( NSQ->xq, &NSQ->xq[ psEncC->frame_length ], psEncC->ltp_mem_length * sizeof( opus_int16 ) ); silk_memmove( NSQ->sLTP_shp_Q14, &NSQ->sLTP_shp_Q14[ psEncC->frame_length ], psEncC->ltp_mem_length * sizeof( opus_int32 ) ); RESTORE_STACK; @@ -333,7 +334,7 @@ static OPUS_INLINE void silk_noise_shape_quantizer_del_dec_sse4_1( opus_int predictLPCOrder, /* I Prediction filter order */ opus_int warping_Q16, /* I */ opus_int nStatesDelayedDecision, /* I Number of states in decision tree */ - opus_int *smpl_buf_idx, /* I Index to newest samples in buffers */ + opus_int *smpl_buf_idx, /* I/O Index to newest samples in buffers */ opus_int decisionDelay /* I */ ) { @@ -352,7 +353,7 @@ static OPUS_INLINE void silk_noise_shape_quantizer_del_dec_sse4_1( __m128i b_Q12_0123, b_sr_Q12_0123; SAVE_STACK; - silk_assert( nStatesDelayedDecision > 0 ); + celt_assert( nStatesDelayedDecision > 0 ); ALLOC( psSampleState, nStatesDelayedDecision, NSQ_sample_pair ); shp_lag_ptr = &NSQ->sLTP_shp_Q14[ NSQ->sLTP_shp_buf_idx - lag + HARM_SHAPE_FIR_TAPS / 2 ]; @@ -638,8 +639,9 @@ static OPUS_INLINE void silk_noise_shape_quantizer_del_dec_sse4_1( psSS[ 1 ].xq_Q14 = xq_Q14; } } - *smpl_buf_idx = ( *smpl_buf_idx - 1 ) & DECISION_DELAY_MASK; /* Index to newest samples */ - last_smple_idx = ( *smpl_buf_idx + decisionDelay ) & DECISION_DELAY_MASK; /* Index to decisionDelay old samples */ + *smpl_buf_idx = ( *smpl_buf_idx - 1 ) % DECISION_DELAY; + if( *smpl_buf_idx < 0 ) *smpl_buf_idx += DECISION_DELAY; + last_smple_idx = ( *smpl_buf_idx + decisionDelay ) % DECISION_DELAY; /* Find winner */ RDmin_Q10 = psSampleState[ 0 ][ 0 ].RD_Q10; diff --git a/thirdparty/opus/silk/x86/NSQ_sse.c b/thirdparty/opus/silk/x86/NSQ_sse4_1.c index bb3c5f1955..b0315e35fc 100644 --- a/thirdparty/opus/silk/x86/NSQ_sse.c +++ b/thirdparty/opus/silk/x86/NSQ_sse4_1.c @@ -71,7 +71,7 @@ static OPUS_INLINE void silk_noise_shape_quantizer_10_16_sse4_1( ); void silk_NSQ_sse4_1( - const silk_encoder_state *psEncC, /* I/O Encoder State */ + const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ SideInfoIndices *psIndices, /* I/O Quantization Indices */ const opus_int32 x_Q3[], /* I Prefiltered input signal */ @@ -199,7 +199,7 @@ void silk_NSQ_sse4_1( if( ( k & ( 3 - silk_LSHIFT( LSF_interpolation_flag, 1 ) ) ) == 0 ) { /* Rewhiten with new A coefs */ start_idx = psEncC->ltp_mem_length - lag - psEncC->predictLPCOrder - LTP_ORDER / 2; - silk_assert( start_idx > 0 ); + celt_assert( start_idx > 0 ); silk_LPC_analysis_filter( &sLTP[ start_idx ], &NSQ->xq[ start_idx + k * psEncC->subfr_length ], A_Q12, psEncC->ltp_mem_length - start_idx, psEncC->predictLPCOrder, psEncC->arch ); @@ -233,7 +233,6 @@ void silk_NSQ_sse4_1( NSQ->lagPrev = pitchL[ psEncC->nb_subfr - 1 ]; /* Save quantized speech and noise shaping signals */ - /* DEBUG_STORE_DATA( enc.pcm, &NSQ->xq[ psEncC->ltp_mem_length ], psEncC->frame_length * sizeof( opus_int16 ) ) */ silk_memmove( NSQ->xq, &NSQ->xq[ psEncC->frame_length ], psEncC->ltp_mem_length * sizeof( opus_int16 ) ); silk_memmove( NSQ->sLTP_shp_Q14, &NSQ->sLTP_shp_Q14[ psEncC->frame_length ], psEncC->ltp_mem_length * sizeof( opus_int32 ) ); RESTORE_STACK; diff --git a/thirdparty/opus/silk/x86/VAD_sse.c b/thirdparty/opus/silk/x86/VAD_sse4_1.c index 4e90f4410d..d02ddf4ad0 100644 --- a/thirdparty/opus/silk/x86/VAD_sse.c +++ b/thirdparty/opus/silk/x86/VAD_sse4_1.c @@ -65,9 +65,9 @@ opus_int silk_VAD_GetSA_Q8_sse4_1( /* O Return value, 0 if s /* Safety checks */ silk_assert( VAD_N_BANDS == 4 ); - silk_assert( MAX_FRAME_LENGTH >= psEncC->frame_length ); - silk_assert( psEncC->frame_length <= 512 ); - silk_assert( psEncC->frame_length == 8 * silk_RSHIFT( psEncC->frame_length, 3 ) ); + celt_assert( MAX_FRAME_LENGTH >= psEncC->frame_length ); + celt_assert( psEncC->frame_length <= 512 ); + celt_assert( psEncC->frame_length == 8 * silk_RSHIFT( psEncC->frame_length, 3 ) ); /***********************/ /* Filter and Decimate */ diff --git a/thirdparty/opus/silk/x86/VQ_WMat_EC_sse.c b/thirdparty/opus/silk/x86/VQ_WMat_EC_sse4_1.c index 74d6c6d0ec..74d6c6d0ec 100644 --- a/thirdparty/opus/silk/x86/VQ_WMat_EC_sse.c +++ b/thirdparty/opus/silk/x86/VQ_WMat_EC_sse4_1.c diff --git a/thirdparty/opus/silk/x86/main_sse.h b/thirdparty/opus/silk/x86/main_sse.h index d8d61310ed..2f15d44869 100644 --- a/thirdparty/opus/silk/x86/main_sse.h +++ b/thirdparty/opus/silk/x86/main_sse.h @@ -34,6 +34,7 @@ # if defined(OPUS_X86_MAY_HAVE_SSE4_1) +#if 0 /* FIXME: SSE disabled until silk_VQ_WMat_EC_sse4_1() gets updated. */ # define OVERRIDE_silk_VQ_WMat_EC void silk_VQ_WMat_EC_sse4_1( @@ -79,11 +80,13 @@ extern void (*const SILK_VQ_WMAT_EC_IMPL[OPUS_ARCHMASK + 1])( mu_Q9, max_gain_Q7, L)) #endif +#endif +#if 0 /* FIXME: SSE disabled until the NSQ code gets updated. */ # define OVERRIDE_silk_NSQ void silk_NSQ_sse4_1( - const silk_encoder_state *psEncC, /* I/O Encoder State */ + const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ SideInfoIndices *psIndices, /* I/O Quantization Indices */ const opus_int32 x_Q3[], /* I Prefiltered input signal */ @@ -110,7 +113,7 @@ void silk_NSQ_sse4_1( #else extern void (*const SILK_NSQ_IMPL[OPUS_ARCHMASK + 1])( - const silk_encoder_state *psEncC, /* I/O Encoder State */ + const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ SideInfoIndices *psIndices, /* I/O Quantization Indices */ const opus_int32 x_Q3[], /* I Prefiltered input signal */ @@ -137,7 +140,7 @@ extern void (*const SILK_NSQ_IMPL[OPUS_ARCHMASK + 1])( # define OVERRIDE_silk_NSQ_del_dec void silk_NSQ_del_dec_sse4_1( - const silk_encoder_state *psEncC, /* I/O Encoder State */ + const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ SideInfoIndices *psIndices, /* I/O Quantization Indices */ const opus_int32 x_Q3[], /* I Prefiltered input signal */ @@ -164,7 +167,7 @@ void silk_NSQ_del_dec_sse4_1( #else extern void (*const SILK_NSQ_DEL_DEC_IMPL[OPUS_ARCHMASK + 1])( - const silk_encoder_state *psEncC, /* I/O Encoder State */ + const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ SideInfoIndices *psIndices, /* I/O Quantization Indices */ const opus_int32 x_Q3[], /* I Prefiltered input signal */ @@ -187,6 +190,7 @@ extern void (*const SILK_NSQ_DEL_DEC_IMPL[OPUS_ARCHMASK + 1])( HarmShapeGain_Q14, Tilt_Q14, LF_shp_Q14, Gains_Q16, pitchL, Lambda_Q10, LTP_scale_Q14)) #endif +#endif void silk_noise_shape_quantizer( silk_nsq_state *NSQ, /* I/O NSQ state */ @@ -238,39 +242,6 @@ extern opus_int (*const SILK_VAD_GETSA_Q8_IMPL[OPUS_ARCHMASK + 1])( silk_encoder_state *psEnC, const opus_int16 pIn[]); -# define OVERRIDE_silk_warped_LPC_analysis_filter_FIX - -#endif - -void silk_warped_LPC_analysis_filter_FIX_sse4_1( - opus_int32 state[], /* I/O State [order + 1] */ - opus_int32 res_Q2[], /* O Residual signal [length] */ - const opus_int16 coef_Q13[], /* I Coefficients [order] */ - const opus_int16 input[], /* I Input signal [length] */ - const opus_int16 lambda_Q16, /* I Warping factor */ - const opus_int length, /* I Length of input signal */ - const opus_int order /* I Filter order (even) */ -); - -#if defined(OPUS_X86_PRESUME_SSE4_1) -#define silk_warped_LPC_analysis_filter_FIX(state, res_Q2, coef_Q13, input, lambda_Q16, length, order, arch) \ - ((void)(arch),silk_warped_LPC_analysis_filter_FIX_c(state, res_Q2, coef_Q13, input, lambda_Q16, length, order)) - -#else - -extern void (*const SILK_WARPED_LPC_ANALYSIS_FILTER_FIX_IMPL[OPUS_ARCHMASK + 1])( - opus_int32 state[], /* I/O State [order + 1] */ - opus_int32 res_Q2[], /* O Residual signal [length] */ - const opus_int16 coef_Q13[], /* I Coefficients [order] */ - const opus_int16 input[], /* I Input signal [length] */ - const opus_int16 lambda_Q16, /* I Warping factor */ - const opus_int length, /* I Length of input signal */ - const opus_int order /* I Filter order (even) */ -); - -# define silk_warped_LPC_analysis_filter_FIX(state, res_Q2, coef_Q13, input, lambda_Q16, length, order, arch) \ - ((*SILK_WARPED_LPC_ANALYSIS_FILTER_FIX_IMPL[(arch) & OPUS_ARCHMASK])(state, res_Q2, coef_Q13, input, lambda_Q16, length, order)) - #endif # endif diff --git a/thirdparty/opus/silk/x86/x86_silk_map.c b/thirdparty/opus/silk/x86/x86_silk_map.c index 818841f2c1..32dcc3cab7 100644 --- a/thirdparty/opus/silk/x86/x86_silk_map.c +++ b/thirdparty/opus/silk/x86/x86_silk_map.c @@ -66,8 +66,9 @@ opus_int (*const SILK_VAD_GETSA_Q8_IMPL[ OPUS_ARCHMASK + 1 ] )( MAY_HAVE_SSE4_1( silk_VAD_GetSA_Q8 ) /* avx */ }; +#if 0 /* FIXME: SSE disabled until the NSQ code gets updated. */ void (*const SILK_NSQ_IMPL[ OPUS_ARCHMASK + 1 ] )( - const silk_encoder_state *psEncC, /* I/O Encoder State */ + const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ SideInfoIndices *psIndices, /* I/O Quantization Indices */ const opus_int32 x_Q3[], /* I Prefiltered input signal */ @@ -89,7 +90,9 @@ void (*const SILK_NSQ_IMPL[ OPUS_ARCHMASK + 1 ] )( MAY_HAVE_SSE4_1( silk_NSQ ), /* sse4.1 */ MAY_HAVE_SSE4_1( silk_NSQ ) /* avx */ }; +#endif +#if 0 /* FIXME: SSE disabled until silk_VQ_WMat_EC_sse4_1() gets updated. */ void (*const SILK_VQ_WMAT_EC_IMPL[ OPUS_ARCHMASK + 1 ] )( opus_int8 *ind, /* O index of best codebook vector */ opus_int32 *rate_dist_Q14, /* O best weighted quant error + mu * rate */ @@ -109,9 +112,11 @@ void (*const SILK_VQ_WMAT_EC_IMPL[ OPUS_ARCHMASK + 1 ] )( MAY_HAVE_SSE4_1( silk_VQ_WMat_EC ), /* sse4.1 */ MAY_HAVE_SSE4_1( silk_VQ_WMat_EC ) /* avx */ }; +#endif +#if 0 /* FIXME: SSE disabled until the NSQ code gets updated. */ void (*const SILK_NSQ_DEL_DEC_IMPL[ OPUS_ARCHMASK + 1 ] )( - const silk_encoder_state *psEncC, /* I/O Encoder State */ + const silk_encoder_state *psEncC, /* I Encoder State */ silk_nsq_state *NSQ, /* I/O NSQ state */ SideInfoIndices *psIndices, /* I/O Quantization Indices */ const opus_int32 x_Q3[], /* I Prefiltered input signal */ @@ -133,25 +138,10 @@ void (*const SILK_NSQ_DEL_DEC_IMPL[ OPUS_ARCHMASK + 1 ] )( MAY_HAVE_SSE4_1( silk_NSQ_del_dec ), /* sse4.1 */ MAY_HAVE_SSE4_1( silk_NSQ_del_dec ) /* avx */ }; +#endif #if defined(FIXED_POINT) -void (*const SILK_WARPED_LPC_ANALYSIS_FILTER_FIX_IMPL[ OPUS_ARCHMASK + 1 ] )( - opus_int32 state[], /* I/O State [order + 1] */ - opus_int32 res_Q2[], /* O Residual signal [length] */ - const opus_int16 coef_Q13[], /* I Coefficients [order] */ - const opus_int16 input[], /* I Input signal [length] */ - const opus_int16 lambda_Q16, /* I Warping factor */ - const opus_int length, /* I Length of input signal */ - const opus_int order /* I Filter order (even) */ -) = { - silk_warped_LPC_analysis_filter_FIX_c, /* non-sse */ - silk_warped_LPC_analysis_filter_FIX_c, - silk_warped_LPC_analysis_filter_FIX_c, - MAY_HAVE_SSE4_1( silk_warped_LPC_analysis_filter_FIX ), /* sse4.1 */ - MAY_HAVE_SSE4_1( silk_warped_LPC_analysis_filter_FIX ) /* avx */ -}; - void (*const SILK_BURG_MODIFIED_IMPL[ OPUS_ARCHMASK + 1 ] )( opus_int32 *res_nrg, /* O Residual energy */ opus_int *res_nrg_Q, /* O Residual energy Q value */ diff --git a/thirdparty/opus/stream.c b/thirdparty/opus/stream.c index 0238a6b31b..6a85197a66 100644 --- a/thirdparty/opus/stream.c +++ b/thirdparty/opus/stream.c @@ -235,8 +235,7 @@ void *op_fopen(OpusFileCallbacks *_cb,const char *_path,const char *_mode){ fp=fopen(_path,_mode); #else fp=NULL; - if(_path==NULL||_mode==NULL)errno=EINVAL; - else{ + { wchar_t *wpath; wchar_t *wmode; wpath=op_utf8_to_utf16(_path); @@ -266,8 +265,7 @@ void *op_freopen(OpusFileCallbacks *_cb,const char *_path,const char *_mode, fp=freopen(_path,_mode,(FILE *)_stream); #else fp=NULL; - if(_path==NULL||_mode==NULL)errno=EINVAL; - else{ + { wchar_t *wpath; wchar_t *wmode; wpath=op_utf8_to_utf16(_path); diff --git a/thirdparty/tinyexr/tinyexr.h b/thirdparty/tinyexr/tinyexr.h index f22163738f..bfc52b51a5 100644 --- a/thirdparty/tinyexr/tinyexr.h +++ b/thirdparty/tinyexr/tinyexr.h @@ -111,13 +111,13 @@ extern "C" { #define TINYEXR_ERROR_INVALID_ARGUMENT (-3) #define TINYEXR_ERROR_INVALID_DATA (-4) #define TINYEXR_ERROR_INVALID_FILE (-5) -#define TINYEXR_ERROR_INVALID_PARAMETER (-5) -#define TINYEXR_ERROR_CANT_OPEN_FILE (-6) -#define TINYEXR_ERROR_UNSUPPORTED_FORMAT (-7) -#define TINYEXR_ERROR_INVALID_HEADER (-8) -#define TINYEXR_ERROR_UNSUPPORTED_FEATURE (-9) -#define TINYEXR_ERROR_CANT_WRITE_FILE (-10) -#define TINYEXR_ERROR_SERIALZATION_FAILED (-11) +#define TINYEXR_ERROR_INVALID_PARAMETER (-6) +#define TINYEXR_ERROR_CANT_OPEN_FILE (-7) +#define TINYEXR_ERROR_UNSUPPORTED_FORMAT (-8) +#define TINYEXR_ERROR_INVALID_HEADER (-9) +#define TINYEXR_ERROR_UNSUPPORTED_FEATURE (-10) +#define TINYEXR_ERROR_CANT_WRITE_FILE (-11) +#define TINYEXR_ERROR_SERIALZATION_FAILED (-12) // @note { OpenEXR file format: http://www.openexr.com/openexrfilelayout.pdf } @@ -554,6 +554,10 @@ namespace miniz { #pragma clang diagnostic ignored "-Wtautological-constant-compare" #endif +#if __has_warning("-Wextra-semi-stmt") +#pragma clang diagnostic ignored "-Wextra-semi-stmt" +#endif + #endif /* miniz.c v1.15 - public domain deflate/inflate, zlib-subset, ZIP @@ -7625,6 +7629,9 @@ static bool DecompressZip(unsigned char *dst, #ifdef __clang__ #pragma clang diagnostic push #pragma clang diagnostic ignored "-Wsign-conversion" +#if __has_warning("-Wextra-semi-stmt") +#pragma clang diagnostic ignored "-Wextra-semi-stmt" +#endif #endif #ifdef _MSC_VER @@ -7886,6 +7893,10 @@ static bool DecompressRle(unsigned char *dst, #pragma clang diagnostic ignored "-Wcast-qual" #endif +#if __has_warning("-Wextra-semi-stmt") +#pragma clang diagnostic ignored "-Wextra-semi-stmt" +#endif + #endif // diff --git a/thirdparty/vhacd/src/FloatMath.inl b/thirdparty/vhacd/src/FloatMath.inl index ce529e6f71..a30deba45d 100644 --- a/thirdparty/vhacd/src/FloatMath.inl +++ b/thirdparty/vhacd/src/FloatMath.inl @@ -7,6 +7,10 @@ // a quaternion is a 'float *' to 4 floats representing a quaternion x,y,z,w // +#ifdef _MSC_VER +#pragma warning(disable:4996) +#endif + namespace FLOAT_MATH { diff --git a/thirdparty/zstd/common/bitstream.h b/thirdparty/zstd/common/bitstream.h index d955bd677b..1c294b80d1 100644 --- a/thirdparty/zstd/common/bitstream.h +++ b/thirdparty/zstd/common/bitstream.h @@ -57,6 +57,8 @@ extern "C" { =========================================*/ #if defined(__BMI__) && defined(__GNUC__) # include <immintrin.h> /* support for bextr (experimental) */ +#elif defined(__ICCARM__) +# include <intrinsics.h> #endif #define STREAM_ACCUMULATOR_MIN_32 25 @@ -162,7 +164,9 @@ MEM_STATIC unsigned BIT_highbit32 (U32 val) _BitScanReverse ( &r, val ); return (unsigned) r; # elif defined(__GNUC__) && (__GNUC__ >= 3) /* Use GCC Intrinsic */ - return 31 - __builtin_clz (val); + return __builtin_clz (val) ^ 31; +# elif defined(__ICCARM__) /* IAR Intrinsic */ + return 31 - __CLZ(val); # else /* Software version */ static const unsigned DeBruijnClz[32] = { 0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30, @@ -240,9 +244,9 @@ MEM_STATIC void BIT_flushBitsFast(BIT_CStream_t* bitC) { size_t const nbBytes = bitC->bitPos >> 3; assert(bitC->bitPos < sizeof(bitC->bitContainer) * 8); + assert(bitC->ptr <= bitC->endPtr); MEM_writeLEST(bitC->ptr, bitC->bitContainer); bitC->ptr += nbBytes; - assert(bitC->ptr <= bitC->endPtr); bitC->bitPos &= 7; bitC->bitContainer >>= nbBytes*8; } @@ -256,6 +260,7 @@ MEM_STATIC void BIT_flushBits(BIT_CStream_t* bitC) { size_t const nbBytes = bitC->bitPos >> 3; assert(bitC->bitPos < sizeof(bitC->bitContainer) * 8); + assert(bitC->ptr <= bitC->endPtr); MEM_writeLEST(bitC->ptr, bitC->bitContainer); bitC->ptr += nbBytes; if (bitC->ptr > bitC->endPtr) bitC->ptr = bitC->endPtr; diff --git a/thirdparty/zstd/common/compiler.h b/thirdparty/zstd/common/compiler.h index 87bf51ae8c..1877a0c1d9 100644 --- a/thirdparty/zstd/common/compiler.h +++ b/thirdparty/zstd/common/compiler.h @@ -23,7 +23,7 @@ # define INLINE_KEYWORD #endif -#if defined(__GNUC__) +#if defined(__GNUC__) || defined(__ICCARM__) # define FORCE_INLINE_ATTR __attribute__((always_inline)) #elif defined(_MSC_VER) # define FORCE_INLINE_ATTR __forceinline @@ -61,11 +61,18 @@ # define HINT_INLINE static INLINE_KEYWORD FORCE_INLINE_ATTR #endif +/* UNUSED_ATTR tells the compiler it is okay if the function is unused. */ +#if defined(__GNUC__) +# define UNUSED_ATTR __attribute__((unused)) +#else +# define UNUSED_ATTR +#endif + /* force no inlining */ #ifdef _MSC_VER # define FORCE_NOINLINE static __declspec(noinline) #else -# ifdef __GNUC__ +# if defined(__GNUC__) || defined(__ICCARM__) # define FORCE_NOINLINE static __attribute__((__noinline__)) # else # define FORCE_NOINLINE static @@ -76,7 +83,7 @@ #ifndef __has_attribute #define __has_attribute(x) 0 /* Compatibility with non-clang compilers. */ #endif -#if defined(__GNUC__) +#if defined(__GNUC__) || defined(__ICCARM__) # define TARGET_ATTRIBUTE(target) __attribute__((__target__(target))) #else # define TARGET_ATTRIBUTE(target) @@ -127,9 +134,14 @@ } \ } -/* vectorization */ +/* vectorization + * older GCC (pre gcc-4.3 picked as the cutoff) uses a different syntax */ #if !defined(__clang__) && defined(__GNUC__) -# define DONT_VECTORIZE __attribute__((optimize("no-tree-vectorize"))) +# if (__GNUC__ == 4 && __GNUC_MINOR__ > 3) || (__GNUC__ >= 5) +# define DONT_VECTORIZE __attribute__((optimize("no-tree-vectorize"))) +# else +# define DONT_VECTORIZE _Pragma("GCC optimize(\"no-tree-vectorize\")") +# endif #else # define DONT_VECTORIZE #endif diff --git a/thirdparty/zstd/common/fse.h b/thirdparty/zstd/common/fse.h index 811c670bdd..a7553e3721 100644 --- a/thirdparty/zstd/common/fse.h +++ b/thirdparty/zstd/common/fse.h @@ -308,7 +308,7 @@ If there is an error, the function will return an error code, which can be teste *******************************************/ /* FSE buffer bounds */ #define FSE_NCOUNTBOUND 512 -#define FSE_BLOCKBOUND(size) (size + (size>>7)) +#define FSE_BLOCKBOUND(size) (size + (size>>7) + 4 /* fse states */ + sizeof(size_t) /* bitContainer */) #define FSE_COMPRESSBOUND(size) (FSE_NCOUNTBOUND + FSE_BLOCKBOUND(size)) /* Macro version, useful for static allocation */ /* It is possible to statically allocate FSE CTable/DTable as a table of FSE_CTable/FSE_DTable using below macros */ diff --git a/thirdparty/zstd/common/fse_decompress.c b/thirdparty/zstd/common/fse_decompress.c index 72bbead5be..4f07378982 100644 --- a/thirdparty/zstd/common/fse_decompress.c +++ b/thirdparty/zstd/common/fse_decompress.c @@ -52,7 +52,9 @@ #define FSE_STATIC_ASSERT(c) DEBUG_STATIC_ASSERT(c) /* use only *after* variable declarations */ /* check and forward error code */ +#ifndef CHECK_F #define CHECK_F(f) { size_t const e = f; if (FSE_isError(e)) return e; } +#endif /* ************************************************************** diff --git a/thirdparty/zstd/common/mem.h b/thirdparty/zstd/common/mem.h index 5da248756f..530d30c8f7 100644 --- a/thirdparty/zstd/common/mem.h +++ b/thirdparty/zstd/common/mem.h @@ -47,6 +47,79 @@ extern "C" { #define MEM_STATIC_ASSERT(c) { enum { MEM_static_assert = 1/(int)(!!(c)) }; } MEM_STATIC void MEM_check(void) { MEM_STATIC_ASSERT((sizeof(size_t)==4) || (sizeof(size_t)==8)); } +/* detects whether we are being compiled under msan */ +#if defined (__has_feature) +# if __has_feature(memory_sanitizer) +# define MEMORY_SANITIZER 1 +# endif +#endif + +#if defined (MEMORY_SANITIZER) +/* Not all platforms that support msan provide sanitizers/msan_interface.h. + * We therefore declare the functions we need ourselves, rather than trying to + * include the header file... */ + +#include <stdint.h> /* intptr_t */ + +/* Make memory region fully initialized (without changing its contents). */ +void __msan_unpoison(const volatile void *a, size_t size); + +/* Make memory region fully uninitialized (without changing its contents). + This is a legacy interface that does not update origin information. Use + __msan_allocated_memory() instead. */ +void __msan_poison(const volatile void *a, size_t size); + +/* Returns the offset of the first (at least partially) poisoned byte in the + memory range, or -1 if the whole range is good. */ +intptr_t __msan_test_shadow(const volatile void *x, size_t size); +#endif + +/* detects whether we are being compiled under asan */ +#if defined (__has_feature) +# if __has_feature(address_sanitizer) +# define ADDRESS_SANITIZER 1 +# endif +#elif defined(__SANITIZE_ADDRESS__) +# define ADDRESS_SANITIZER 1 +#endif + +#if defined (ADDRESS_SANITIZER) +/* Not all platforms that support asan provide sanitizers/asan_interface.h. + * We therefore declare the functions we need ourselves, rather than trying to + * include the header file... */ + +/** + * Marks a memory region (<c>[addr, addr+size)</c>) as unaddressable. + * + * This memory must be previously allocated by your program. Instrumented + * code is forbidden from accessing addresses in this region until it is + * unpoisoned. This function is not guaranteed to poison the entire region - + * it could poison only a subregion of <c>[addr, addr+size)</c> due to ASan + * alignment restrictions. + * + * \note This function is not thread-safe because no two threads can poison or + * unpoison memory in the same memory region simultaneously. + * + * \param addr Start of memory region. + * \param size Size of memory region. */ +void __asan_poison_memory_region(void const volatile *addr, size_t size); + +/** + * Marks a memory region (<c>[addr, addr+size)</c>) as addressable. + * + * This memory must be previously allocated by your program. Accessing + * addresses in this region is allowed until this region is poisoned again. + * This function could unpoison a super-region of <c>[addr, addr+size)</c> due + * to ASan alignment restrictions. + * + * \note This function is not thread-safe because no two threads can + * poison or unpoison memory in the same memory region simultaneously. + * + * \param addr Start of memory region. + * \param size Size of memory region. */ +void __asan_unpoison_memory_region(void const volatile *addr, size_t size); +#endif + /*-************************************************************** * Basic Types @@ -102,7 +175,7 @@ MEM_STATIC void MEM_check(void) { MEM_STATIC_ASSERT((sizeof(size_t)==4) || (size #ifndef MEM_FORCE_MEMORY_ACCESS /* can be defined externally, on command line for example */ # if defined(__GNUC__) && ( defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) || defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) || defined(__ARM_ARCH_6T2__) ) # define MEM_FORCE_MEMORY_ACCESS 2 -# elif defined(__INTEL_COMPILER) || defined(__GNUC__) +# elif defined(__INTEL_COMPILER) || defined(__GNUC__) || defined(__ICCARM__) # define MEM_FORCE_MEMORY_ACCESS 1 # endif #endif diff --git a/thirdparty/zstd/common/pool.c b/thirdparty/zstd/common/pool.c index 7a82945432..f575935076 100644 --- a/thirdparty/zstd/common/pool.c +++ b/thirdparty/zstd/common/pool.c @@ -127,9 +127,13 @@ POOL_ctx* POOL_create_advanced(size_t numThreads, size_t queueSize, ctx->queueTail = 0; ctx->numThreadsBusy = 0; ctx->queueEmpty = 1; - (void)ZSTD_pthread_mutex_init(&ctx->queueMutex, NULL); - (void)ZSTD_pthread_cond_init(&ctx->queuePushCond, NULL); - (void)ZSTD_pthread_cond_init(&ctx->queuePopCond, NULL); + { + int error = 0; + error |= ZSTD_pthread_mutex_init(&ctx->queueMutex, NULL); + error |= ZSTD_pthread_cond_init(&ctx->queuePushCond, NULL); + error |= ZSTD_pthread_cond_init(&ctx->queuePopCond, NULL); + if (error) { POOL_free(ctx); return NULL; } + } ctx->shutdown = 0; /* Allocate space for the thread handles */ ctx->threads = (ZSTD_pthread_t*)ZSTD_malloc(numThreads * sizeof(ZSTD_pthread_t), customMem); diff --git a/thirdparty/zstd/common/threading.c b/thirdparty/zstd/common/threading.c index f3d4fa8418..482664bd9a 100644 --- a/thirdparty/zstd/common/threading.c +++ b/thirdparty/zstd/common/threading.c @@ -14,6 +14,8 @@ * This file will hold wrapper for systems, which do not support pthreads */ +#include "threading.h" + /* create fake symbol to avoid empty translation unit warning */ int g_ZSTD_threading_useless_symbol; @@ -28,7 +30,6 @@ int g_ZSTD_threading_useless_symbol; /* === Dependencies === */ #include <process.h> #include <errno.h> -#include "threading.h" /* === Implementation === */ @@ -73,3 +74,47 @@ int ZSTD_pthread_join(ZSTD_pthread_t thread, void **value_ptr) } #endif /* ZSTD_MULTITHREAD */ + +#if defined(ZSTD_MULTITHREAD) && DEBUGLEVEL >= 1 && !defined(_WIN32) + +#include <stdlib.h> + +int ZSTD_pthread_mutex_init(ZSTD_pthread_mutex_t* mutex, pthread_mutexattr_t const* attr) +{ + *mutex = (pthread_mutex_t*)malloc(sizeof(pthread_mutex_t)); + if (!*mutex) + return 1; + return pthread_mutex_init(*mutex, attr); +} + +int ZSTD_pthread_mutex_destroy(ZSTD_pthread_mutex_t* mutex) +{ + if (!*mutex) + return 0; + { + int const ret = pthread_mutex_destroy(*mutex); + free(*mutex); + return ret; + } +} + +int ZSTD_pthread_cond_init(ZSTD_pthread_cond_t* cond, pthread_condattr_t const* attr) +{ + *cond = (pthread_cond_t*)malloc(sizeof(pthread_cond_t)); + if (!*cond) + return 1; + return pthread_cond_init(*cond, attr); +} + +int ZSTD_pthread_cond_destroy(ZSTD_pthread_cond_t* cond) +{ + if (!*cond) + return 0; + { + int const ret = pthread_cond_destroy(*cond); + free(*cond); + return ret; + } +} + +#endif diff --git a/thirdparty/zstd/common/threading.h b/thirdparty/zstd/common/threading.h index d806c89d01..3193ca7db8 100644 --- a/thirdparty/zstd/common/threading.h +++ b/thirdparty/zstd/common/threading.h @@ -13,6 +13,8 @@ #ifndef THREADING_H_938743 #define THREADING_H_938743 +#include "debug.h" + #if defined (__cplusplus) extern "C" { #endif @@ -75,10 +77,12 @@ int ZSTD_pthread_join(ZSTD_pthread_t thread, void** value_ptr); */ -#elif defined(ZSTD_MULTITHREAD) /* posix assumed ; need a better detection method */ +#elif defined(ZSTD_MULTITHREAD) /* posix assumed ; need a better detection method */ /* === POSIX Systems === */ # include <pthread.h> +#if DEBUGLEVEL < 1 + #define ZSTD_pthread_mutex_t pthread_mutex_t #define ZSTD_pthread_mutex_init(a, b) pthread_mutex_init((a), (b)) #define ZSTD_pthread_mutex_destroy(a) pthread_mutex_destroy((a)) @@ -96,6 +100,33 @@ int ZSTD_pthread_join(ZSTD_pthread_t thread, void** value_ptr); #define ZSTD_pthread_create(a, b, c, d) pthread_create((a), (b), (c), (d)) #define ZSTD_pthread_join(a, b) pthread_join((a),(b)) +#else /* DEBUGLEVEL >= 1 */ + +/* Debug implementation of threading. + * In this implementation we use pointers for mutexes and condition variables. + * This way, if we forget to init/destroy them the program will crash or ASAN + * will report leaks. + */ + +#define ZSTD_pthread_mutex_t pthread_mutex_t* +int ZSTD_pthread_mutex_init(ZSTD_pthread_mutex_t* mutex, pthread_mutexattr_t const* attr); +int ZSTD_pthread_mutex_destroy(ZSTD_pthread_mutex_t* mutex); +#define ZSTD_pthread_mutex_lock(a) pthread_mutex_lock(*(a)) +#define ZSTD_pthread_mutex_unlock(a) pthread_mutex_unlock(*(a)) + +#define ZSTD_pthread_cond_t pthread_cond_t* +int ZSTD_pthread_cond_init(ZSTD_pthread_cond_t* cond, pthread_condattr_t const* attr); +int ZSTD_pthread_cond_destroy(ZSTD_pthread_cond_t* cond); +#define ZSTD_pthread_cond_wait(a, b) pthread_cond_wait(*(a), *(b)) +#define ZSTD_pthread_cond_signal(a) pthread_cond_signal(*(a)) +#define ZSTD_pthread_cond_broadcast(a) pthread_cond_broadcast(*(a)) + +#define ZSTD_pthread_t pthread_t +#define ZSTD_pthread_create(a, b, c, d) pthread_create((a), (b), (c), (d)) +#define ZSTD_pthread_join(a, b) pthread_join((a),(b)) + +#endif + #else /* ZSTD_MULTITHREAD not defined */ /* No multithreading support */ diff --git a/thirdparty/zstd/common/xxhash.c b/thirdparty/zstd/common/xxhash.c index 30599aaae4..99d2459621 100644 --- a/thirdparty/zstd/common/xxhash.c +++ b/thirdparty/zstd/common/xxhash.c @@ -53,7 +53,8 @@ # if defined(__GNUC__) && ( defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) || defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) || defined(__ARM_ARCH_6T2__) ) # define XXH_FORCE_MEMORY_ACCESS 2 # elif (defined(__INTEL_COMPILER) && !defined(WIN32)) || \ - (defined(__GNUC__) && ( defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7S__) )) + (defined(__GNUC__) && ( defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7S__) )) || \ + defined(__ICCARM__) # define XXH_FORCE_MEMORY_ACCESS 1 # endif #endif @@ -120,7 +121,7 @@ static void* XXH_memcpy(void* dest, const void* src, size_t size) { return memcp # define INLINE_KEYWORD #endif -#if defined(__GNUC__) +#if defined(__GNUC__) || defined(__ICCARM__) # define FORCE_INLINE_ATTR __attribute__((always_inline)) #elif defined(_MSC_VER) # define FORCE_INLINE_ATTR __forceinline @@ -206,7 +207,12 @@ static U64 XXH_read64(const void* memPtr) # define XXH_rotl32(x,r) _rotl(x,r) # define XXH_rotl64(x,r) _rotl64(x,r) #else +#if defined(__ICCARM__) +# include <intrinsics.h> +# define XXH_rotl32(x,r) __ROR(x,(32 - r)) +#else # define XXH_rotl32(x,r) ((x << r) | (x >> (32 - r))) +#endif # define XXH_rotl64(x,r) ((x << r) | (x >> (64 - r))) #endif diff --git a/thirdparty/zstd/common/zstd_internal.h b/thirdparty/zstd/common/zstd_internal.h index 81b16eac2e..dcdcbdb81c 100644 --- a/thirdparty/zstd/common/zstd_internal.h +++ b/thirdparty/zstd/common/zstd_internal.h @@ -56,9 +56,9 @@ extern "C" { /** * Return the specified error if the condition evaluates to true. * - * In debug modes, prints additional information. In order to do that - * (particularly, printing the conditional that failed), this can't just wrap - * RETURN_ERROR(). + * In debug modes, prints additional information. + * In order to do that (particularly, printing the conditional that failed), + * this can't just wrap RETURN_ERROR(). */ #define RETURN_ERROR_IF(cond, err, ...) \ if (cond) { \ @@ -197,79 +197,56 @@ static void ZSTD_copy8(void* dst, const void* src) { memcpy(dst, src, 8); } static void ZSTD_copy16(void* dst, const void* src) { memcpy(dst, src, 16); } #define COPY16(d,s) { ZSTD_copy16(d,s); d+=16; s+=16; } -#define WILDCOPY_OVERLENGTH 8 -#define VECLEN 16 +#define WILDCOPY_OVERLENGTH 32 +#define WILDCOPY_VECLEN 16 typedef enum { ZSTD_no_overlap, - ZSTD_overlap_src_before_dst, + ZSTD_overlap_src_before_dst /* ZSTD_overlap_dst_before_src, */ } ZSTD_overlap_e; /*! ZSTD_wildcopy() : - * custom version of memcpy(), can overwrite up to WILDCOPY_OVERLENGTH bytes (if length==0) */ + * Custom version of memcpy(), can over read/write up to WILDCOPY_OVERLENGTH bytes (if length==0) + * @param ovtype controls the overlap detection + * - ZSTD_no_overlap: The source and destination are guaranteed to be at least WILDCOPY_VECLEN bytes apart. + * - ZSTD_overlap_src_before_dst: The src and dst may overlap, but they MUST be at least 8 bytes apart. + * The src buffer must be before the dst buffer. + */ MEM_STATIC FORCE_INLINE_ATTR DONT_VECTORIZE -void ZSTD_wildcopy(void* dst, const void* src, ptrdiff_t length, ZSTD_overlap_e ovtype) +void ZSTD_wildcopy(void* dst, const void* src, ptrdiff_t length, ZSTD_overlap_e const ovtype) { ptrdiff_t diff = (BYTE*)dst - (const BYTE*)src; const BYTE* ip = (const BYTE*)src; BYTE* op = (BYTE*)dst; BYTE* const oend = op + length; - assert(diff >= 8 || (ovtype == ZSTD_no_overlap && diff < -8)); - if (length < VECLEN || (ovtype == ZSTD_overlap_src_before_dst && diff < VECLEN)) { - do - COPY8(op, ip) - while (op < oend); - } - else { - if ((length & 8) == 0) - COPY8(op, ip); - do { + assert(diff >= 8 || (ovtype == ZSTD_no_overlap && diff <= -WILDCOPY_VECLEN)); + + if (ovtype == ZSTD_overlap_src_before_dst && diff < WILDCOPY_VECLEN) { + /* Handle short offset copies. */ + do { + COPY8(op, ip) + } while (op < oend); + } else { + assert(diff >= WILDCOPY_VECLEN || diff <= -WILDCOPY_VECLEN); + /* Separate out the first two COPY16() calls because the copy length is + * almost certain to be short, so the branches have different + * probabilities. + * On gcc-9 unrolling once is +1.6%, twice is +2%, thrice is +1.8%. + * On clang-8 unrolling once is +1.4%, twice is +3.3%, thrice is +3%. + */ COPY16(op, ip); - } - while (op < oend); - } -} - -/*! ZSTD_wildcopy_16min() : - * same semantics as ZSTD_wilcopy() except guaranteed to be able to copy 16 bytes at the start */ -MEM_STATIC FORCE_INLINE_ATTR DONT_VECTORIZE -void ZSTD_wildcopy_16min(void* dst, const void* src, ptrdiff_t length, ZSTD_overlap_e ovtype) -{ - ptrdiff_t diff = (BYTE*)dst - (const BYTE*)src; - const BYTE* ip = (const BYTE*)src; - BYTE* op = (BYTE*)dst; - BYTE* const oend = op + length; - - assert(length >= 8); - assert(diff >= 8 || (ovtype == ZSTD_no_overlap && diff < -8)); - - if (ovtype == ZSTD_overlap_src_before_dst && diff < VECLEN) { - do - COPY8(op, ip) - while (op < oend); - } - else { - if ((length & 8) == 0) - COPY8(op, ip); - do { COPY16(op, ip); - } - while (op < oend); + if (op >= oend) return; + do { + COPY16(op, ip); + COPY16(op, ip); + } + while (op < oend); } } -MEM_STATIC void ZSTD_wildcopy_e(void* dst, const void* src, void* dstEnd) /* should be faster for decoding, but strangely, not verified on all platform */ -{ - const BYTE* ip = (const BYTE*)src; - BYTE* op = (BYTE*)dst; - BYTE* const oend = (BYTE*)dstEnd; - do - COPY8(op, ip) - while (op < oend); -} - /*-******************************************* * Private declarations @@ -323,7 +300,9 @@ MEM_STATIC U32 ZSTD_highbit32(U32 val) /* compress, dictBuilder, decodeCorpus _BitScanReverse(&r, val); return (unsigned)r; # elif defined(__GNUC__) && (__GNUC__ >= 3) /* GCC Intrinsic */ - return 31 - __builtin_clz(val); + return __builtin_clz (val) ^ 31; +# elif defined(__ICCARM__) /* IAR Intrinsic */ + return 31 - __CLZ(val); # else /* Software version */ static const U32 DeBruijnClz[32] = { 0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30, 8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31 }; U32 v = val; diff --git a/thirdparty/zstd/compress/zstd_compress.c b/thirdparty/zstd/compress/zstd_compress.c index 1476512580..35346b92cb 100644 --- a/thirdparty/zstd/compress/zstd_compress.c +++ b/thirdparty/zstd/compress/zstd_compress.c @@ -21,6 +21,8 @@ #define HUF_STATIC_LINKING_ONLY #include "huf.h" #include "zstd_compress_internal.h" +#include "zstd_compress_sequences.h" +#include "zstd_compress_literals.h" #include "zstd_fast.h" #include "zstd_double_fast.h" #include "zstd_lazy.h" @@ -40,15 +42,15 @@ size_t ZSTD_compressBound(size_t srcSize) { * Context memory management ***************************************/ struct ZSTD_CDict_s { - void* dictBuffer; const void* dictContent; size_t dictContentSize; - void* workspace; - size_t workspaceSize; + U32* entropyWorkspace; /* entropy workspace of HUF_WORKSPACE_SIZE bytes */ + ZSTD_cwksp workspace; ZSTD_matchState_t matchState; ZSTD_compressedBlockState_t cBlockState; ZSTD_customMem customMem; U32 dictID; + int compressionLevel; /* 0 indicates that advanced API was used to select CDict params */ }; /* typedef'd to ZSTD_CDict within "zstd.h" */ ZSTD_CCtx* ZSTD_createCCtx(void) @@ -82,23 +84,26 @@ ZSTD_CCtx* ZSTD_createCCtx_advanced(ZSTD_customMem customMem) ZSTD_CCtx* ZSTD_initStaticCCtx(void *workspace, size_t workspaceSize) { - ZSTD_CCtx* const cctx = (ZSTD_CCtx*) workspace; + ZSTD_cwksp ws; + ZSTD_CCtx* cctx; if (workspaceSize <= sizeof(ZSTD_CCtx)) return NULL; /* minimum size */ if ((size_t)workspace & 7) return NULL; /* must be 8-aligned */ - memset(workspace, 0, workspaceSize); /* may be a bit generous, could memset be smaller ? */ + ZSTD_cwksp_init(&ws, workspace, workspaceSize); + + cctx = (ZSTD_CCtx*)ZSTD_cwksp_reserve_object(&ws, sizeof(ZSTD_CCtx)); + if (cctx == NULL) { + return NULL; + } + memset(cctx, 0, sizeof(ZSTD_CCtx)); + ZSTD_cwksp_move(&cctx->workspace, &ws); cctx->staticSize = workspaceSize; - cctx->workSpace = (void*)(cctx+1); - cctx->workSpaceSize = workspaceSize - sizeof(ZSTD_CCtx); /* statically sized space. entropyWorkspace never moves (but prev/next block swap places) */ - if (cctx->workSpaceSize < HUF_WORKSPACE_SIZE + 2 * sizeof(ZSTD_compressedBlockState_t)) return NULL; - assert(((size_t)cctx->workSpace & (sizeof(void*)-1)) == 0); /* ensure correct alignment */ - cctx->blockState.prevCBlock = (ZSTD_compressedBlockState_t*)cctx->workSpace; - cctx->blockState.nextCBlock = cctx->blockState.prevCBlock + 1; - { - void* const ptr = cctx->blockState.nextCBlock + 1; - cctx->entropyWorkspace = (U32*)ptr; - } + if (!ZSTD_cwksp_check_available(&cctx->workspace, HUF_WORKSPACE_SIZE + 2 * sizeof(ZSTD_compressedBlockState_t))) return NULL; + cctx->blockState.prevCBlock = (ZSTD_compressedBlockState_t*)ZSTD_cwksp_reserve_object(&cctx->workspace, sizeof(ZSTD_compressedBlockState_t)); + cctx->blockState.nextCBlock = (ZSTD_compressedBlockState_t*)ZSTD_cwksp_reserve_object(&cctx->workspace, sizeof(ZSTD_compressedBlockState_t)); + cctx->entropyWorkspace = (U32*)ZSTD_cwksp_reserve_object( + &cctx->workspace, HUF_WORKSPACE_SIZE); cctx->bmi2 = ZSTD_cpuid_bmi2(ZSTD_cpuid()); return cctx; } @@ -126,11 +131,11 @@ static void ZSTD_freeCCtxContent(ZSTD_CCtx* cctx) { assert(cctx != NULL); assert(cctx->staticSize == 0); - ZSTD_free(cctx->workSpace, cctx->customMem); cctx->workSpace = NULL; ZSTD_clearAllDicts(cctx); #ifdef ZSTD_MULTITHREAD ZSTDMT_freeCCtx(cctx->mtctx); cctx->mtctx = NULL; #endif + ZSTD_cwksp_free(&cctx->workspace, cctx->customMem); } size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx) @@ -138,8 +143,13 @@ size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx) if (cctx==NULL) return 0; /* support free on NULL */ RETURN_ERROR_IF(cctx->staticSize, memory_allocation, "not compatible with static CCtx"); - ZSTD_freeCCtxContent(cctx); - ZSTD_free(cctx, cctx->customMem); + { + int cctxInWorkspace = ZSTD_cwksp_owns_buffer(&cctx->workspace, cctx); + ZSTD_freeCCtxContent(cctx); + if (!cctxInWorkspace) { + ZSTD_free(cctx, cctx->customMem); + } + } return 0; } @@ -158,7 +168,9 @@ static size_t ZSTD_sizeof_mtctx(const ZSTD_CCtx* cctx) size_t ZSTD_sizeof_CCtx(const ZSTD_CCtx* cctx) { if (cctx==NULL) return 0; /* support sizeof on NULL */ - return sizeof(*cctx) + cctx->workSpaceSize + /* cctx may be in the workspace */ + return (cctx->workspace.workspace == cctx ? 0 : sizeof(*cctx)) + + ZSTD_cwksp_sizeof(&cctx->workspace) + ZSTD_sizeof_localDict(cctx->localDict) + ZSTD_sizeof_mtctx(cctx); } @@ -227,23 +239,23 @@ size_t ZSTD_CCtxParams_init_advanced(ZSTD_CCtx_params* cctxParams, ZSTD_paramete RETURN_ERROR_IF(!cctxParams, GENERIC); FORWARD_IF_ERROR( ZSTD_checkCParams(params.cParams) ); memset(cctxParams, 0, sizeof(*cctxParams)); + assert(!ZSTD_checkCParams(params.cParams)); cctxParams->cParams = params.cParams; cctxParams->fParams = params.fParams; cctxParams->compressionLevel = ZSTD_CLEVEL_DEFAULT; /* should not matter, as all cParams are presumed properly defined */ - assert(!ZSTD_checkCParams(params.cParams)); return 0; } /* ZSTD_assignParamsToCCtxParams() : * params is presumed valid at this stage */ static ZSTD_CCtx_params ZSTD_assignParamsToCCtxParams( - ZSTD_CCtx_params cctxParams, ZSTD_parameters params) + const ZSTD_CCtx_params* cctxParams, ZSTD_parameters params) { - ZSTD_CCtx_params ret = cctxParams; + ZSTD_CCtx_params ret = *cctxParams; + assert(!ZSTD_checkCParams(params.cParams)); ret.cParams = params.cParams; ret.fParams = params.fParams; ret.compressionLevel = ZSTD_CLEVEL_DEFAULT; /* should not matter, as all cParams are presumed properly defined */ - assert(!ZSTD_checkCParams(params.cParams)); return ret; } @@ -376,7 +388,7 @@ ZSTD_bounds ZSTD_cParam_getBounds(ZSTD_cParameter param) case ZSTD_c_forceAttachDict: ZSTD_STATIC_ASSERT(ZSTD_dictDefaultAttach < ZSTD_dictForceCopy); bounds.lowerBound = ZSTD_dictDefaultAttach; - bounds.upperBound = ZSTD_dictForceCopy; /* note : how to ensure at compile time that this is the highest value enum ? */ + bounds.upperBound = ZSTD_dictForceLoad; /* note : how to ensure at compile time that this is the highest value enum ? */ return bounds; case ZSTD_c_literalCompressionMode: @@ -390,6 +402,11 @@ ZSTD_bounds ZSTD_cParam_getBounds(ZSTD_cParameter param) bounds.upperBound = ZSTD_TARGETCBLOCKSIZE_MAX; return bounds; + case ZSTD_c_srcSizeHint: + bounds.lowerBound = ZSTD_SRCSIZEHINT_MIN; + bounds.upperBound = ZSTD_SRCSIZEHINT_MAX; + return bounds; + default: { ZSTD_bounds const boundError = { ERROR(parameter_unsupported), 0, 0 }; return boundError; @@ -397,18 +414,6 @@ ZSTD_bounds ZSTD_cParam_getBounds(ZSTD_cParameter param) } } -/* ZSTD_cParam_withinBounds: - * @return 1 if value is within cParam bounds, - * 0 otherwise */ -static int ZSTD_cParam_withinBounds(ZSTD_cParameter cParam, int value) -{ - ZSTD_bounds const bounds = ZSTD_cParam_getBounds(cParam); - if (ZSTD_isError(bounds.error)) return 0; - if (value < bounds.lowerBound) return 0; - if (value > bounds.upperBound) return 0; - return 1; -} - /* ZSTD_cParam_clampBounds: * Clamps the value into the bounded range. */ @@ -458,6 +463,7 @@ static int ZSTD_isUpdateAuthorized(ZSTD_cParameter param) case ZSTD_c_forceAttachDict: case ZSTD_c_literalCompressionMode: case ZSTD_c_targetCBlockSize: + case ZSTD_c_srcSizeHint: default: return 0; } @@ -504,6 +510,7 @@ size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, int value) case ZSTD_c_ldmMinMatch: case ZSTD_c_ldmBucketSizeLog: case ZSTD_c_targetCBlockSize: + case ZSTD_c_srcSizeHint: break; default: RETURN_ERROR(parameter_unsupported); @@ -527,33 +534,33 @@ size_t ZSTD_CCtxParams_setParameter(ZSTD_CCtx_params* CCtxParams, if (value) { /* 0 : does not change current level */ CCtxParams->compressionLevel = value; } - if (CCtxParams->compressionLevel >= 0) return CCtxParams->compressionLevel; + if (CCtxParams->compressionLevel >= 0) return (size_t)CCtxParams->compressionLevel; return 0; /* return type (size_t) cannot represent negative values */ } case ZSTD_c_windowLog : if (value!=0) /* 0 => use default */ BOUNDCHECK(ZSTD_c_windowLog, value); - CCtxParams->cParams.windowLog = value; + CCtxParams->cParams.windowLog = (U32)value; return CCtxParams->cParams.windowLog; case ZSTD_c_hashLog : if (value!=0) /* 0 => use default */ BOUNDCHECK(ZSTD_c_hashLog, value); - CCtxParams->cParams.hashLog = value; + CCtxParams->cParams.hashLog = (U32)value; return CCtxParams->cParams.hashLog; case ZSTD_c_chainLog : if (value!=0) /* 0 => use default */ BOUNDCHECK(ZSTD_c_chainLog, value); - CCtxParams->cParams.chainLog = value; + CCtxParams->cParams.chainLog = (U32)value; return CCtxParams->cParams.chainLog; case ZSTD_c_searchLog : if (value!=0) /* 0 => use default */ BOUNDCHECK(ZSTD_c_searchLog, value); - CCtxParams->cParams.searchLog = value; - return value; + CCtxParams->cParams.searchLog = (U32)value; + return (size_t)value; case ZSTD_c_minMatch : if (value!=0) /* 0 => use default */ @@ -684,6 +691,12 @@ size_t ZSTD_CCtxParams_setParameter(ZSTD_CCtx_params* CCtxParams, CCtxParams->targetCBlockSize = value; return CCtxParams->targetCBlockSize; + case ZSTD_c_srcSizeHint : + if (value!=0) /* 0 ==> default */ + BOUNDCHECK(ZSTD_c_srcSizeHint, value); + CCtxParams->srcSizeHint = value; + return CCtxParams->srcSizeHint; + default: RETURN_ERROR(parameter_unsupported, "unknown parameter"); } } @@ -789,6 +802,9 @@ size_t ZSTD_CCtxParams_getParameter( case ZSTD_c_targetCBlockSize : *value = (int)CCtxParams->targetCBlockSize; break; + case ZSTD_c_srcSizeHint : + *value = (int)CCtxParams->srcSizeHint; + break; default: RETURN_ERROR(parameter_unsupported, "unknown parameter"); } return 0; @@ -1039,7 +1055,11 @@ ZSTD_adjustCParams(ZSTD_compressionParameters cPar, ZSTD_compressionParameters ZSTD_getCParamsFromCCtxParams( const ZSTD_CCtx_params* CCtxParams, U64 srcSizeHint, size_t dictSize) { - ZSTD_compressionParameters cParams = ZSTD_getCParams(CCtxParams->compressionLevel, srcSizeHint, dictSize); + ZSTD_compressionParameters cParams; + if (srcSizeHint == ZSTD_CONTENTSIZE_UNKNOWN && CCtxParams->srcSizeHint > 0) { + srcSizeHint = CCtxParams->srcSizeHint; + } + cParams = ZSTD_getCParams(CCtxParams->compressionLevel, srcSizeHint, dictSize); if (CCtxParams->ldmParams.enableLdm) cParams.windowLog = ZSTD_LDM_DEFAULT_WINDOW_LOG; if (CCtxParams->cParams.windowLog) cParams.windowLog = CCtxParams->cParams.windowLog; if (CCtxParams->cParams.hashLog) cParams.hashLog = CCtxParams->cParams.hashLog; @@ -1059,10 +1079,19 @@ ZSTD_sizeof_matchState(const ZSTD_compressionParameters* const cParams, size_t const chainSize = (cParams->strategy == ZSTD_fast) ? 0 : ((size_t)1 << cParams->chainLog); size_t const hSize = ((size_t)1) << cParams->hashLog; U32 const hashLog3 = (forCCtx && cParams->minMatch==3) ? MIN(ZSTD_HASHLOG3_MAX, cParams->windowLog) : 0; - size_t const h3Size = ((size_t)1) << hashLog3; - size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32); - size_t const optPotentialSpace = ((MaxML+1) + (MaxLL+1) + (MaxOff+1) + (1<<Litbits)) * sizeof(U32) - + (ZSTD_OPT_NUM+1) * (sizeof(ZSTD_match_t)+sizeof(ZSTD_optimal_t)); + size_t const h3Size = hashLog3 ? ((size_t)1) << hashLog3 : 0; + /* We don't use ZSTD_cwksp_alloc_size() here because the tables aren't + * surrounded by redzones in ASAN. */ + size_t const tableSpace = chainSize * sizeof(U32) + + hSize * sizeof(U32) + + h3Size * sizeof(U32); + size_t const optPotentialSpace = + ZSTD_cwksp_alloc_size((MaxML+1) * sizeof(U32)) + + ZSTD_cwksp_alloc_size((MaxLL+1) * sizeof(U32)) + + ZSTD_cwksp_alloc_size((MaxOff+1) * sizeof(U32)) + + ZSTD_cwksp_alloc_size((1<<Litbits) * sizeof(U32)) + + ZSTD_cwksp_alloc_size((ZSTD_OPT_NUM+1) * sizeof(ZSTD_match_t)) + + ZSTD_cwksp_alloc_size((ZSTD_OPT_NUM+1) * sizeof(ZSTD_optimal_t)); size_t const optSpace = (forCCtx && (cParams->strategy >= ZSTD_btopt)) ? optPotentialSpace : 0; @@ -1079,20 +1108,23 @@ size_t ZSTD_estimateCCtxSize_usingCCtxParams(const ZSTD_CCtx_params* params) size_t const blockSize = MIN(ZSTD_BLOCKSIZE_MAX, (size_t)1 << cParams.windowLog); U32 const divider = (cParams.minMatch==3) ? 3 : 4; size_t const maxNbSeq = blockSize / divider; - size_t const tokenSpace = WILDCOPY_OVERLENGTH + blockSize + 11*maxNbSeq; - size_t const entropySpace = HUF_WORKSPACE_SIZE; - size_t const blockStateSpace = 2 * sizeof(ZSTD_compressedBlockState_t); + size_t const tokenSpace = ZSTD_cwksp_alloc_size(WILDCOPY_OVERLENGTH + blockSize) + + ZSTD_cwksp_alloc_size(maxNbSeq * sizeof(seqDef)) + + 3 * ZSTD_cwksp_alloc_size(maxNbSeq * sizeof(BYTE)); + size_t const entropySpace = ZSTD_cwksp_alloc_size(HUF_WORKSPACE_SIZE); + size_t const blockStateSpace = 2 * ZSTD_cwksp_alloc_size(sizeof(ZSTD_compressedBlockState_t)); size_t const matchStateSize = ZSTD_sizeof_matchState(&cParams, /* forCCtx */ 1); size_t const ldmSpace = ZSTD_ldm_getTableSize(params->ldmParams); - size_t const ldmSeqSpace = ZSTD_ldm_getMaxNbSeq(params->ldmParams, blockSize) * sizeof(rawSeq); + size_t const ldmSeqSpace = ZSTD_cwksp_alloc_size(ZSTD_ldm_getMaxNbSeq(params->ldmParams, blockSize) * sizeof(rawSeq)); size_t const neededSpace = entropySpace + blockStateSpace + tokenSpace + matchStateSize + ldmSpace + ldmSeqSpace; + size_t const cctxSpace = ZSTD_cwksp_alloc_size(sizeof(ZSTD_CCtx)); - DEBUGLOG(5, "sizeof(ZSTD_CCtx) : %u", (U32)sizeof(ZSTD_CCtx)); - DEBUGLOG(5, "estimate workSpace : %u", (U32)neededSpace); - return sizeof(ZSTD_CCtx) + neededSpace; + DEBUGLOG(5, "sizeof(ZSTD_CCtx) : %u", (U32)cctxSpace); + DEBUGLOG(5, "estimate workspace : %u", (U32)neededSpace); + return cctxSpace + neededSpace; } } @@ -1128,7 +1160,8 @@ size_t ZSTD_estimateCStreamSize_usingCCtxParams(const ZSTD_CCtx_params* params) size_t const blockSize = MIN(ZSTD_BLOCKSIZE_MAX, (size_t)1 << cParams.windowLog); size_t const inBuffSize = ((size_t)1 << cParams.windowLog) + blockSize; size_t const outBuffSize = ZSTD_compressBound(blockSize) + 1; - size_t const streamingSize = inBuffSize + outBuffSize; + size_t const streamingSize = ZSTD_cwksp_alloc_size(inBuffSize) + + ZSTD_cwksp_alloc_size(outBuffSize); return CCtxSize + streamingSize; } @@ -1196,17 +1229,6 @@ size_t ZSTD_toFlushNow(ZSTD_CCtx* cctx) return 0; /* over-simplification; could also check if context is currently running in streaming mode, and in which case, report how many bytes are left to be flushed within output buffer */ } - - -static U32 ZSTD_equivalentCParams(ZSTD_compressionParameters cParams1, - ZSTD_compressionParameters cParams2) -{ - return (cParams1.hashLog == cParams2.hashLog) - & (cParams1.chainLog == cParams2.chainLog) - & (cParams1.strategy == cParams2.strategy) /* opt parser space */ - & ((cParams1.minMatch==3) == (cParams2.minMatch==3)); /* hashlog3 space */ -} - static void ZSTD_assertEqualCParams(ZSTD_compressionParameters cParams1, ZSTD_compressionParameters cParams2) { @@ -1221,71 +1243,6 @@ static void ZSTD_assertEqualCParams(ZSTD_compressionParameters cParams1, assert(cParams1.strategy == cParams2.strategy); } -/** The parameters are equivalent if ldm is not enabled in both sets or - * all the parameters are equivalent. */ -static U32 ZSTD_equivalentLdmParams(ldmParams_t ldmParams1, - ldmParams_t ldmParams2) -{ - return (!ldmParams1.enableLdm && !ldmParams2.enableLdm) || - (ldmParams1.enableLdm == ldmParams2.enableLdm && - ldmParams1.hashLog == ldmParams2.hashLog && - ldmParams1.bucketSizeLog == ldmParams2.bucketSizeLog && - ldmParams1.minMatchLength == ldmParams2.minMatchLength && - ldmParams1.hashRateLog == ldmParams2.hashRateLog); -} - -typedef enum { ZSTDb_not_buffered, ZSTDb_buffered } ZSTD_buffered_policy_e; - -/* ZSTD_sufficientBuff() : - * check internal buffers exist for streaming if buffPol == ZSTDb_buffered . - * Note : they are assumed to be correctly sized if ZSTD_equivalentCParams()==1 */ -static U32 ZSTD_sufficientBuff(size_t bufferSize1, size_t maxNbSeq1, - size_t maxNbLit1, - ZSTD_buffered_policy_e buffPol2, - ZSTD_compressionParameters cParams2, - U64 pledgedSrcSize) -{ - size_t const windowSize2 = MAX(1, (size_t)MIN(((U64)1 << cParams2.windowLog), pledgedSrcSize)); - size_t const blockSize2 = MIN(ZSTD_BLOCKSIZE_MAX, windowSize2); - size_t const maxNbSeq2 = blockSize2 / ((cParams2.minMatch == 3) ? 3 : 4); - size_t const maxNbLit2 = blockSize2; - size_t const neededBufferSize2 = (buffPol2==ZSTDb_buffered) ? windowSize2 + blockSize2 : 0; - DEBUGLOG(4, "ZSTD_sufficientBuff: is neededBufferSize2=%u <= bufferSize1=%u", - (U32)neededBufferSize2, (U32)bufferSize1); - DEBUGLOG(4, "ZSTD_sufficientBuff: is maxNbSeq2=%u <= maxNbSeq1=%u", - (U32)maxNbSeq2, (U32)maxNbSeq1); - DEBUGLOG(4, "ZSTD_sufficientBuff: is maxNbLit2=%u <= maxNbLit1=%u", - (U32)maxNbLit2, (U32)maxNbLit1); - return (maxNbLit2 <= maxNbLit1) - & (maxNbSeq2 <= maxNbSeq1) - & (neededBufferSize2 <= bufferSize1); -} - -/** Equivalence for resetCCtx purposes */ -static U32 ZSTD_equivalentParams(ZSTD_CCtx_params params1, - ZSTD_CCtx_params params2, - size_t buffSize1, - size_t maxNbSeq1, size_t maxNbLit1, - ZSTD_buffered_policy_e buffPol2, - U64 pledgedSrcSize) -{ - DEBUGLOG(4, "ZSTD_equivalentParams: pledgedSrcSize=%u", (U32)pledgedSrcSize); - if (!ZSTD_equivalentCParams(params1.cParams, params2.cParams)) { - DEBUGLOG(4, "ZSTD_equivalentCParams() == 0"); - return 0; - } - if (!ZSTD_equivalentLdmParams(params1.ldmParams, params2.ldmParams)) { - DEBUGLOG(4, "ZSTD_equivalentLdmParams() == 0"); - return 0; - } - if (!ZSTD_sufficientBuff(buffSize1, maxNbSeq1, maxNbLit1, buffPol2, - params2.cParams, pledgedSrcSize)) { - DEBUGLOG(4, "ZSTD_sufficientBuff() == 0"); - return 0; - } - return 1; -} - static void ZSTD_reset_compressedBlockState(ZSTD_compressedBlockState_t* bs) { int i; @@ -1311,87 +1268,104 @@ static void ZSTD_invalidateMatchState(ZSTD_matchState_t* ms) ms->dictMatchState = NULL; } -/*! ZSTD_continueCCtx() : - * reuse CCtx without reset (note : requires no dictionary) */ -static size_t ZSTD_continueCCtx(ZSTD_CCtx* cctx, ZSTD_CCtx_params params, U64 pledgedSrcSize) -{ - size_t const windowSize = MAX(1, (size_t)MIN(((U64)1 << params.cParams.windowLog), pledgedSrcSize)); - size_t const blockSize = MIN(ZSTD_BLOCKSIZE_MAX, windowSize); - DEBUGLOG(4, "ZSTD_continueCCtx: re-use context in place"); +/** + * Indicates whether this compression proceeds directly from user-provided + * source buffer to user-provided destination buffer (ZSTDb_not_buffered), or + * whether the context needs to buffer the input/output (ZSTDb_buffered). + */ +typedef enum { + ZSTDb_not_buffered, + ZSTDb_buffered +} ZSTD_buffered_policy_e; - cctx->blockSize = blockSize; /* previous block size could be different even for same windowLog, due to pledgedSrcSize */ - cctx->appliedParams = params; - cctx->blockState.matchState.cParams = params.cParams; - cctx->pledgedSrcSizePlusOne = pledgedSrcSize+1; - cctx->consumedSrcSize = 0; - cctx->producedCSize = 0; - if (pledgedSrcSize == ZSTD_CONTENTSIZE_UNKNOWN) - cctx->appliedParams.fParams.contentSizeFlag = 0; - DEBUGLOG(4, "pledged content size : %u ; flag : %u", - (U32)pledgedSrcSize, cctx->appliedParams.fParams.contentSizeFlag); - cctx->stage = ZSTDcs_init; - cctx->dictID = 0; - if (params.ldmParams.enableLdm) - ZSTD_window_clear(&cctx->ldmState.window); - ZSTD_referenceExternalSequences(cctx, NULL, 0); - ZSTD_invalidateMatchState(&cctx->blockState.matchState); - ZSTD_reset_compressedBlockState(cctx->blockState.prevCBlock); - XXH64_reset(&cctx->xxhState, 0); - return 0; -} +/** + * Controls, for this matchState reset, whether the tables need to be cleared / + * prepared for the coming compression (ZSTDcrp_makeClean), or whether the + * tables can be left unclean (ZSTDcrp_leaveDirty), because we know that a + * subsequent operation will overwrite the table space anyways (e.g., copying + * the matchState contents in from a CDict). + */ +typedef enum { + ZSTDcrp_makeClean, + ZSTDcrp_leaveDirty +} ZSTD_compResetPolicy_e; -typedef enum { ZSTDcrp_continue, ZSTDcrp_noMemset } ZSTD_compResetPolicy_e; +/** + * Controls, for this matchState reset, whether indexing can continue where it + * left off (ZSTDirp_continue), or whether it needs to be restarted from zero + * (ZSTDirp_reset). + */ +typedef enum { + ZSTDirp_continue, + ZSTDirp_reset +} ZSTD_indexResetPolicy_e; -typedef enum { ZSTD_resetTarget_CDict, ZSTD_resetTarget_CCtx } ZSTD_resetTarget_e; +typedef enum { + ZSTD_resetTarget_CDict, + ZSTD_resetTarget_CCtx +} ZSTD_resetTarget_e; -static void* +static size_t ZSTD_reset_matchState(ZSTD_matchState_t* ms, - void* ptr, + ZSTD_cwksp* ws, const ZSTD_compressionParameters* cParams, - ZSTD_compResetPolicy_e const crp, ZSTD_resetTarget_e const forWho) + const ZSTD_compResetPolicy_e crp, + const ZSTD_indexResetPolicy_e forceResetIndex, + const ZSTD_resetTarget_e forWho) { size_t const chainSize = (cParams->strategy == ZSTD_fast) ? 0 : ((size_t)1 << cParams->chainLog); size_t const hSize = ((size_t)1) << cParams->hashLog; U32 const hashLog3 = ((forWho == ZSTD_resetTarget_CCtx) && cParams->minMatch==3) ? MIN(ZSTD_HASHLOG3_MAX, cParams->windowLog) : 0; - size_t const h3Size = ((size_t)1) << hashLog3; - size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32); - - assert(((size_t)ptr & 3) == 0); + size_t const h3Size = hashLog3 ? ((size_t)1) << hashLog3 : 0; + + DEBUGLOG(4, "reset indices : %u", forceResetIndex == ZSTDirp_reset); + if (forceResetIndex == ZSTDirp_reset) { + memset(&ms->window, 0, sizeof(ms->window)); + ms->window.dictLimit = 1; /* start from 1, so that 1st position is valid */ + ms->window.lowLimit = 1; /* it ensures first and later CCtx usages compress the same */ + ms->window.nextSrc = ms->window.base + 1; /* see issue #1241 */ + ZSTD_cwksp_mark_tables_dirty(ws); + } ms->hashLog3 = hashLog3; - memset(&ms->window, 0, sizeof(ms->window)); - ms->window.dictLimit = 1; /* start from 1, so that 1st position is valid */ - ms->window.lowLimit = 1; /* it ensures first and later CCtx usages compress the same */ - ms->window.nextSrc = ms->window.base + 1; /* see issue #1241 */ + ZSTD_invalidateMatchState(ms); + assert(!ZSTD_cwksp_reserve_failed(ws)); /* check that allocation hasn't already failed */ + + ZSTD_cwksp_clear_tables(ws); + + DEBUGLOG(5, "reserving table space"); + /* table Space */ + ms->hashTable = (U32*)ZSTD_cwksp_reserve_table(ws, hSize * sizeof(U32)); + ms->chainTable = (U32*)ZSTD_cwksp_reserve_table(ws, chainSize * sizeof(U32)); + ms->hashTable3 = (U32*)ZSTD_cwksp_reserve_table(ws, h3Size * sizeof(U32)); + RETURN_ERROR_IF(ZSTD_cwksp_reserve_failed(ws), memory_allocation, + "failed a workspace allocation in ZSTD_reset_matchState"); + + DEBUGLOG(4, "reset table : %u", crp!=ZSTDcrp_leaveDirty); + if (crp!=ZSTDcrp_leaveDirty) { + /* reset tables only */ + ZSTD_cwksp_clean_tables(ws); + } + /* opt parser space */ if ((forWho == ZSTD_resetTarget_CCtx) && (cParams->strategy >= ZSTD_btopt)) { DEBUGLOG(4, "reserving optimal parser space"); - ms->opt.litFreq = (unsigned*)ptr; - ms->opt.litLengthFreq = ms->opt.litFreq + (1<<Litbits); - ms->opt.matchLengthFreq = ms->opt.litLengthFreq + (MaxLL+1); - ms->opt.offCodeFreq = ms->opt.matchLengthFreq + (MaxML+1); - ptr = ms->opt.offCodeFreq + (MaxOff+1); - ms->opt.matchTable = (ZSTD_match_t*)ptr; - ptr = ms->opt.matchTable + ZSTD_OPT_NUM+1; - ms->opt.priceTable = (ZSTD_optimal_t*)ptr; - ptr = ms->opt.priceTable + ZSTD_OPT_NUM+1; + ms->opt.litFreq = (unsigned*)ZSTD_cwksp_reserve_aligned(ws, (1<<Litbits) * sizeof(unsigned)); + ms->opt.litLengthFreq = (unsigned*)ZSTD_cwksp_reserve_aligned(ws, (MaxLL+1) * sizeof(unsigned)); + ms->opt.matchLengthFreq = (unsigned*)ZSTD_cwksp_reserve_aligned(ws, (MaxML+1) * sizeof(unsigned)); + ms->opt.offCodeFreq = (unsigned*)ZSTD_cwksp_reserve_aligned(ws, (MaxOff+1) * sizeof(unsigned)); + ms->opt.matchTable = (ZSTD_match_t*)ZSTD_cwksp_reserve_aligned(ws, (ZSTD_OPT_NUM+1) * sizeof(ZSTD_match_t)); + ms->opt.priceTable = (ZSTD_optimal_t*)ZSTD_cwksp_reserve_aligned(ws, (ZSTD_OPT_NUM+1) * sizeof(ZSTD_optimal_t)); } - /* table Space */ - DEBUGLOG(4, "reset table : %u", crp!=ZSTDcrp_noMemset); - assert(((size_t)ptr & 3) == 0); /* ensure ptr is properly aligned */ - if (crp!=ZSTDcrp_noMemset) memset(ptr, 0, tableSpace); /* reset tables only */ - ms->hashTable = (U32*)(ptr); - ms->chainTable = ms->hashTable + hSize; - ms->hashTable3 = ms->chainTable + chainSize; - ptr = ms->hashTable3 + h3Size; - ms->cParams = *cParams; - assert(((size_t)ptr & 3) == 0); - return ptr; + RETURN_ERROR_IF(ZSTD_cwksp_reserve_failed(ws), memory_allocation, + "failed a workspace allocation in ZSTD_reset_matchState"); + + return 0; } /* ZSTD_indexTooCloseToMax() : @@ -1407,13 +1381,6 @@ static int ZSTD_indexTooCloseToMax(ZSTD_window_t w) return (size_t)(w.nextSrc - w.base) > (ZSTD_CURRENT_MAX - ZSTD_INDEXOVERFLOW_MARGIN); } -#define ZSTD_WORKSPACETOOLARGE_FACTOR 3 /* define "workspace is too large" as this number of times larger than needed */ -#define ZSTD_WORKSPACETOOLARGE_MAXDURATION 128 /* when workspace is continuously too large - * during at least this number of times, - * context's memory usage is considered wasteful, - * because it's sized to handle a worst case scenario which rarely happens. - * In which case, resize it down to free some memory */ - /*! ZSTD_resetCCtx_internal() : note : `params` are assumed fully validated at this stage */ static size_t ZSTD_resetCCtx_internal(ZSTD_CCtx* zc, @@ -1422,30 +1389,12 @@ static size_t ZSTD_resetCCtx_internal(ZSTD_CCtx* zc, ZSTD_compResetPolicy_e const crp, ZSTD_buffered_policy_e const zbuff) { + ZSTD_cwksp* const ws = &zc->workspace; DEBUGLOG(4, "ZSTD_resetCCtx_internal: pledgedSrcSize=%u, wlog=%u", (U32)pledgedSrcSize, params.cParams.windowLog); assert(!ZSTD_isError(ZSTD_checkCParams(params.cParams))); - if (crp == ZSTDcrp_continue) { - if (ZSTD_equivalentParams(zc->appliedParams, params, - zc->inBuffSize, - zc->seqStore.maxNbSeq, zc->seqStore.maxNbLit, - zbuff, pledgedSrcSize) ) { - DEBUGLOG(4, "ZSTD_equivalentParams()==1 -> consider continue mode"); - zc->workSpaceOversizedDuration += (zc->workSpaceOversizedDuration > 0); /* if it was too large, it still is */ - if (zc->workSpaceOversizedDuration <= ZSTD_WORKSPACETOOLARGE_MAXDURATION) { - DEBUGLOG(4, "continue mode confirmed (wLog1=%u, blockSize1=%zu)", - zc->appliedParams.cParams.windowLog, zc->blockSize); - if (ZSTD_indexTooCloseToMax(zc->blockState.matchState.window)) { - /* prefer a reset, faster than a rescale */ - ZSTD_reset_matchState(&zc->blockState.matchState, - zc->entropyWorkspace + HUF_WORKSPACE_SIZE_U32, - ¶ms.cParams, - crp, ZSTD_resetTarget_CCtx); - } - return ZSTD_continueCCtx(zc, params, pledgedSrcSize); - } } } - DEBUGLOG(4, "ZSTD_equivalentParams()==0 -> reset CCtx"); + zc->isFirstBlock = 1; if (params.ldmParams.enableLdm) { /* Adjust long distance matching parameters */ @@ -1459,58 +1408,74 @@ static size_t ZSTD_resetCCtx_internal(ZSTD_CCtx* zc, size_t const blockSize = MIN(ZSTD_BLOCKSIZE_MAX, windowSize); U32 const divider = (params.cParams.minMatch==3) ? 3 : 4; size_t const maxNbSeq = blockSize / divider; - size_t const tokenSpace = WILDCOPY_OVERLENGTH + blockSize + 11*maxNbSeq; + size_t const tokenSpace = ZSTD_cwksp_alloc_size(WILDCOPY_OVERLENGTH + blockSize) + + ZSTD_cwksp_alloc_size(maxNbSeq * sizeof(seqDef)) + + 3 * ZSTD_cwksp_alloc_size(maxNbSeq * sizeof(BYTE)); size_t const buffOutSize = (zbuff==ZSTDb_buffered) ? ZSTD_compressBound(blockSize)+1 : 0; size_t const buffInSize = (zbuff==ZSTDb_buffered) ? windowSize + blockSize : 0; size_t const matchStateSize = ZSTD_sizeof_matchState(¶ms.cParams, /* forCCtx */ 1); size_t const maxNbLdmSeq = ZSTD_ldm_getMaxNbSeq(params.ldmParams, blockSize); - void* ptr; /* used to partition workSpace */ - /* Check if workSpace is large enough, alloc a new one if needed */ - { size_t const entropySpace = HUF_WORKSPACE_SIZE; - size_t const blockStateSpace = 2 * sizeof(ZSTD_compressedBlockState_t); - size_t const bufferSpace = buffInSize + buffOutSize; + ZSTD_indexResetPolicy_e needsIndexReset = ZSTDirp_continue; + + if (ZSTD_indexTooCloseToMax(zc->blockState.matchState.window)) { + needsIndexReset = ZSTDirp_reset; + } + + ZSTD_cwksp_bump_oversized_duration(ws, 0); + + /* Check if workspace is large enough, alloc a new one if needed */ + { size_t const cctxSpace = zc->staticSize ? ZSTD_cwksp_alloc_size(sizeof(ZSTD_CCtx)) : 0; + size_t const entropySpace = ZSTD_cwksp_alloc_size(HUF_WORKSPACE_SIZE); + size_t const blockStateSpace = 2 * ZSTD_cwksp_alloc_size(sizeof(ZSTD_compressedBlockState_t)); + size_t const bufferSpace = ZSTD_cwksp_alloc_size(buffInSize) + ZSTD_cwksp_alloc_size(buffOutSize); size_t const ldmSpace = ZSTD_ldm_getTableSize(params.ldmParams); - size_t const ldmSeqSpace = maxNbLdmSeq * sizeof(rawSeq); + size_t const ldmSeqSpace = ZSTD_cwksp_alloc_size(maxNbLdmSeq * sizeof(rawSeq)); - size_t const neededSpace = entropySpace + blockStateSpace + ldmSpace + - ldmSeqSpace + matchStateSize + tokenSpace + - bufferSpace; + size_t const neededSpace = + cctxSpace + + entropySpace + + blockStateSpace + + ldmSpace + + ldmSeqSpace + + matchStateSize + + tokenSpace + + bufferSpace; - int const workSpaceTooSmall = zc->workSpaceSize < neededSpace; - int const workSpaceTooLarge = zc->workSpaceSize > ZSTD_WORKSPACETOOLARGE_FACTOR * neededSpace; - int const workSpaceWasteful = workSpaceTooLarge && (zc->workSpaceOversizedDuration > ZSTD_WORKSPACETOOLARGE_MAXDURATION); - zc->workSpaceOversizedDuration = workSpaceTooLarge ? zc->workSpaceOversizedDuration+1 : 0; + int const workspaceTooSmall = ZSTD_cwksp_sizeof(ws) < neededSpace; + int const workspaceWasteful = ZSTD_cwksp_check_wasteful(ws, neededSpace); DEBUGLOG(4, "Need %zuKB workspace, including %zuKB for match state, and %zuKB for buffers", neededSpace>>10, matchStateSize>>10, bufferSpace>>10); DEBUGLOG(4, "windowSize: %zu - blockSize: %zu", windowSize, blockSize); - if (workSpaceTooSmall || workSpaceWasteful) { - DEBUGLOG(4, "Resize workSpaceSize from %zuKB to %zuKB", - zc->workSpaceSize >> 10, + if (workspaceTooSmall || workspaceWasteful) { + DEBUGLOG(4, "Resize workspaceSize from %zuKB to %zuKB", + ZSTD_cwksp_sizeof(ws) >> 10, neededSpace >> 10); RETURN_ERROR_IF(zc->staticSize, memory_allocation, "static cctx : no resize"); - zc->workSpaceSize = 0; - ZSTD_free(zc->workSpace, zc->customMem); - zc->workSpace = ZSTD_malloc(neededSpace, zc->customMem); - RETURN_ERROR_IF(zc->workSpace == NULL, memory_allocation); - zc->workSpaceSize = neededSpace; - zc->workSpaceOversizedDuration = 0; + needsIndexReset = ZSTDirp_reset; + + ZSTD_cwksp_free(ws, zc->customMem); + FORWARD_IF_ERROR(ZSTD_cwksp_create(ws, neededSpace, zc->customMem)); + DEBUGLOG(5, "reserving object space"); /* Statically sized space. * entropyWorkspace never moves, * though prev/next block swap places */ - assert(((size_t)zc->workSpace & 3) == 0); /* ensure correct alignment */ - assert(zc->workSpaceSize >= 2 * sizeof(ZSTD_compressedBlockState_t)); - zc->blockState.prevCBlock = (ZSTD_compressedBlockState_t*)zc->workSpace; - zc->blockState.nextCBlock = zc->blockState.prevCBlock + 1; - ptr = zc->blockState.nextCBlock + 1; - zc->entropyWorkspace = (U32*)ptr; + assert(ZSTD_cwksp_check_available(ws, 2 * sizeof(ZSTD_compressedBlockState_t))); + zc->blockState.prevCBlock = (ZSTD_compressedBlockState_t*) ZSTD_cwksp_reserve_object(ws, sizeof(ZSTD_compressedBlockState_t)); + RETURN_ERROR_IF(zc->blockState.prevCBlock == NULL, memory_allocation, "couldn't allocate prevCBlock"); + zc->blockState.nextCBlock = (ZSTD_compressedBlockState_t*) ZSTD_cwksp_reserve_object(ws, sizeof(ZSTD_compressedBlockState_t)); + RETURN_ERROR_IF(zc->blockState.nextCBlock == NULL, memory_allocation, "couldn't allocate nextCBlock"); + zc->entropyWorkspace = (U32*) ZSTD_cwksp_reserve_object(ws, HUF_WORKSPACE_SIZE); + RETURN_ERROR_IF(zc->blockState.nextCBlock == NULL, memory_allocation, "couldn't allocate entropyWorkspace"); } } + ZSTD_cwksp_clear(ws); + /* init params */ zc->appliedParams = params; zc->blockState.matchState.cParams = params.cParams; @@ -1529,58 +1494,58 @@ static size_t ZSTD_resetCCtx_internal(ZSTD_CCtx* zc, ZSTD_reset_compressedBlockState(zc->blockState.prevCBlock); - ptr = ZSTD_reset_matchState(&zc->blockState.matchState, - zc->entropyWorkspace + HUF_WORKSPACE_SIZE_U32, - ¶ms.cParams, - crp, ZSTD_resetTarget_CCtx); - - /* ldm hash table */ - /* initialize bucketOffsets table later for pointer alignment */ - if (params.ldmParams.enableLdm) { - size_t const ldmHSize = ((size_t)1) << params.ldmParams.hashLog; - memset(ptr, 0, ldmHSize * sizeof(ldmEntry_t)); - assert(((size_t)ptr & 3) == 0); /* ensure ptr is properly aligned */ - zc->ldmState.hashTable = (ldmEntry_t*)ptr; - ptr = zc->ldmState.hashTable + ldmHSize; - zc->ldmSequences = (rawSeq*)ptr; - ptr = zc->ldmSequences + maxNbLdmSeq; - zc->maxNbLdmSequences = maxNbLdmSeq; - - memset(&zc->ldmState.window, 0, sizeof(zc->ldmState.window)); - } - assert(((size_t)ptr & 3) == 0); /* ensure ptr is properly aligned */ - - /* sequences storage */ - zc->seqStore.maxNbSeq = maxNbSeq; - zc->seqStore.sequencesStart = (seqDef*)ptr; - ptr = zc->seqStore.sequencesStart + maxNbSeq; - zc->seqStore.llCode = (BYTE*) ptr; - zc->seqStore.mlCode = zc->seqStore.llCode + maxNbSeq; - zc->seqStore.ofCode = zc->seqStore.mlCode + maxNbSeq; - zc->seqStore.litStart = zc->seqStore.ofCode + maxNbSeq; /* ZSTD_wildcopy() is used to copy into the literals buffer, * so we have to oversize the buffer by WILDCOPY_OVERLENGTH bytes. */ + zc->seqStore.litStart = ZSTD_cwksp_reserve_buffer(ws, blockSize + WILDCOPY_OVERLENGTH); zc->seqStore.maxNbLit = blockSize; - ptr = zc->seqStore.litStart + blockSize + WILDCOPY_OVERLENGTH; + + /* buffers */ + zc->inBuffSize = buffInSize; + zc->inBuff = (char*)ZSTD_cwksp_reserve_buffer(ws, buffInSize); + zc->outBuffSize = buffOutSize; + zc->outBuff = (char*)ZSTD_cwksp_reserve_buffer(ws, buffOutSize); /* ldm bucketOffsets table */ if (params.ldmParams.enableLdm) { + /* TODO: avoid memset? */ size_t const ldmBucketSize = ((size_t)1) << (params.ldmParams.hashLog - params.ldmParams.bucketSizeLog); - memset(ptr, 0, ldmBucketSize); - zc->ldmState.bucketOffsets = (BYTE*)ptr; - ptr = zc->ldmState.bucketOffsets + ldmBucketSize; - ZSTD_window_clear(&zc->ldmState.window); + zc->ldmState.bucketOffsets = ZSTD_cwksp_reserve_buffer(ws, ldmBucketSize); + memset(zc->ldmState.bucketOffsets, 0, ldmBucketSize); } + + /* sequences storage */ ZSTD_referenceExternalSequences(zc, NULL, 0); + zc->seqStore.maxNbSeq = maxNbSeq; + zc->seqStore.llCode = ZSTD_cwksp_reserve_buffer(ws, maxNbSeq * sizeof(BYTE)); + zc->seqStore.mlCode = ZSTD_cwksp_reserve_buffer(ws, maxNbSeq * sizeof(BYTE)); + zc->seqStore.ofCode = ZSTD_cwksp_reserve_buffer(ws, maxNbSeq * sizeof(BYTE)); + zc->seqStore.sequencesStart = (seqDef*)ZSTD_cwksp_reserve_aligned(ws, maxNbSeq * sizeof(seqDef)); + + FORWARD_IF_ERROR(ZSTD_reset_matchState( + &zc->blockState.matchState, + ws, + ¶ms.cParams, + crp, + needsIndexReset, + ZSTD_resetTarget_CCtx)); - /* buffers */ - zc->inBuffSize = buffInSize; - zc->inBuff = (char*)ptr; - zc->outBuffSize = buffOutSize; - zc->outBuff = zc->inBuff + buffInSize; + /* ldm hash table */ + if (params.ldmParams.enableLdm) { + /* TODO: avoid memset? */ + size_t const ldmHSize = ((size_t)1) << params.ldmParams.hashLog; + zc->ldmState.hashTable = (ldmEntry_t*)ZSTD_cwksp_reserve_aligned(ws, ldmHSize * sizeof(ldmEntry_t)); + memset(zc->ldmState.hashTable, 0, ldmHSize * sizeof(ldmEntry_t)); + zc->ldmSequences = (rawSeq*)ZSTD_cwksp_reserve_aligned(ws, maxNbLdmSeq * sizeof(rawSeq)); + zc->maxNbLdmSequences = maxNbLdmSeq; + + memset(&zc->ldmState.window, 0, sizeof(zc->ldmState.window)); + ZSTD_window_clear(&zc->ldmState.window); + } + + DEBUGLOG(3, "wksp: finished allocating, %zd bytes remain available", ZSTD_cwksp_available_space(ws)); return 0; } @@ -1614,15 +1579,15 @@ static const size_t attachDictSizeCutoffs[ZSTD_STRATEGY_MAX+1] = { }; static int ZSTD_shouldAttachDict(const ZSTD_CDict* cdict, - ZSTD_CCtx_params params, + const ZSTD_CCtx_params* params, U64 pledgedSrcSize) { size_t cutoff = attachDictSizeCutoffs[cdict->matchState.cParams.strategy]; return ( pledgedSrcSize <= cutoff || pledgedSrcSize == ZSTD_CONTENTSIZE_UNKNOWN - || params.attachDictPref == ZSTD_dictForceAttach ) - && params.attachDictPref != ZSTD_dictForceCopy - && !params.forceWindow; /* dictMatchState isn't correctly + || params->attachDictPref == ZSTD_dictForceAttach ) + && params->attachDictPref != ZSTD_dictForceCopy + && !params->forceWindow; /* dictMatchState isn't correctly * handled in _enforceMaxDist */ } @@ -1640,8 +1605,8 @@ ZSTD_resetCCtx_byAttachingCDict(ZSTD_CCtx* cctx, * has its own tables. */ params.cParams = ZSTD_adjustCParams_internal(*cdict_cParams, pledgedSrcSize, 0); params.cParams.windowLog = windowLog; - ZSTD_resetCCtx_internal(cctx, params, pledgedSrcSize, - ZSTDcrp_continue, zbuff); + FORWARD_IF_ERROR(ZSTD_resetCCtx_internal(cctx, params, pledgedSrcSize, + ZSTDcrp_makeClean, zbuff)); assert(cctx->appliedParams.cParams.strategy == cdict_cParams->strategy); } @@ -1689,30 +1654,36 @@ static size_t ZSTD_resetCCtx_byCopyingCDict(ZSTD_CCtx* cctx, /* Copy only compression parameters related to tables. */ params.cParams = *cdict_cParams; params.cParams.windowLog = windowLog; - ZSTD_resetCCtx_internal(cctx, params, pledgedSrcSize, - ZSTDcrp_noMemset, zbuff); + FORWARD_IF_ERROR(ZSTD_resetCCtx_internal(cctx, params, pledgedSrcSize, + ZSTDcrp_leaveDirty, zbuff)); assert(cctx->appliedParams.cParams.strategy == cdict_cParams->strategy); assert(cctx->appliedParams.cParams.hashLog == cdict_cParams->hashLog); assert(cctx->appliedParams.cParams.chainLog == cdict_cParams->chainLog); } + ZSTD_cwksp_mark_tables_dirty(&cctx->workspace); + /* copy tables */ { size_t const chainSize = (cdict_cParams->strategy == ZSTD_fast) ? 0 : ((size_t)1 << cdict_cParams->chainLog); size_t const hSize = (size_t)1 << cdict_cParams->hashLog; - size_t const tableSpace = (chainSize + hSize) * sizeof(U32); - assert((U32*)cctx->blockState.matchState.chainTable == (U32*)cctx->blockState.matchState.hashTable + hSize); /* chainTable must follow hashTable */ - assert((U32*)cctx->blockState.matchState.hashTable3 == (U32*)cctx->blockState.matchState.chainTable + chainSize); - assert((U32*)cdict->matchState.chainTable == (U32*)cdict->matchState.hashTable + hSize); /* chainTable must follow hashTable */ - assert((U32*)cdict->matchState.hashTable3 == (U32*)cdict->matchState.chainTable + chainSize); - memcpy(cctx->blockState.matchState.hashTable, cdict->matchState.hashTable, tableSpace); /* presumes all tables follow each other */ + + memcpy(cctx->blockState.matchState.hashTable, + cdict->matchState.hashTable, + hSize * sizeof(U32)); + memcpy(cctx->blockState.matchState.chainTable, + cdict->matchState.chainTable, + chainSize * sizeof(U32)); } /* Zero the hashTable3, since the cdict never fills it */ - { size_t const h3Size = (size_t)1 << cctx->blockState.matchState.hashLog3; + { int const h3log = cctx->blockState.matchState.hashLog3; + size_t const h3Size = h3log ? ((size_t)1 << h3log) : 0; assert(cdict->matchState.hashLog3 == 0); memset(cctx->blockState.matchState.hashTable3, 0, h3Size * sizeof(U32)); } + ZSTD_cwksp_mark_tables_clean(&cctx->workspace); + /* copy dictionary offsets */ { ZSTD_matchState_t const* srcMatchState = &cdict->matchState; ZSTD_matchState_t* dstMatchState = &cctx->blockState.matchState; @@ -1734,7 +1705,7 @@ static size_t ZSTD_resetCCtx_byCopyingCDict(ZSTD_CCtx* cctx, * in-place. We decide here which strategy to use. */ static size_t ZSTD_resetCCtx_usingCDict(ZSTD_CCtx* cctx, const ZSTD_CDict* cdict, - ZSTD_CCtx_params params, + const ZSTD_CCtx_params* params, U64 pledgedSrcSize, ZSTD_buffered_policy_e zbuff) { @@ -1744,10 +1715,10 @@ static size_t ZSTD_resetCCtx_usingCDict(ZSTD_CCtx* cctx, if (ZSTD_shouldAttachDict(cdict, params, pledgedSrcSize)) { return ZSTD_resetCCtx_byAttachingCDict( - cctx, cdict, params, pledgedSrcSize, zbuff); + cctx, cdict, *params, pledgedSrcSize, zbuff); } else { return ZSTD_resetCCtx_byCopyingCDict( - cctx, cdict, params, pledgedSrcSize, zbuff); + cctx, cdict, *params, pledgedSrcSize, zbuff); } } @@ -1773,7 +1744,7 @@ static size_t ZSTD_copyCCtx_internal(ZSTD_CCtx* dstCCtx, params.cParams = srcCCtx->appliedParams.cParams; params.fParams = fParams; ZSTD_resetCCtx_internal(dstCCtx, params, pledgedSrcSize, - ZSTDcrp_noMemset, zbuff); + ZSTDcrp_leaveDirty, zbuff); assert(dstCCtx->appliedParams.cParams.windowLog == srcCCtx->appliedParams.cParams.windowLog); assert(dstCCtx->appliedParams.cParams.strategy == srcCCtx->appliedParams.cParams.strategy); assert(dstCCtx->appliedParams.cParams.hashLog == srcCCtx->appliedParams.cParams.hashLog); @@ -1781,16 +1752,27 @@ static size_t ZSTD_copyCCtx_internal(ZSTD_CCtx* dstCCtx, assert(dstCCtx->blockState.matchState.hashLog3 == srcCCtx->blockState.matchState.hashLog3); } + ZSTD_cwksp_mark_tables_dirty(&dstCCtx->workspace); + /* copy tables */ { size_t const chainSize = (srcCCtx->appliedParams.cParams.strategy == ZSTD_fast) ? 0 : ((size_t)1 << srcCCtx->appliedParams.cParams.chainLog); size_t const hSize = (size_t)1 << srcCCtx->appliedParams.cParams.hashLog; - size_t const h3Size = (size_t)1 << srcCCtx->blockState.matchState.hashLog3; - size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32); - assert((U32*)dstCCtx->blockState.matchState.chainTable == (U32*)dstCCtx->blockState.matchState.hashTable + hSize); /* chainTable must follow hashTable */ - assert((U32*)dstCCtx->blockState.matchState.hashTable3 == (U32*)dstCCtx->blockState.matchState.chainTable + chainSize); - memcpy(dstCCtx->blockState.matchState.hashTable, srcCCtx->blockState.matchState.hashTable, tableSpace); /* presumes all tables follow each other */ + int const h3log = srcCCtx->blockState.matchState.hashLog3; + size_t const h3Size = h3log ? ((size_t)1 << h3log) : 0; + + memcpy(dstCCtx->blockState.matchState.hashTable, + srcCCtx->blockState.matchState.hashTable, + hSize * sizeof(U32)); + memcpy(dstCCtx->blockState.matchState.chainTable, + srcCCtx->blockState.matchState.chainTable, + chainSize * sizeof(U32)); + memcpy(dstCCtx->blockState.matchState.hashTable3, + srcCCtx->blockState.matchState.hashTable3, + h3Size * sizeof(U32)); } + ZSTD_cwksp_mark_tables_clean(&dstCCtx->workspace); + /* copy dictionary offsets */ { const ZSTD_matchState_t* srcMatchState = &srcCCtx->blockState.matchState; @@ -1841,6 +1823,20 @@ ZSTD_reduceTable_internal (U32* const table, U32 const size, U32 const reducerVa int rowNb; assert((size & (ZSTD_ROWSIZE-1)) == 0); /* multiple of ZSTD_ROWSIZE */ assert(size < (1U<<31)); /* can be casted to int */ + +#if defined (MEMORY_SANITIZER) && !defined (ZSTD_MSAN_DONT_POISON_WORKSPACE) + /* To validate that the table re-use logic is sound, and that we don't + * access table space that we haven't cleaned, we re-"poison" the table + * space every time we mark it dirty. + * + * This function however is intended to operate on those dirty tables and + * re-clean them. So when this function is used correctly, we can unpoison + * the memory it operated on. This introduces a blind spot though, since + * if we now try to operate on __actually__ poisoned memory, we will not + * detect that. */ + __msan_unpoison(table, size * sizeof(U32)); +#endif + for (rowNb=0 ; rowNb < nbRows ; rowNb++) { int column; for (column=0; column<ZSTD_ROWSIZE; column++) { @@ -1903,155 +1899,6 @@ static size_t ZSTD_noCompressBlock (void* dst, size_t dstCapacity, const void* s return ZSTD_blockHeaderSize + srcSize; } -static size_t ZSTD_noCompressLiterals (void* dst, size_t dstCapacity, const void* src, size_t srcSize) -{ - BYTE* const ostart = (BYTE* const)dst; - U32 const flSize = 1 + (srcSize>31) + (srcSize>4095); - - RETURN_ERROR_IF(srcSize + flSize > dstCapacity, dstSize_tooSmall); - - switch(flSize) - { - case 1: /* 2 - 1 - 5 */ - ostart[0] = (BYTE)((U32)set_basic + (srcSize<<3)); - break; - case 2: /* 2 - 2 - 12 */ - MEM_writeLE16(ostart, (U16)((U32)set_basic + (1<<2) + (srcSize<<4))); - break; - case 3: /* 2 - 2 - 20 */ - MEM_writeLE32(ostart, (U32)((U32)set_basic + (3<<2) + (srcSize<<4))); - break; - default: /* not necessary : flSize is {1,2,3} */ - assert(0); - } - - memcpy(ostart + flSize, src, srcSize); - return srcSize + flSize; -} - -static size_t ZSTD_compressRleLiteralsBlock (void* dst, size_t dstCapacity, const void* src, size_t srcSize) -{ - BYTE* const ostart = (BYTE* const)dst; - U32 const flSize = 1 + (srcSize>31) + (srcSize>4095); - - (void)dstCapacity; /* dstCapacity already guaranteed to be >=4, hence large enough */ - - switch(flSize) - { - case 1: /* 2 - 1 - 5 */ - ostart[0] = (BYTE)((U32)set_rle + (srcSize<<3)); - break; - case 2: /* 2 - 2 - 12 */ - MEM_writeLE16(ostart, (U16)((U32)set_rle + (1<<2) + (srcSize<<4))); - break; - case 3: /* 2 - 2 - 20 */ - MEM_writeLE32(ostart, (U32)((U32)set_rle + (3<<2) + (srcSize<<4))); - break; - default: /* not necessary : flSize is {1,2,3} */ - assert(0); - } - - ostart[flSize] = *(const BYTE*)src; - return flSize+1; -} - - -/* ZSTD_minGain() : - * minimum compression required - * to generate a compress block or a compressed literals section. - * note : use same formula for both situations */ -static size_t ZSTD_minGain(size_t srcSize, ZSTD_strategy strat) -{ - U32 const minlog = (strat>=ZSTD_btultra) ? (U32)(strat) - 1 : 6; - ZSTD_STATIC_ASSERT(ZSTD_btultra == 8); - assert(ZSTD_cParam_withinBounds(ZSTD_c_strategy, strat)); - return (srcSize >> minlog) + 2; -} - -static size_t ZSTD_compressLiterals (ZSTD_hufCTables_t const* prevHuf, - ZSTD_hufCTables_t* nextHuf, - ZSTD_strategy strategy, int disableLiteralCompression, - void* dst, size_t dstCapacity, - const void* src, size_t srcSize, - void* workspace, size_t wkspSize, - const int bmi2) -{ - size_t const minGain = ZSTD_minGain(srcSize, strategy); - size_t const lhSize = 3 + (srcSize >= 1 KB) + (srcSize >= 16 KB); - BYTE* const ostart = (BYTE*)dst; - U32 singleStream = srcSize < 256; - symbolEncodingType_e hType = set_compressed; - size_t cLitSize; - - DEBUGLOG(5,"ZSTD_compressLiterals (disableLiteralCompression=%i)", - disableLiteralCompression); - - /* Prepare nextEntropy assuming reusing the existing table */ - memcpy(nextHuf, prevHuf, sizeof(*prevHuf)); - - if (disableLiteralCompression) - return ZSTD_noCompressLiterals(dst, dstCapacity, src, srcSize); - - /* small ? don't even attempt compression (speed opt) */ -# define COMPRESS_LITERALS_SIZE_MIN 63 - { size_t const minLitSize = (prevHuf->repeatMode == HUF_repeat_valid) ? 6 : COMPRESS_LITERALS_SIZE_MIN; - if (srcSize <= minLitSize) return ZSTD_noCompressLiterals(dst, dstCapacity, src, srcSize); - } - - RETURN_ERROR_IF(dstCapacity < lhSize+1, dstSize_tooSmall, "not enough space for compression"); - { HUF_repeat repeat = prevHuf->repeatMode; - int const preferRepeat = strategy < ZSTD_lazy ? srcSize <= 1024 : 0; - if (repeat == HUF_repeat_valid && lhSize == 3) singleStream = 1; - cLitSize = singleStream ? HUF_compress1X_repeat(ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 11, - workspace, wkspSize, (HUF_CElt*)nextHuf->CTable, &repeat, preferRepeat, bmi2) - : HUF_compress4X_repeat(ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 11, - workspace, wkspSize, (HUF_CElt*)nextHuf->CTable, &repeat, preferRepeat, bmi2); - if (repeat != HUF_repeat_none) { - /* reused the existing table */ - hType = set_repeat; - } - } - - if ((cLitSize==0) | (cLitSize >= srcSize - minGain) | ERR_isError(cLitSize)) { - memcpy(nextHuf, prevHuf, sizeof(*prevHuf)); - return ZSTD_noCompressLiterals(dst, dstCapacity, src, srcSize); - } - if (cLitSize==1) { - memcpy(nextHuf, prevHuf, sizeof(*prevHuf)); - return ZSTD_compressRleLiteralsBlock(dst, dstCapacity, src, srcSize); - } - - if (hType == set_compressed) { - /* using a newly constructed table */ - nextHuf->repeatMode = HUF_repeat_check; - } - - /* Build header */ - switch(lhSize) - { - case 3: /* 2 - 2 - 10 - 10 */ - { U32 const lhc = hType + ((!singleStream) << 2) + ((U32)srcSize<<4) + ((U32)cLitSize<<14); - MEM_writeLE24(ostart, lhc); - break; - } - case 4: /* 2 - 2 - 14 - 14 */ - { U32 const lhc = hType + (2 << 2) + ((U32)srcSize<<4) + ((U32)cLitSize<<18); - MEM_writeLE32(ostart, lhc); - break; - } - case 5: /* 2 - 2 - 18 - 18 */ - { U32 const lhc = hType + (3 << 2) + ((U32)srcSize<<4) + ((U32)cLitSize<<22); - MEM_writeLE32(ostart, lhc); - ostart[4] = (BYTE)(cLitSize >> 10); - break; - } - default: /* not possible : lhSize is {3,4,5} */ - assert(0); - } - return lhSize+cLitSize; -} - - void ZSTD_seqToCodes(const seqStore_t* seqStorePtr) { const seqDef* const sequences = seqStorePtr->sequencesStart; @@ -2074,418 +1921,6 @@ void ZSTD_seqToCodes(const seqStore_t* seqStorePtr) mlCodeTable[seqStorePtr->longLengthPos] = MaxML; } - -/** - * -log2(x / 256) lookup table for x in [0, 256). - * If x == 0: Return 0 - * Else: Return floor(-log2(x / 256) * 256) - */ -static unsigned const kInverseProbabilityLog256[256] = { - 0, 2048, 1792, 1642, 1536, 1453, 1386, 1329, 1280, 1236, 1197, 1162, - 1130, 1100, 1073, 1047, 1024, 1001, 980, 960, 941, 923, 906, 889, - 874, 859, 844, 830, 817, 804, 791, 779, 768, 756, 745, 734, - 724, 714, 704, 694, 685, 676, 667, 658, 650, 642, 633, 626, - 618, 610, 603, 595, 588, 581, 574, 567, 561, 554, 548, 542, - 535, 529, 523, 517, 512, 506, 500, 495, 489, 484, 478, 473, - 468, 463, 458, 453, 448, 443, 438, 434, 429, 424, 420, 415, - 411, 407, 402, 398, 394, 390, 386, 382, 377, 373, 370, 366, - 362, 358, 354, 350, 347, 343, 339, 336, 332, 329, 325, 322, - 318, 315, 311, 308, 305, 302, 298, 295, 292, 289, 286, 282, - 279, 276, 273, 270, 267, 264, 261, 258, 256, 253, 250, 247, - 244, 241, 239, 236, 233, 230, 228, 225, 222, 220, 217, 215, - 212, 209, 207, 204, 202, 199, 197, 194, 192, 190, 187, 185, - 182, 180, 178, 175, 173, 171, 168, 166, 164, 162, 159, 157, - 155, 153, 151, 149, 146, 144, 142, 140, 138, 136, 134, 132, - 130, 128, 126, 123, 121, 119, 117, 115, 114, 112, 110, 108, - 106, 104, 102, 100, 98, 96, 94, 93, 91, 89, 87, 85, - 83, 82, 80, 78, 76, 74, 73, 71, 69, 67, 66, 64, - 62, 61, 59, 57, 55, 54, 52, 50, 49, 47, 46, 44, - 42, 41, 39, 37, 36, 34, 33, 31, 30, 28, 26, 25, - 23, 22, 20, 19, 17, 16, 14, 13, 11, 10, 8, 7, - 5, 4, 2, 1, -}; - - -/** - * Returns the cost in bits of encoding the distribution described by count - * using the entropy bound. - */ -static size_t ZSTD_entropyCost(unsigned const* count, unsigned const max, size_t const total) -{ - unsigned cost = 0; - unsigned s; - for (s = 0; s <= max; ++s) { - unsigned norm = (unsigned)((256 * count[s]) / total); - if (count[s] != 0 && norm == 0) - norm = 1; - assert(count[s] < total); - cost += count[s] * kInverseProbabilityLog256[norm]; - } - return cost >> 8; -} - - -/** - * Returns the cost in bits of encoding the distribution in count using the - * table described by norm. The max symbol support by norm is assumed >= max. - * norm must be valid for every symbol with non-zero probability in count. - */ -static size_t ZSTD_crossEntropyCost(short const* norm, unsigned accuracyLog, - unsigned const* count, unsigned const max) -{ - unsigned const shift = 8 - accuracyLog; - size_t cost = 0; - unsigned s; - assert(accuracyLog <= 8); - for (s = 0; s <= max; ++s) { - unsigned const normAcc = norm[s] != -1 ? norm[s] : 1; - unsigned const norm256 = normAcc << shift; - assert(norm256 > 0); - assert(norm256 < 256); - cost += count[s] * kInverseProbabilityLog256[norm256]; - } - return cost >> 8; -} - - -static unsigned ZSTD_getFSEMaxSymbolValue(FSE_CTable const* ctable) { - void const* ptr = ctable; - U16 const* u16ptr = (U16 const*)ptr; - U32 const maxSymbolValue = MEM_read16(u16ptr + 1); - return maxSymbolValue; -} - - -/** - * Returns the cost in bits of encoding the distribution in count using ctable. - * Returns an error if ctable cannot represent all the symbols in count. - */ -static size_t ZSTD_fseBitCost( - FSE_CTable const* ctable, - unsigned const* count, - unsigned const max) -{ - unsigned const kAccuracyLog = 8; - size_t cost = 0; - unsigned s; - FSE_CState_t cstate; - FSE_initCState(&cstate, ctable); - RETURN_ERROR_IF(ZSTD_getFSEMaxSymbolValue(ctable) < max, GENERIC, - "Repeat FSE_CTable has maxSymbolValue %u < %u", - ZSTD_getFSEMaxSymbolValue(ctable), max); - for (s = 0; s <= max; ++s) { - unsigned const tableLog = cstate.stateLog; - unsigned const badCost = (tableLog + 1) << kAccuracyLog; - unsigned const bitCost = FSE_bitCost(cstate.symbolTT, tableLog, s, kAccuracyLog); - if (count[s] == 0) - continue; - RETURN_ERROR_IF(bitCost >= badCost, GENERIC, - "Repeat FSE_CTable has Prob[%u] == 0", s); - cost += count[s] * bitCost; - } - return cost >> kAccuracyLog; -} - -/** - * Returns the cost in bytes of encoding the normalized count header. - * Returns an error if any of the helper functions return an error. - */ -static size_t ZSTD_NCountCost(unsigned const* count, unsigned const max, - size_t const nbSeq, unsigned const FSELog) -{ - BYTE wksp[FSE_NCOUNTBOUND]; - S16 norm[MaxSeq + 1]; - const U32 tableLog = FSE_optimalTableLog(FSELog, nbSeq, max); - FORWARD_IF_ERROR(FSE_normalizeCount(norm, tableLog, count, nbSeq, max)); - return FSE_writeNCount(wksp, sizeof(wksp), norm, max, tableLog); -} - - -typedef enum { - ZSTD_defaultDisallowed = 0, - ZSTD_defaultAllowed = 1 -} ZSTD_defaultPolicy_e; - -MEM_STATIC symbolEncodingType_e -ZSTD_selectEncodingType( - FSE_repeat* repeatMode, unsigned const* count, unsigned const max, - size_t const mostFrequent, size_t nbSeq, unsigned const FSELog, - FSE_CTable const* prevCTable, - short const* defaultNorm, U32 defaultNormLog, - ZSTD_defaultPolicy_e const isDefaultAllowed, - ZSTD_strategy const strategy) -{ - ZSTD_STATIC_ASSERT(ZSTD_defaultDisallowed == 0 && ZSTD_defaultAllowed != 0); - if (mostFrequent == nbSeq) { - *repeatMode = FSE_repeat_none; - if (isDefaultAllowed && nbSeq <= 2) { - /* Prefer set_basic over set_rle when there are 2 or less symbols, - * since RLE uses 1 byte, but set_basic uses 5-6 bits per symbol. - * If basic encoding isn't possible, always choose RLE. - */ - DEBUGLOG(5, "Selected set_basic"); - return set_basic; - } - DEBUGLOG(5, "Selected set_rle"); - return set_rle; - } - if (strategy < ZSTD_lazy) { - if (isDefaultAllowed) { - size_t const staticFse_nbSeq_max = 1000; - size_t const mult = 10 - strategy; - size_t const baseLog = 3; - size_t const dynamicFse_nbSeq_min = (((size_t)1 << defaultNormLog) * mult) >> baseLog; /* 28-36 for offset, 56-72 for lengths */ - assert(defaultNormLog >= 5 && defaultNormLog <= 6); /* xx_DEFAULTNORMLOG */ - assert(mult <= 9 && mult >= 7); - if ( (*repeatMode == FSE_repeat_valid) - && (nbSeq < staticFse_nbSeq_max) ) { - DEBUGLOG(5, "Selected set_repeat"); - return set_repeat; - } - if ( (nbSeq < dynamicFse_nbSeq_min) - || (mostFrequent < (nbSeq >> (defaultNormLog-1))) ) { - DEBUGLOG(5, "Selected set_basic"); - /* The format allows default tables to be repeated, but it isn't useful. - * When using simple heuristics to select encoding type, we don't want - * to confuse these tables with dictionaries. When running more careful - * analysis, we don't need to waste time checking both repeating tables - * and default tables. - */ - *repeatMode = FSE_repeat_none; - return set_basic; - } - } - } else { - size_t const basicCost = isDefaultAllowed ? ZSTD_crossEntropyCost(defaultNorm, defaultNormLog, count, max) : ERROR(GENERIC); - size_t const repeatCost = *repeatMode != FSE_repeat_none ? ZSTD_fseBitCost(prevCTable, count, max) : ERROR(GENERIC); - size_t const NCountCost = ZSTD_NCountCost(count, max, nbSeq, FSELog); - size_t const compressedCost = (NCountCost << 3) + ZSTD_entropyCost(count, max, nbSeq); - - if (isDefaultAllowed) { - assert(!ZSTD_isError(basicCost)); - assert(!(*repeatMode == FSE_repeat_valid && ZSTD_isError(repeatCost))); - } - assert(!ZSTD_isError(NCountCost)); - assert(compressedCost < ERROR(maxCode)); - DEBUGLOG(5, "Estimated bit costs: basic=%u\trepeat=%u\tcompressed=%u", - (unsigned)basicCost, (unsigned)repeatCost, (unsigned)compressedCost); - if (basicCost <= repeatCost && basicCost <= compressedCost) { - DEBUGLOG(5, "Selected set_basic"); - assert(isDefaultAllowed); - *repeatMode = FSE_repeat_none; - return set_basic; - } - if (repeatCost <= compressedCost) { - DEBUGLOG(5, "Selected set_repeat"); - assert(!ZSTD_isError(repeatCost)); - return set_repeat; - } - assert(compressedCost < basicCost && compressedCost < repeatCost); - } - DEBUGLOG(5, "Selected set_compressed"); - *repeatMode = FSE_repeat_check; - return set_compressed; -} - -MEM_STATIC size_t -ZSTD_buildCTable(void* dst, size_t dstCapacity, - FSE_CTable* nextCTable, U32 FSELog, symbolEncodingType_e type, - unsigned* count, U32 max, - const BYTE* codeTable, size_t nbSeq, - const S16* defaultNorm, U32 defaultNormLog, U32 defaultMax, - const FSE_CTable* prevCTable, size_t prevCTableSize, - void* workspace, size_t workspaceSize) -{ - BYTE* op = (BYTE*)dst; - const BYTE* const oend = op + dstCapacity; - DEBUGLOG(6, "ZSTD_buildCTable (dstCapacity=%u)", (unsigned)dstCapacity); - - switch (type) { - case set_rle: - FORWARD_IF_ERROR(FSE_buildCTable_rle(nextCTable, (BYTE)max)); - RETURN_ERROR_IF(dstCapacity==0, dstSize_tooSmall); - *op = codeTable[0]; - return 1; - case set_repeat: - memcpy(nextCTable, prevCTable, prevCTableSize); - return 0; - case set_basic: - FORWARD_IF_ERROR(FSE_buildCTable_wksp(nextCTable, defaultNorm, defaultMax, defaultNormLog, workspace, workspaceSize)); /* note : could be pre-calculated */ - return 0; - case set_compressed: { - S16 norm[MaxSeq + 1]; - size_t nbSeq_1 = nbSeq; - const U32 tableLog = FSE_optimalTableLog(FSELog, nbSeq, max); - if (count[codeTable[nbSeq-1]] > 1) { - count[codeTable[nbSeq-1]]--; - nbSeq_1--; - } - assert(nbSeq_1 > 1); - FORWARD_IF_ERROR(FSE_normalizeCount(norm, tableLog, count, nbSeq_1, max)); - { size_t const NCountSize = FSE_writeNCount(op, oend - op, norm, max, tableLog); /* overflow protected */ - FORWARD_IF_ERROR(NCountSize); - FORWARD_IF_ERROR(FSE_buildCTable_wksp(nextCTable, norm, max, tableLog, workspace, workspaceSize)); - return NCountSize; - } - } - default: assert(0); RETURN_ERROR(GENERIC); - } -} - -FORCE_INLINE_TEMPLATE size_t -ZSTD_encodeSequences_body( - void* dst, size_t dstCapacity, - FSE_CTable const* CTable_MatchLength, BYTE const* mlCodeTable, - FSE_CTable const* CTable_OffsetBits, BYTE const* ofCodeTable, - FSE_CTable const* CTable_LitLength, BYTE const* llCodeTable, - seqDef const* sequences, size_t nbSeq, int longOffsets) -{ - BIT_CStream_t blockStream; - FSE_CState_t stateMatchLength; - FSE_CState_t stateOffsetBits; - FSE_CState_t stateLitLength; - - RETURN_ERROR_IF( - ERR_isError(BIT_initCStream(&blockStream, dst, dstCapacity)), - dstSize_tooSmall, "not enough space remaining"); - DEBUGLOG(6, "available space for bitstream : %i (dstCapacity=%u)", - (int)(blockStream.endPtr - blockStream.startPtr), - (unsigned)dstCapacity); - - /* first symbols */ - FSE_initCState2(&stateMatchLength, CTable_MatchLength, mlCodeTable[nbSeq-1]); - FSE_initCState2(&stateOffsetBits, CTable_OffsetBits, ofCodeTable[nbSeq-1]); - FSE_initCState2(&stateLitLength, CTable_LitLength, llCodeTable[nbSeq-1]); - BIT_addBits(&blockStream, sequences[nbSeq-1].litLength, LL_bits[llCodeTable[nbSeq-1]]); - if (MEM_32bits()) BIT_flushBits(&blockStream); - BIT_addBits(&blockStream, sequences[nbSeq-1].matchLength, ML_bits[mlCodeTable[nbSeq-1]]); - if (MEM_32bits()) BIT_flushBits(&blockStream); - if (longOffsets) { - U32 const ofBits = ofCodeTable[nbSeq-1]; - int const extraBits = ofBits - MIN(ofBits, STREAM_ACCUMULATOR_MIN-1); - if (extraBits) { - BIT_addBits(&blockStream, sequences[nbSeq-1].offset, extraBits); - BIT_flushBits(&blockStream); - } - BIT_addBits(&blockStream, sequences[nbSeq-1].offset >> extraBits, - ofBits - extraBits); - } else { - BIT_addBits(&blockStream, sequences[nbSeq-1].offset, ofCodeTable[nbSeq-1]); - } - BIT_flushBits(&blockStream); - - { size_t n; - for (n=nbSeq-2 ; n<nbSeq ; n--) { /* intentional underflow */ - BYTE const llCode = llCodeTable[n]; - BYTE const ofCode = ofCodeTable[n]; - BYTE const mlCode = mlCodeTable[n]; - U32 const llBits = LL_bits[llCode]; - U32 const ofBits = ofCode; - U32 const mlBits = ML_bits[mlCode]; - DEBUGLOG(6, "encoding: litlen:%2u - matchlen:%2u - offCode:%7u", - (unsigned)sequences[n].litLength, - (unsigned)sequences[n].matchLength + MINMATCH, - (unsigned)sequences[n].offset); - /* 32b*/ /* 64b*/ - /* (7)*/ /* (7)*/ - FSE_encodeSymbol(&blockStream, &stateOffsetBits, ofCode); /* 15 */ /* 15 */ - FSE_encodeSymbol(&blockStream, &stateMatchLength, mlCode); /* 24 */ /* 24 */ - if (MEM_32bits()) BIT_flushBits(&blockStream); /* (7)*/ - FSE_encodeSymbol(&blockStream, &stateLitLength, llCode); /* 16 */ /* 33 */ - if (MEM_32bits() || (ofBits+mlBits+llBits >= 64-7-(LLFSELog+MLFSELog+OffFSELog))) - BIT_flushBits(&blockStream); /* (7)*/ - BIT_addBits(&blockStream, sequences[n].litLength, llBits); - if (MEM_32bits() && ((llBits+mlBits)>24)) BIT_flushBits(&blockStream); - BIT_addBits(&blockStream, sequences[n].matchLength, mlBits); - if (MEM_32bits() || (ofBits+mlBits+llBits > 56)) BIT_flushBits(&blockStream); - if (longOffsets) { - int const extraBits = ofBits - MIN(ofBits, STREAM_ACCUMULATOR_MIN-1); - if (extraBits) { - BIT_addBits(&blockStream, sequences[n].offset, extraBits); - BIT_flushBits(&blockStream); /* (7)*/ - } - BIT_addBits(&blockStream, sequences[n].offset >> extraBits, - ofBits - extraBits); /* 31 */ - } else { - BIT_addBits(&blockStream, sequences[n].offset, ofBits); /* 31 */ - } - BIT_flushBits(&blockStream); /* (7)*/ - DEBUGLOG(7, "remaining space : %i", (int)(blockStream.endPtr - blockStream.ptr)); - } } - - DEBUGLOG(6, "ZSTD_encodeSequences: flushing ML state with %u bits", stateMatchLength.stateLog); - FSE_flushCState(&blockStream, &stateMatchLength); - DEBUGLOG(6, "ZSTD_encodeSequences: flushing Off state with %u bits", stateOffsetBits.stateLog); - FSE_flushCState(&blockStream, &stateOffsetBits); - DEBUGLOG(6, "ZSTD_encodeSequences: flushing LL state with %u bits", stateLitLength.stateLog); - FSE_flushCState(&blockStream, &stateLitLength); - - { size_t const streamSize = BIT_closeCStream(&blockStream); - RETURN_ERROR_IF(streamSize==0, dstSize_tooSmall, "not enough space"); - return streamSize; - } -} - -static size_t -ZSTD_encodeSequences_default( - void* dst, size_t dstCapacity, - FSE_CTable const* CTable_MatchLength, BYTE const* mlCodeTable, - FSE_CTable const* CTable_OffsetBits, BYTE const* ofCodeTable, - FSE_CTable const* CTable_LitLength, BYTE const* llCodeTable, - seqDef const* sequences, size_t nbSeq, int longOffsets) -{ - return ZSTD_encodeSequences_body(dst, dstCapacity, - CTable_MatchLength, mlCodeTable, - CTable_OffsetBits, ofCodeTable, - CTable_LitLength, llCodeTable, - sequences, nbSeq, longOffsets); -} - - -#if DYNAMIC_BMI2 - -static TARGET_ATTRIBUTE("bmi2") size_t -ZSTD_encodeSequences_bmi2( - void* dst, size_t dstCapacity, - FSE_CTable const* CTable_MatchLength, BYTE const* mlCodeTable, - FSE_CTable const* CTable_OffsetBits, BYTE const* ofCodeTable, - FSE_CTable const* CTable_LitLength, BYTE const* llCodeTable, - seqDef const* sequences, size_t nbSeq, int longOffsets) -{ - return ZSTD_encodeSequences_body(dst, dstCapacity, - CTable_MatchLength, mlCodeTable, - CTable_OffsetBits, ofCodeTable, - CTable_LitLength, llCodeTable, - sequences, nbSeq, longOffsets); -} - -#endif - -static size_t ZSTD_encodeSequences( - void* dst, size_t dstCapacity, - FSE_CTable const* CTable_MatchLength, BYTE const* mlCodeTable, - FSE_CTable const* CTable_OffsetBits, BYTE const* ofCodeTable, - FSE_CTable const* CTable_LitLength, BYTE const* llCodeTable, - seqDef const* sequences, size_t nbSeq, int longOffsets, int bmi2) -{ - DEBUGLOG(5, "ZSTD_encodeSequences: dstCapacity = %u", (unsigned)dstCapacity); -#if DYNAMIC_BMI2 - if (bmi2) { - return ZSTD_encodeSequences_bmi2(dst, dstCapacity, - CTable_MatchLength, mlCodeTable, - CTable_OffsetBits, ofCodeTable, - CTable_LitLength, llCodeTable, - sequences, nbSeq, longOffsets); - } -#endif - (void)bmi2; - return ZSTD_encodeSequences_default(dst, dstCapacity, - CTable_MatchLength, mlCodeTable, - CTable_OffsetBits, ofCodeTable, - CTable_LitLength, llCodeTable, - sequences, nbSeq, longOffsets); -} - static int ZSTD_disableLiteralsCompression(const ZSTD_CCtx_params* cctxParams) { switch (cctxParams->literalCompressionMode) { @@ -2509,7 +1944,7 @@ ZSTD_compressSequences_internal(seqStore_t* seqStorePtr, ZSTD_entropyCTables_t* nextEntropy, const ZSTD_CCtx_params* cctxParams, void* dst, size_t dstCapacity, - void* workspace, size_t wkspSize, + void* entropyWorkspace, size_t entropyWkspSize, const int bmi2) { const int longOffsets = cctxParams->cParams.windowLog > STREAM_ACCUMULATOR_MIN; @@ -2526,23 +1961,23 @@ ZSTD_compressSequences_internal(seqStore_t* seqStorePtr, BYTE* const ostart = (BYTE*)dst; BYTE* const oend = ostart + dstCapacity; BYTE* op = ostart; - size_t const nbSeq = seqStorePtr->sequences - seqStorePtr->sequencesStart; + size_t const nbSeq = (size_t)(seqStorePtr->sequences - seqStorePtr->sequencesStart); BYTE* seqHead; BYTE* lastNCount = NULL; + DEBUGLOG(5, "ZSTD_compressSequences_internal (nbSeq=%zu)", nbSeq); ZSTD_STATIC_ASSERT(HUF_WORKSPACE_SIZE >= (1<<MAX(MLFSELog,LLFSELog))); - DEBUGLOG(5, "ZSTD_compressSequences_internal"); /* Compress literals */ { const BYTE* const literals = seqStorePtr->litStart; - size_t const litSize = seqStorePtr->lit - literals; + size_t const litSize = (size_t)(seqStorePtr->lit - literals); size_t const cSize = ZSTD_compressLiterals( &prevEntropy->huf, &nextEntropy->huf, cctxParams->cParams.strategy, ZSTD_disableLiteralsCompression(cctxParams), op, dstCapacity, literals, litSize, - workspace, wkspSize, + entropyWorkspace, entropyWkspSize, bmi2); FORWARD_IF_ERROR(cSize); assert(cSize <= dstCapacity); @@ -2552,17 +1987,22 @@ ZSTD_compressSequences_internal(seqStore_t* seqStorePtr, /* Sequences Header */ RETURN_ERROR_IF((oend-op) < 3 /*max nbSeq Size*/ + 1 /*seqHead*/, dstSize_tooSmall); - if (nbSeq < 0x7F) + if (nbSeq < 128) { *op++ = (BYTE)nbSeq; - else if (nbSeq < LONGNBSEQ) - op[0] = (BYTE)((nbSeq>>8) + 0x80), op[1] = (BYTE)nbSeq, op+=2; - else - op[0]=0xFF, MEM_writeLE16(op+1, (U16)(nbSeq - LONGNBSEQ)), op+=3; + } else if (nbSeq < LONGNBSEQ) { + op[0] = (BYTE)((nbSeq>>8) + 0x80); + op[1] = (BYTE)nbSeq; + op+=2; + } else { + op[0]=0xFF; + MEM_writeLE16(op+1, (U16)(nbSeq - LONGNBSEQ)); + op+=3; + } assert(op <= oend); if (nbSeq==0) { /* Copy the old tables over as if we repeated them */ memcpy(&nextEntropy->fse, &prevEntropy->fse, sizeof(prevEntropy->fse)); - return op - ostart; + return (size_t)(op - ostart); } /* seqHead : flags for FSE encoding type */ @@ -2573,7 +2013,7 @@ ZSTD_compressSequences_internal(seqStore_t* seqStorePtr, ZSTD_seqToCodes(seqStorePtr); /* build CTable for Literal Lengths */ { unsigned max = MaxLL; - size_t const mostFrequent = HIST_countFast_wksp(count, &max, llCodeTable, nbSeq, workspace, wkspSize); /* can't fail */ + size_t const mostFrequent = HIST_countFast_wksp(count, &max, llCodeTable, nbSeq, entropyWorkspace, entropyWkspSize); /* can't fail */ DEBUGLOG(5, "Building LL table"); nextEntropy->fse.litlength_repeatMode = prevEntropy->fse.litlength_repeatMode; LLtype = ZSTD_selectEncodingType(&nextEntropy->fse.litlength_repeatMode, @@ -2583,10 +2023,14 @@ ZSTD_compressSequences_internal(seqStore_t* seqStorePtr, ZSTD_defaultAllowed, strategy); assert(set_basic < set_compressed && set_rle < set_compressed); assert(!(LLtype < set_compressed && nextEntropy->fse.litlength_repeatMode != FSE_repeat_none)); /* We don't copy tables */ - { size_t const countSize = ZSTD_buildCTable(op, oend - op, CTable_LitLength, LLFSELog, (symbolEncodingType_e)LLtype, - count, max, llCodeTable, nbSeq, LL_defaultNorm, LL_defaultNormLog, MaxLL, - prevEntropy->fse.litlengthCTable, sizeof(prevEntropy->fse.litlengthCTable), - workspace, wkspSize); + { size_t const countSize = ZSTD_buildCTable( + op, (size_t)(oend - op), + CTable_LitLength, LLFSELog, (symbolEncodingType_e)LLtype, + count, max, llCodeTable, nbSeq, + LL_defaultNorm, LL_defaultNormLog, MaxLL, + prevEntropy->fse.litlengthCTable, + sizeof(prevEntropy->fse.litlengthCTable), + entropyWorkspace, entropyWkspSize); FORWARD_IF_ERROR(countSize); if (LLtype == set_compressed) lastNCount = op; @@ -2595,7 +2039,8 @@ ZSTD_compressSequences_internal(seqStore_t* seqStorePtr, } } /* build CTable for Offsets */ { unsigned max = MaxOff; - size_t const mostFrequent = HIST_countFast_wksp(count, &max, ofCodeTable, nbSeq, workspace, wkspSize); /* can't fail */ + size_t const mostFrequent = HIST_countFast_wksp( + count, &max, ofCodeTable, nbSeq, entropyWorkspace, entropyWkspSize); /* can't fail */ /* We can only use the basic table if max <= DefaultMaxOff, otherwise the offsets are too large */ ZSTD_defaultPolicy_e const defaultPolicy = (max <= DefaultMaxOff) ? ZSTD_defaultAllowed : ZSTD_defaultDisallowed; DEBUGLOG(5, "Building OF table"); @@ -2606,10 +2051,14 @@ ZSTD_compressSequences_internal(seqStore_t* seqStorePtr, OF_defaultNorm, OF_defaultNormLog, defaultPolicy, strategy); assert(!(Offtype < set_compressed && nextEntropy->fse.offcode_repeatMode != FSE_repeat_none)); /* We don't copy tables */ - { size_t const countSize = ZSTD_buildCTable(op, oend - op, CTable_OffsetBits, OffFSELog, (symbolEncodingType_e)Offtype, - count, max, ofCodeTable, nbSeq, OF_defaultNorm, OF_defaultNormLog, DefaultMaxOff, - prevEntropy->fse.offcodeCTable, sizeof(prevEntropy->fse.offcodeCTable), - workspace, wkspSize); + { size_t const countSize = ZSTD_buildCTable( + op, (size_t)(oend - op), + CTable_OffsetBits, OffFSELog, (symbolEncodingType_e)Offtype, + count, max, ofCodeTable, nbSeq, + OF_defaultNorm, OF_defaultNormLog, DefaultMaxOff, + prevEntropy->fse.offcodeCTable, + sizeof(prevEntropy->fse.offcodeCTable), + entropyWorkspace, entropyWkspSize); FORWARD_IF_ERROR(countSize); if (Offtype == set_compressed) lastNCount = op; @@ -2618,7 +2067,8 @@ ZSTD_compressSequences_internal(seqStore_t* seqStorePtr, } } /* build CTable for MatchLengths */ { unsigned max = MaxML; - size_t const mostFrequent = HIST_countFast_wksp(count, &max, mlCodeTable, nbSeq, workspace, wkspSize); /* can't fail */ + size_t const mostFrequent = HIST_countFast_wksp( + count, &max, mlCodeTable, nbSeq, entropyWorkspace, entropyWkspSize); /* can't fail */ DEBUGLOG(5, "Building ML table (remaining space : %i)", (int)(oend-op)); nextEntropy->fse.matchlength_repeatMode = prevEntropy->fse.matchlength_repeatMode; MLtype = ZSTD_selectEncodingType(&nextEntropy->fse.matchlength_repeatMode, @@ -2627,10 +2077,14 @@ ZSTD_compressSequences_internal(seqStore_t* seqStorePtr, ML_defaultNorm, ML_defaultNormLog, ZSTD_defaultAllowed, strategy); assert(!(MLtype < set_compressed && nextEntropy->fse.matchlength_repeatMode != FSE_repeat_none)); /* We don't copy tables */ - { size_t const countSize = ZSTD_buildCTable(op, oend - op, CTable_MatchLength, MLFSELog, (symbolEncodingType_e)MLtype, - count, max, mlCodeTable, nbSeq, ML_defaultNorm, ML_defaultNormLog, MaxML, - prevEntropy->fse.matchlengthCTable, sizeof(prevEntropy->fse.matchlengthCTable), - workspace, wkspSize); + { size_t const countSize = ZSTD_buildCTable( + op, (size_t)(oend - op), + CTable_MatchLength, MLFSELog, (symbolEncodingType_e)MLtype, + count, max, mlCodeTable, nbSeq, + ML_defaultNorm, ML_defaultNormLog, MaxML, + prevEntropy->fse.matchlengthCTable, + sizeof(prevEntropy->fse.matchlengthCTable), + entropyWorkspace, entropyWkspSize); FORWARD_IF_ERROR(countSize); if (MLtype == set_compressed) lastNCount = op; @@ -2641,7 +2095,7 @@ ZSTD_compressSequences_internal(seqStore_t* seqStorePtr, *seqHead = (BYTE)((LLtype<<6) + (Offtype<<4) + (MLtype<<2)); { size_t const bitstreamSize = ZSTD_encodeSequences( - op, oend - op, + op, (size_t)(oend - op), CTable_MatchLength, mlCodeTable, CTable_OffsetBits, ofCodeTable, CTable_LitLength, llCodeTable, @@ -2668,7 +2122,7 @@ ZSTD_compressSequences_internal(seqStore_t* seqStorePtr, } DEBUGLOG(5, "compressed block size : %u", (unsigned)(op - ostart)); - return op - ostart; + return (size_t)(op - ostart); } MEM_STATIC size_t @@ -2678,13 +2132,13 @@ ZSTD_compressSequences(seqStore_t* seqStorePtr, const ZSTD_CCtx_params* cctxParams, void* dst, size_t dstCapacity, size_t srcSize, - void* workspace, size_t wkspSize, + void* entropyWorkspace, size_t entropyWkspSize, int bmi2) { size_t const cSize = ZSTD_compressSequences_internal( seqStorePtr, prevEntropy, nextEntropy, cctxParams, dst, dstCapacity, - workspace, wkspSize, bmi2); + entropyWorkspace, entropyWkspSize, bmi2); if (cSize == 0) return 0; /* When srcSize <= dstCapacity, there is enough space to write a raw uncompressed block. * Since we ran out of space, block must be not compressible, so fall back to raw uncompressed block. @@ -2835,19 +2289,113 @@ static size_t ZSTD_buildSeqStore(ZSTD_CCtx* zc, const void* src, size_t srcSize) return ZSTDbss_compress; } +static void ZSTD_copyBlockSequences(ZSTD_CCtx* zc) +{ + const seqStore_t* seqStore = ZSTD_getSeqStore(zc); + const seqDef* seqs = seqStore->sequencesStart; + size_t seqsSize = seqStore->sequences - seqs; + + ZSTD_Sequence* outSeqs = &zc->seqCollector.seqStart[zc->seqCollector.seqIndex]; + size_t i; size_t position; int repIdx; + + assert(zc->seqCollector.seqIndex + 1 < zc->seqCollector.maxSequences); + for (i = 0, position = 0; i < seqsSize; ++i) { + outSeqs[i].offset = seqs[i].offset; + outSeqs[i].litLength = seqs[i].litLength; + outSeqs[i].matchLength = seqs[i].matchLength + MINMATCH; + + if (i == seqStore->longLengthPos) { + if (seqStore->longLengthID == 1) { + outSeqs[i].litLength += 0x10000; + } else if (seqStore->longLengthID == 2) { + outSeqs[i].matchLength += 0x10000; + } + } + + if (outSeqs[i].offset <= ZSTD_REP_NUM) { + outSeqs[i].rep = outSeqs[i].offset; + repIdx = (unsigned int)i - outSeqs[i].offset; + + if (outSeqs[i].litLength == 0) { + if (outSeqs[i].offset < 3) { + --repIdx; + } else { + repIdx = (unsigned int)i - 1; + } + ++outSeqs[i].rep; + } + assert(repIdx >= -3); + outSeqs[i].offset = repIdx >= 0 ? outSeqs[repIdx].offset : repStartValue[-repIdx - 1]; + if (outSeqs[i].rep == 4) { + --outSeqs[i].offset; + } + } else { + outSeqs[i].offset -= ZSTD_REP_NUM; + } + + position += outSeqs[i].litLength; + outSeqs[i].matchPos = (unsigned int)position; + position += outSeqs[i].matchLength; + } + zc->seqCollector.seqIndex += seqsSize; +} + +size_t ZSTD_getSequences(ZSTD_CCtx* zc, ZSTD_Sequence* outSeqs, + size_t outSeqsSize, const void* src, size_t srcSize) +{ + const size_t dstCapacity = ZSTD_compressBound(srcSize); + void* dst = ZSTD_malloc(dstCapacity, ZSTD_defaultCMem); + SeqCollector seqCollector; + + RETURN_ERROR_IF(dst == NULL, memory_allocation); + + seqCollector.collectSequences = 1; + seqCollector.seqStart = outSeqs; + seqCollector.seqIndex = 0; + seqCollector.maxSequences = outSeqsSize; + zc->seqCollector = seqCollector; + + ZSTD_compress2(zc, dst, dstCapacity, src, srcSize); + ZSTD_free(dst, ZSTD_defaultCMem); + return zc->seqCollector.seqIndex; +} + +/* Returns true if the given block is a RLE block */ +static int ZSTD_isRLE(const BYTE *ip, size_t length) { + size_t i; + if (length < 2) return 1; + for (i = 1; i < length; ++i) { + if (ip[0] != ip[i]) return 0; + } + return 1; +} + static size_t ZSTD_compressBlock_internal(ZSTD_CCtx* zc, void* dst, size_t dstCapacity, - const void* src, size_t srcSize) + const void* src, size_t srcSize, U32 frame) { + /* This the upper bound for the length of an rle block. + * This isn't the actual upper bound. Finding the real threshold + * needs further investigation. + */ + const U32 rleMaxLength = 25; size_t cSize; + const BYTE* ip = (const BYTE*)src; + BYTE* op = (BYTE*)dst; DEBUGLOG(5, "ZSTD_compressBlock_internal (dstCapacity=%u, dictLimit=%u, nextToUpdate=%u)", - (unsigned)dstCapacity, (unsigned)zc->blockState.matchState.window.dictLimit, (unsigned)zc->blockState.matchState.nextToUpdate); + (unsigned)dstCapacity, (unsigned)zc->blockState.matchState.window.dictLimit, + (unsigned)zc->blockState.matchState.nextToUpdate); { const size_t bss = ZSTD_buildSeqStore(zc, src, srcSize); FORWARD_IF_ERROR(bss); if (bss == ZSTDbss_noCompress) { cSize = 0; goto out; } } + if (zc->seqCollector.collectSequences) { + ZSTD_copyBlockSequences(zc); + return 0; + } + /* encode sequences and literals */ cSize = ZSTD_compressSequences(&zc->seqStore, &zc->blockState.prevCBlock->entropy, &zc->blockState.nextCBlock->entropy, @@ -2857,8 +2405,21 @@ static size_t ZSTD_compressBlock_internal(ZSTD_CCtx* zc, zc->entropyWorkspace, HUF_WORKSPACE_SIZE /* statically allocated in resetCCtx */, zc->bmi2); + if (frame && + /* We don't want to emit our first block as a RLE even if it qualifies because + * doing so will cause the decoder (cli only) to throw a "should consume all input error." + * This is only an issue for zstd <= v1.4.3 + */ + !zc->isFirstBlock && + cSize < rleMaxLength && + ZSTD_isRLE(ip, srcSize)) + { + cSize = 1; + op[0] = ip[0]; + } + out: - if (!ZSTD_isError(cSize) && cSize != 0) { + if (!ZSTD_isError(cSize) && cSize > 1) { /* confirm repcodes and entropy tables when emitting a compressed block */ ZSTD_compressedBlockState_t* const tmp = zc->blockState.prevCBlock; zc->blockState.prevCBlock = zc->blockState.nextCBlock; @@ -2875,7 +2436,11 @@ out: } -static void ZSTD_overflowCorrectIfNeeded(ZSTD_matchState_t* ms, ZSTD_CCtx_params const* params, void const* ip, void const* iend) +static void ZSTD_overflowCorrectIfNeeded(ZSTD_matchState_t* ms, + ZSTD_cwksp* ws, + ZSTD_CCtx_params const* params, + void const* ip, + void const* iend) { if (ZSTD_window_needOverflowCorrection(ms->window, iend)) { U32 const maxDist = (U32)1 << params->cParams.windowLog; @@ -2884,7 +2449,9 @@ static void ZSTD_overflowCorrectIfNeeded(ZSTD_matchState_t* ms, ZSTD_CCtx_params ZSTD_STATIC_ASSERT(ZSTD_CHAINLOG_MAX <= 30); ZSTD_STATIC_ASSERT(ZSTD_WINDOWLOG_MAX_32 <= 30); ZSTD_STATIC_ASSERT(ZSTD_WINDOWLOG_MAX <= 31); + ZSTD_cwksp_mark_tables_dirty(ws); ZSTD_reduceIndex(ms, params, correction); + ZSTD_cwksp_mark_tables_clean(ws); if (ms->nextToUpdate < correction) ms->nextToUpdate = 0; else ms->nextToUpdate -= correction; /* invalidate dictionaries on overflow correction */ @@ -2893,7 +2460,6 @@ static void ZSTD_overflowCorrectIfNeeded(ZSTD_matchState_t* ms, ZSTD_CCtx_params } } - /*! ZSTD_compress_frameChunk() : * Compress a chunk of data into one or multiple blocks. * All blocks will be terminated, all input will be consumed. @@ -2927,7 +2493,8 @@ static size_t ZSTD_compress_frameChunk (ZSTD_CCtx* cctx, "not enough space to store compressed block"); if (remaining < blockSize) blockSize = remaining; - ZSTD_overflowCorrectIfNeeded(ms, &cctx->appliedParams, ip, ip + blockSize); + ZSTD_overflowCorrectIfNeeded( + ms, &cctx->workspace, &cctx->appliedParams, ip, ip + blockSize); ZSTD_checkDictValidity(&ms->window, ip + blockSize, maxDist, &ms->loadedDictEnd, &ms->dictMatchState); /* Ensure hash/chain table insertion resumes no sooner than lowlimit */ @@ -2935,15 +2502,16 @@ static size_t ZSTD_compress_frameChunk (ZSTD_CCtx* cctx, { size_t cSize = ZSTD_compressBlock_internal(cctx, op+ZSTD_blockHeaderSize, dstCapacity-ZSTD_blockHeaderSize, - ip, blockSize); + ip, blockSize, 1 /* frame */); FORWARD_IF_ERROR(cSize); - if (cSize == 0) { /* block is not compressible */ cSize = ZSTD_noCompressBlock(op, dstCapacity, ip, blockSize, lastBlock); FORWARD_IF_ERROR(cSize); } else { - U32 const cBlockHeader24 = lastBlock + (((U32)bt_compressed)<<1) + (U32)(cSize << 3); - MEM_writeLE24(op, cBlockHeader24); + const U32 cBlockHeader = cSize == 1 ? + lastBlock + (((U32)bt_rle)<<1) + (U32)(blockSize << 3) : + lastBlock + (((U32)bt_compressed)<<1) + (U32)(cSize << 3); + MEM_writeLE24(op, cBlockHeader); cSize += ZSTD_blockHeaderSize; } @@ -2953,6 +2521,7 @@ static size_t ZSTD_compress_frameChunk (ZSTD_CCtx* cctx, op += cSize; assert(dstCapacity >= cSize); dstCapacity -= cSize; + cctx->isFirstBlock = 0; DEBUGLOG(5, "ZSTD_compress_frameChunk: adding a block of size %u", (unsigned)cSize); } } @@ -2963,25 +2532,25 @@ static size_t ZSTD_compress_frameChunk (ZSTD_CCtx* cctx, static size_t ZSTD_writeFrameHeader(void* dst, size_t dstCapacity, - ZSTD_CCtx_params params, U64 pledgedSrcSize, U32 dictID) + const ZSTD_CCtx_params* params, U64 pledgedSrcSize, U32 dictID) { BYTE* const op = (BYTE*)dst; U32 const dictIDSizeCodeLength = (dictID>0) + (dictID>=256) + (dictID>=65536); /* 0-3 */ - U32 const dictIDSizeCode = params.fParams.noDictIDFlag ? 0 : dictIDSizeCodeLength; /* 0-3 */ - U32 const checksumFlag = params.fParams.checksumFlag>0; - U32 const windowSize = (U32)1 << params.cParams.windowLog; - U32 const singleSegment = params.fParams.contentSizeFlag && (windowSize >= pledgedSrcSize); - BYTE const windowLogByte = (BYTE)((params.cParams.windowLog - ZSTD_WINDOWLOG_ABSOLUTEMIN) << 3); - U32 const fcsCode = params.fParams.contentSizeFlag ? + U32 const dictIDSizeCode = params->fParams.noDictIDFlag ? 0 : dictIDSizeCodeLength; /* 0-3 */ + U32 const checksumFlag = params->fParams.checksumFlag>0; + U32 const windowSize = (U32)1 << params->cParams.windowLog; + U32 const singleSegment = params->fParams.contentSizeFlag && (windowSize >= pledgedSrcSize); + BYTE const windowLogByte = (BYTE)((params->cParams.windowLog - ZSTD_WINDOWLOG_ABSOLUTEMIN) << 3); + U32 const fcsCode = params->fParams.contentSizeFlag ? (pledgedSrcSize>=256) + (pledgedSrcSize>=65536+256) + (pledgedSrcSize>=0xFFFFFFFFU) : 0; /* 0-3 */ BYTE const frameHeaderDescriptionByte = (BYTE)(dictIDSizeCode + (checksumFlag<<2) + (singleSegment<<5) + (fcsCode<<6) ); size_t pos=0; - assert(!(params.fParams.contentSizeFlag && pledgedSrcSize == ZSTD_CONTENTSIZE_UNKNOWN)); + assert(!(params->fParams.contentSizeFlag && pledgedSrcSize == ZSTD_CONTENTSIZE_UNKNOWN)); RETURN_ERROR_IF(dstCapacity < ZSTD_FRAMEHEADERSIZE_MAX, dstSize_tooSmall); DEBUGLOG(4, "ZSTD_writeFrameHeader : dictIDFlag : %u ; dictID : %u ; dictIDSizeCode : %u", - !params.fParams.noDictIDFlag, (unsigned)dictID, (unsigned)dictIDSizeCode); + !params->fParams.noDictIDFlag, (unsigned)dictID, (unsigned)dictIDSizeCode); - if (params.format == ZSTD_f_zstd1) { + if (params->format == ZSTD_f_zstd1) { MEM_writeLE32(dst, ZSTD_MAGICNUMBER); pos = 4; } @@ -3047,7 +2616,7 @@ static size_t ZSTD_compressContinue_internal (ZSTD_CCtx* cctx, "missing init (ZSTD_compressBegin)"); if (frame && (cctx->stage==ZSTDcs_init)) { - fhSize = ZSTD_writeFrameHeader(dst, dstCapacity, cctx->appliedParams, + fhSize = ZSTD_writeFrameHeader(dst, dstCapacity, &cctx->appliedParams, cctx->pledgedSrcSizePlusOne-1, cctx->dictID); FORWARD_IF_ERROR(fhSize); assert(fhSize <= dstCapacity); @@ -3067,13 +2636,15 @@ static size_t ZSTD_compressContinue_internal (ZSTD_CCtx* cctx, if (!frame) { /* overflow check and correction for block mode */ - ZSTD_overflowCorrectIfNeeded(ms, &cctx->appliedParams, src, (BYTE const*)src + srcSize); + ZSTD_overflowCorrectIfNeeded( + ms, &cctx->workspace, &cctx->appliedParams, + src, (BYTE const*)src + srcSize); } DEBUGLOG(5, "ZSTD_compressContinue_internal (blockSize=%u)", (unsigned)cctx->blockSize); { size_t const cSize = frame ? ZSTD_compress_frameChunk (cctx, dst, dstCapacity, src, srcSize, lastFrameChunk) : - ZSTD_compressBlock_internal (cctx, dst, dstCapacity, src, srcSize); + ZSTD_compressBlock_internal (cctx, dst, dstCapacity, src, srcSize, 0 /* frame */); FORWARD_IF_ERROR(cSize); cctx->consumedSrcSize += srcSize; cctx->producedCSize += (cSize + fhSize); @@ -3109,8 +2680,9 @@ size_t ZSTD_getBlockSize(const ZSTD_CCtx* cctx) size_t ZSTD_compressBlock(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize) { - size_t const blockSizeMax = ZSTD_getBlockSize(cctx); - RETURN_ERROR_IF(srcSize > blockSizeMax, srcSize_wrong); + DEBUGLOG(5, "ZSTD_compressBlock: srcSize = %u", (unsigned)srcSize); + { size_t const blockSizeMax = ZSTD_getBlockSize(cctx); + RETURN_ERROR_IF(srcSize > blockSizeMax, srcSize_wrong); } return ZSTD_compressContinue_internal(cctx, dst, dstCapacity, src, srcSize, 0 /* frame mode */, 0 /* last chunk */); } @@ -3119,6 +2691,7 @@ size_t ZSTD_compressBlock(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const * @return : 0, or an error code */ static size_t ZSTD_loadDictionaryContent(ZSTD_matchState_t* ms, + ZSTD_cwksp* ws, ZSTD_CCtx_params const* params, const void* src, size_t srcSize, ZSTD_dictTableLoadMethod_e dtlm) @@ -3135,11 +2708,11 @@ static size_t ZSTD_loadDictionaryContent(ZSTD_matchState_t* ms, if (srcSize <= HASH_READ_SIZE) return 0; while (iend - ip > HASH_READ_SIZE) { - size_t const remaining = iend - ip; + size_t const remaining = (size_t)(iend - ip); size_t const chunk = MIN(remaining, ZSTD_CHUNKSIZE_MAX); const BYTE* const ichunk = ip + chunk; - ZSTD_overflowCorrectIfNeeded(ms, params, ip, ichunk); + ZSTD_overflowCorrectIfNeeded(ms, ws, params, ip, ichunk); switch(params->cParams.strategy) { @@ -3198,10 +2771,11 @@ static size_t ZSTD_checkDictNCount(short* normalizedCounter, unsigned dictMaxSym /*! ZSTD_loadZstdDictionary() : * @return : dictID, or an error code * assumptions : magic number supposed already checked - * dictSize supposed > 8 + * dictSize supposed >= 8 */ static size_t ZSTD_loadZstdDictionary(ZSTD_compressedBlockState_t* bs, ZSTD_matchState_t* ms, + ZSTD_cwksp* ws, ZSTD_CCtx_params const* params, const void* dict, size_t dictSize, ZSTD_dictTableLoadMethod_e dtlm, @@ -3214,7 +2788,7 @@ static size_t ZSTD_loadZstdDictionary(ZSTD_compressedBlockState_t* bs, size_t dictID; ZSTD_STATIC_ASSERT(HUF_WORKSPACE_SIZE >= (1<<MAX(MLFSELog,LLFSELog))); - assert(dictSize > 8); + assert(dictSize >= 8); assert(MEM_readLE32(dictPtr) == ZSTD_MAGIC_DICTIONARY); dictPtr += 4; /* skip magic number */ @@ -3297,7 +2871,8 @@ static size_t ZSTD_loadZstdDictionary(ZSTD_compressedBlockState_t* bs, bs->entropy.fse.offcode_repeatMode = FSE_repeat_valid; bs->entropy.fse.matchlength_repeatMode = FSE_repeat_valid; bs->entropy.fse.litlength_repeatMode = FSE_repeat_valid; - FORWARD_IF_ERROR(ZSTD_loadDictionaryContent(ms, params, dictPtr, dictContentSize, dtlm)); + FORWARD_IF_ERROR(ZSTD_loadDictionaryContent( + ms, ws, params, dictPtr, dictContentSize, dtlm)); return dictID; } } @@ -3307,6 +2882,7 @@ static size_t ZSTD_loadZstdDictionary(ZSTD_compressedBlockState_t* bs, static size_t ZSTD_compress_insertDictionary(ZSTD_compressedBlockState_t* bs, ZSTD_matchState_t* ms, + ZSTD_cwksp* ws, const ZSTD_CCtx_params* params, const void* dict, size_t dictSize, ZSTD_dictContentType_e dictContentType, @@ -3314,27 +2890,35 @@ ZSTD_compress_insertDictionary(ZSTD_compressedBlockState_t* bs, void* workspace) { DEBUGLOG(4, "ZSTD_compress_insertDictionary (dictSize=%u)", (U32)dictSize); - if ((dict==NULL) || (dictSize<=8)) return 0; + if ((dict==NULL) || (dictSize<8)) { + RETURN_ERROR_IF(dictContentType == ZSTD_dct_fullDict, dictionary_wrong); + return 0; + } ZSTD_reset_compressedBlockState(bs); /* dict restricted modes */ if (dictContentType == ZSTD_dct_rawContent) - return ZSTD_loadDictionaryContent(ms, params, dict, dictSize, dtlm); + return ZSTD_loadDictionaryContent(ms, ws, params, dict, dictSize, dtlm); if (MEM_readLE32(dict) != ZSTD_MAGIC_DICTIONARY) { if (dictContentType == ZSTD_dct_auto) { DEBUGLOG(4, "raw content dictionary detected"); - return ZSTD_loadDictionaryContent(ms, params, dict, dictSize, dtlm); + return ZSTD_loadDictionaryContent( + ms, ws, params, dict, dictSize, dtlm); } RETURN_ERROR_IF(dictContentType == ZSTD_dct_fullDict, dictionary_wrong); assert(0); /* impossible */ } /* dict as full zstd dictionary */ - return ZSTD_loadZstdDictionary(bs, ms, params, dict, dictSize, dtlm, workspace); + return ZSTD_loadZstdDictionary( + bs, ms, ws, params, dict, dictSize, dtlm, workspace); } +#define ZSTD_USE_CDICT_PARAMS_SRCSIZE_CUTOFF (128 KB) +#define ZSTD_USE_CDICT_PARAMS_DICTSIZE_MULTIPLIER (6) + /*! ZSTD_compressBegin_internal() : * @return : 0, or an error code */ static size_t ZSTD_compressBegin_internal(ZSTD_CCtx* cctx, @@ -3342,23 +2926,34 @@ static size_t ZSTD_compressBegin_internal(ZSTD_CCtx* cctx, ZSTD_dictContentType_e dictContentType, ZSTD_dictTableLoadMethod_e dtlm, const ZSTD_CDict* cdict, - ZSTD_CCtx_params params, U64 pledgedSrcSize, + const ZSTD_CCtx_params* params, U64 pledgedSrcSize, ZSTD_buffered_policy_e zbuff) { - DEBUGLOG(4, "ZSTD_compressBegin_internal: wlog=%u", params.cParams.windowLog); + DEBUGLOG(4, "ZSTD_compressBegin_internal: wlog=%u", params->cParams.windowLog); /* params are supposed to be fully validated at this point */ - assert(!ZSTD_isError(ZSTD_checkCParams(params.cParams))); + assert(!ZSTD_isError(ZSTD_checkCParams(params->cParams))); assert(!((dict) && (cdict))); /* either dict or cdict, not both */ - - if (cdict && cdict->dictContentSize>0) { + if ( (cdict) + && (cdict->dictContentSize > 0) + && ( pledgedSrcSize < ZSTD_USE_CDICT_PARAMS_SRCSIZE_CUTOFF + || pledgedSrcSize < cdict->dictContentSize * ZSTD_USE_CDICT_PARAMS_DICTSIZE_MULTIPLIER + || pledgedSrcSize == ZSTD_CONTENTSIZE_UNKNOWN + || cdict->compressionLevel == 0) + && (params->attachDictPref != ZSTD_dictForceLoad) ) { return ZSTD_resetCCtx_usingCDict(cctx, cdict, params, pledgedSrcSize, zbuff); } - FORWARD_IF_ERROR( ZSTD_resetCCtx_internal(cctx, params, pledgedSrcSize, - ZSTDcrp_continue, zbuff) ); - { size_t const dictID = ZSTD_compress_insertDictionary( - cctx->blockState.prevCBlock, &cctx->blockState.matchState, - ¶ms, dict, dictSize, dictContentType, dtlm, cctx->entropyWorkspace); + FORWARD_IF_ERROR( ZSTD_resetCCtx_internal(cctx, *params, pledgedSrcSize, + ZSTDcrp_makeClean, zbuff) ); + { size_t const dictID = cdict ? + ZSTD_compress_insertDictionary( + cctx->blockState.prevCBlock, &cctx->blockState.matchState, + &cctx->workspace, params, cdict->dictContent, cdict->dictContentSize, + dictContentType, dtlm, cctx->entropyWorkspace) + : ZSTD_compress_insertDictionary( + cctx->blockState.prevCBlock, &cctx->blockState.matchState, + &cctx->workspace, params, dict, dictSize, + dictContentType, dtlm, cctx->entropyWorkspace); FORWARD_IF_ERROR(dictID); assert(dictID <= UINT_MAX); cctx->dictID = (U32)dictID; @@ -3371,12 +2966,12 @@ size_t ZSTD_compressBegin_advanced_internal(ZSTD_CCtx* cctx, ZSTD_dictContentType_e dictContentType, ZSTD_dictTableLoadMethod_e dtlm, const ZSTD_CDict* cdict, - ZSTD_CCtx_params params, + const ZSTD_CCtx_params* params, unsigned long long pledgedSrcSize) { - DEBUGLOG(4, "ZSTD_compressBegin_advanced_internal: wlog=%u", params.cParams.windowLog); + DEBUGLOG(4, "ZSTD_compressBegin_advanced_internal: wlog=%u", params->cParams.windowLog); /* compression parameters verification and optimization */ - FORWARD_IF_ERROR( ZSTD_checkCParams(params.cParams) ); + FORWARD_IF_ERROR( ZSTD_checkCParams(params->cParams) ); return ZSTD_compressBegin_internal(cctx, dict, dictSize, dictContentType, dtlm, cdict, @@ -3391,21 +2986,21 @@ size_t ZSTD_compressBegin_advanced(ZSTD_CCtx* cctx, ZSTD_parameters params, unsigned long long pledgedSrcSize) { ZSTD_CCtx_params const cctxParams = - ZSTD_assignParamsToCCtxParams(cctx->requestedParams, params); + ZSTD_assignParamsToCCtxParams(&cctx->requestedParams, params); return ZSTD_compressBegin_advanced_internal(cctx, dict, dictSize, ZSTD_dct_auto, ZSTD_dtlm_fast, NULL /*cdict*/, - cctxParams, pledgedSrcSize); + &cctxParams, pledgedSrcSize); } size_t ZSTD_compressBegin_usingDict(ZSTD_CCtx* cctx, const void* dict, size_t dictSize, int compressionLevel) { ZSTD_parameters const params = ZSTD_getParams(compressionLevel, ZSTD_CONTENTSIZE_UNKNOWN, dictSize); ZSTD_CCtx_params const cctxParams = - ZSTD_assignParamsToCCtxParams(cctx->requestedParams, params); + ZSTD_assignParamsToCCtxParams(&cctx->requestedParams, params); DEBUGLOG(4, "ZSTD_compressBegin_usingDict (dictSize=%u)", (unsigned)dictSize); return ZSTD_compressBegin_internal(cctx, dict, dictSize, ZSTD_dct_auto, ZSTD_dtlm_fast, NULL, - cctxParams, ZSTD_CONTENTSIZE_UNKNOWN, ZSTDb_not_buffered); + &cctxParams, ZSTD_CONTENTSIZE_UNKNOWN, ZSTDb_not_buffered); } size_t ZSTD_compressBegin(ZSTD_CCtx* cctx, int compressionLevel) @@ -3428,7 +3023,7 @@ static size_t ZSTD_writeEpilogue(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity) /* special case : empty frame */ if (cctx->stage == ZSTDcs_init) { - fhSize = ZSTD_writeFrameHeader(dst, dstCapacity, cctx->appliedParams, 0, 0); + fhSize = ZSTD_writeFrameHeader(dst, dstCapacity, &cctx->appliedParams, 0, 0); FORWARD_IF_ERROR(fhSize); dstCapacity -= fhSize; op += fhSize; @@ -3489,13 +3084,13 @@ static size_t ZSTD_compress_internal (ZSTD_CCtx* cctx, ZSTD_parameters params) { ZSTD_CCtx_params const cctxParams = - ZSTD_assignParamsToCCtxParams(cctx->requestedParams, params); + ZSTD_assignParamsToCCtxParams(&cctx->requestedParams, params); DEBUGLOG(4, "ZSTD_compress_internal"); return ZSTD_compress_advanced_internal(cctx, dst, dstCapacity, src, srcSize, dict, dictSize, - cctxParams); + &cctxParams); } size_t ZSTD_compress_advanced (ZSTD_CCtx* cctx, @@ -3519,7 +3114,7 @@ size_t ZSTD_compress_advanced_internal( void* dst, size_t dstCapacity, const void* src, size_t srcSize, const void* dict,size_t dictSize, - ZSTD_CCtx_params params) + const ZSTD_CCtx_params* params) { DEBUGLOG(4, "ZSTD_compress_advanced_internal (srcSize:%u)", (unsigned)srcSize); FORWARD_IF_ERROR( ZSTD_compressBegin_internal(cctx, @@ -3535,9 +3130,9 @@ size_t ZSTD_compress_usingDict(ZSTD_CCtx* cctx, int compressionLevel) { ZSTD_parameters const params = ZSTD_getParams(compressionLevel, srcSize + (!srcSize), dict ? dictSize : 0); - ZSTD_CCtx_params cctxParams = ZSTD_assignParamsToCCtxParams(cctx->requestedParams, params); + ZSTD_CCtx_params cctxParams = ZSTD_assignParamsToCCtxParams(&cctx->requestedParams, params); assert(params.fParams.contentSizeFlag == 1); - return ZSTD_compress_advanced_internal(cctx, dst, dstCapacity, src, srcSize, dict, dictSize, cctxParams); + return ZSTD_compress_advanced_internal(cctx, dst, dstCapacity, src, srcSize, dict, dictSize, &cctxParams); } size_t ZSTD_compressCCtx(ZSTD_CCtx* cctx, @@ -3572,8 +3167,11 @@ size_t ZSTD_estimateCDictSize_advanced( ZSTD_dictLoadMethod_e dictLoadMethod) { DEBUGLOG(5, "sizeof(ZSTD_CDict) : %u", (unsigned)sizeof(ZSTD_CDict)); - return sizeof(ZSTD_CDict) + HUF_WORKSPACE_SIZE + ZSTD_sizeof_matchState(&cParams, /* forCCtx */ 0) - + (dictLoadMethod == ZSTD_dlm_byRef ? 0 : dictSize); + return ZSTD_cwksp_alloc_size(sizeof(ZSTD_CDict)) + + ZSTD_cwksp_alloc_size(HUF_WORKSPACE_SIZE) + + ZSTD_sizeof_matchState(&cParams, /* forCCtx */ 0) + + (dictLoadMethod == ZSTD_dlm_byRef ? 0 + : ZSTD_cwksp_alloc_size(ZSTD_cwksp_align(dictSize, sizeof(void *)))); } size_t ZSTD_estimateCDictSize(size_t dictSize, int compressionLevel) @@ -3586,7 +3184,9 @@ size_t ZSTD_sizeof_CDict(const ZSTD_CDict* cdict) { if (cdict==NULL) return 0; /* support sizeof on NULL */ DEBUGLOG(5, "sizeof(*cdict) : %u", (unsigned)sizeof(*cdict)); - return cdict->workspaceSize + (cdict->dictBuffer ? cdict->dictContentSize : 0) + sizeof(*cdict); + /* cdict may be in the workspace */ + return (cdict->workspace.workspace == cdict ? 0 : sizeof(*cdict)) + + ZSTD_cwksp_sizeof(&cdict->workspace); } static size_t ZSTD_initCDict_internal( @@ -3600,28 +3200,29 @@ static size_t ZSTD_initCDict_internal( assert(!ZSTD_checkCParams(cParams)); cdict->matchState.cParams = cParams; if ((dictLoadMethod == ZSTD_dlm_byRef) || (!dictBuffer) || (!dictSize)) { - cdict->dictBuffer = NULL; cdict->dictContent = dictBuffer; } else { - void* const internalBuffer = ZSTD_malloc(dictSize, cdict->customMem); - cdict->dictBuffer = internalBuffer; - cdict->dictContent = internalBuffer; + void *internalBuffer = ZSTD_cwksp_reserve_object(&cdict->workspace, ZSTD_cwksp_align(dictSize, sizeof(void*))); RETURN_ERROR_IF(!internalBuffer, memory_allocation); + cdict->dictContent = internalBuffer; memcpy(internalBuffer, dictBuffer, dictSize); } cdict->dictContentSize = dictSize; + cdict->entropyWorkspace = (U32*)ZSTD_cwksp_reserve_object(&cdict->workspace, HUF_WORKSPACE_SIZE); + + /* Reset the state to no dictionary */ ZSTD_reset_compressedBlockState(&cdict->cBlockState); - { void* const end = ZSTD_reset_matchState(&cdict->matchState, - (U32*)cdict->workspace + HUF_WORKSPACE_SIZE_U32, - &cParams, - ZSTDcrp_continue, ZSTD_resetTarget_CDict); - assert(end == (char*)cdict->workspace + cdict->workspaceSize); - (void)end; - } + FORWARD_IF_ERROR(ZSTD_reset_matchState( + &cdict->matchState, + &cdict->workspace, + &cParams, + ZSTDcrp_makeClean, + ZSTDirp_reset, + ZSTD_resetTarget_CDict)); /* (Maybe) load the dictionary - * Skips loading the dictionary if it is <= 8 bytes. + * Skips loading the dictionary if it is < 8 bytes. */ { ZSTD_CCtx_params params; memset(¶ms, 0, sizeof(params)); @@ -3629,9 +3230,9 @@ static size_t ZSTD_initCDict_internal( params.fParams.contentSizeFlag = 1; params.cParams = cParams; { size_t const dictID = ZSTD_compress_insertDictionary( - &cdict->cBlockState, &cdict->matchState, ¶ms, - cdict->dictContent, cdict->dictContentSize, - dictContentType, ZSTD_dtlm_full, cdict->workspace); + &cdict->cBlockState, &cdict->matchState, &cdict->workspace, + ¶ms, cdict->dictContent, cdict->dictContentSize, + dictContentType, ZSTD_dtlm_full, cdict->entropyWorkspace); FORWARD_IF_ERROR(dictID); assert(dictID <= (size_t)(U32)-1); cdict->dictID = (U32)dictID; @@ -3649,18 +3250,29 @@ ZSTD_CDict* ZSTD_createCDict_advanced(const void* dictBuffer, size_t dictSize, DEBUGLOG(3, "ZSTD_createCDict_advanced, mode %u", (unsigned)dictContentType); if (!customMem.customAlloc ^ !customMem.customFree) return NULL; - { ZSTD_CDict* const cdict = (ZSTD_CDict*)ZSTD_malloc(sizeof(ZSTD_CDict), customMem); - size_t const workspaceSize = HUF_WORKSPACE_SIZE + ZSTD_sizeof_matchState(&cParams, /* forCCtx */ 0); + { size_t const workspaceSize = + ZSTD_cwksp_alloc_size(sizeof(ZSTD_CDict)) + + ZSTD_cwksp_alloc_size(HUF_WORKSPACE_SIZE) + + ZSTD_sizeof_matchState(&cParams, /* forCCtx */ 0) + + (dictLoadMethod == ZSTD_dlm_byRef ? 0 + : ZSTD_cwksp_alloc_size(ZSTD_cwksp_align(dictSize, sizeof(void*)))); void* const workspace = ZSTD_malloc(workspaceSize, customMem); + ZSTD_cwksp ws; + ZSTD_CDict* cdict; - if (!cdict || !workspace) { - ZSTD_free(cdict, customMem); + if (!workspace) { ZSTD_free(workspace, customMem); return NULL; } + + ZSTD_cwksp_init(&ws, workspace, workspaceSize); + + cdict = (ZSTD_CDict*)ZSTD_cwksp_reserve_object(&ws, sizeof(ZSTD_CDict)); + assert(cdict != NULL); + ZSTD_cwksp_move(&cdict->workspace, &ws); cdict->customMem = customMem; - cdict->workspace = workspace; - cdict->workspaceSize = workspaceSize; + cdict->compressionLevel = 0; /* signals advanced API usage */ + if (ZSTD_isError( ZSTD_initCDict_internal(cdict, dictBuffer, dictSize, dictLoadMethod, dictContentType, @@ -3676,9 +3288,12 @@ ZSTD_CDict* ZSTD_createCDict_advanced(const void* dictBuffer, size_t dictSize, ZSTD_CDict* ZSTD_createCDict(const void* dict, size_t dictSize, int compressionLevel) { ZSTD_compressionParameters cParams = ZSTD_getCParams(compressionLevel, 0, dictSize); - return ZSTD_createCDict_advanced(dict, dictSize, - ZSTD_dlm_byCopy, ZSTD_dct_auto, - cParams, ZSTD_defaultCMem); + ZSTD_CDict* cdict = ZSTD_createCDict_advanced(dict, dictSize, + ZSTD_dlm_byCopy, ZSTD_dct_auto, + cParams, ZSTD_defaultCMem); + if (cdict) + cdict->compressionLevel = compressionLevel == 0 ? ZSTD_CLEVEL_DEFAULT : compressionLevel; + return cdict; } ZSTD_CDict* ZSTD_createCDict_byReference(const void* dict, size_t dictSize, int compressionLevel) @@ -3693,9 +3308,11 @@ size_t ZSTD_freeCDict(ZSTD_CDict* cdict) { if (cdict==NULL) return 0; /* support free on NULL */ { ZSTD_customMem const cMem = cdict->customMem; - ZSTD_free(cdict->workspace, cMem); - ZSTD_free(cdict->dictBuffer, cMem); - ZSTD_free(cdict, cMem); + int cdictInWorkspace = ZSTD_cwksp_owns_buffer(&cdict->workspace, cdict); + ZSTD_cwksp_free(&cdict->workspace, cMem); + if (!cdictInWorkspace) { + ZSTD_free(cdict, cMem); + } return 0; } } @@ -3721,28 +3338,30 @@ const ZSTD_CDict* ZSTD_initStaticCDict( ZSTD_compressionParameters cParams) { size_t const matchStateSize = ZSTD_sizeof_matchState(&cParams, /* forCCtx */ 0); - size_t const neededSize = sizeof(ZSTD_CDict) + (dictLoadMethod == ZSTD_dlm_byRef ? 0 : dictSize) - + HUF_WORKSPACE_SIZE + matchStateSize; - ZSTD_CDict* const cdict = (ZSTD_CDict*) workspace; - void* ptr; + size_t const neededSize = ZSTD_cwksp_alloc_size(sizeof(ZSTD_CDict)) + + (dictLoadMethod == ZSTD_dlm_byRef ? 0 + : ZSTD_cwksp_alloc_size(ZSTD_cwksp_align(dictSize, sizeof(void*)))) + + ZSTD_cwksp_alloc_size(HUF_WORKSPACE_SIZE) + + matchStateSize; + ZSTD_CDict* cdict; + if ((size_t)workspace & 7) return NULL; /* 8-aligned */ + + { + ZSTD_cwksp ws; + ZSTD_cwksp_init(&ws, workspace, workspaceSize); + cdict = (ZSTD_CDict*)ZSTD_cwksp_reserve_object(&ws, sizeof(ZSTD_CDict)); + if (cdict == NULL) return NULL; + ZSTD_cwksp_move(&cdict->workspace, &ws); + } + DEBUGLOG(4, "(workspaceSize < neededSize) : (%u < %u) => %u", (unsigned)workspaceSize, (unsigned)neededSize, (unsigned)(workspaceSize < neededSize)); if (workspaceSize < neededSize) return NULL; - if (dictLoadMethod == ZSTD_dlm_byCopy) { - memcpy(cdict+1, dict, dictSize); - dict = cdict+1; - ptr = (char*)workspace + sizeof(ZSTD_CDict) + dictSize; - } else { - ptr = cdict+1; - } - cdict->workspace = ptr; - cdict->workspaceSize = HUF_WORKSPACE_SIZE + matchStateSize; - if (ZSTD_isError( ZSTD_initCDict_internal(cdict, dict, dictSize, - ZSTD_dlm_byRef, dictContentType, + dictLoadMethod, dictContentType, cParams) )) return NULL; @@ -3764,7 +3383,15 @@ size_t ZSTD_compressBegin_usingCDict_advanced( DEBUGLOG(4, "ZSTD_compressBegin_usingCDict_advanced"); RETURN_ERROR_IF(cdict==NULL, dictionary_wrong); { ZSTD_CCtx_params params = cctx->requestedParams; - params.cParams = ZSTD_getCParamsFromCDict(cdict); + params.cParams = ( pledgedSrcSize < ZSTD_USE_CDICT_PARAMS_SRCSIZE_CUTOFF + || pledgedSrcSize < cdict->dictContentSize * ZSTD_USE_CDICT_PARAMS_DICTSIZE_MULTIPLIER + || pledgedSrcSize == ZSTD_CONTENTSIZE_UNKNOWN + || cdict->compressionLevel == 0 ) + && (params.attachDictPref != ZSTD_dictForceLoad) ? + ZSTD_getCParamsFromCDict(cdict) + : ZSTD_getCParams(cdict->compressionLevel, + pledgedSrcSize, + cdict->dictContentSize); /* Increase window log to fit the entire dictionary and source if the * source size is known. Limit the increase to 19, which is the * window log for compression level 1 with the largest source size. @@ -3778,7 +3405,7 @@ size_t ZSTD_compressBegin_usingCDict_advanced( return ZSTD_compressBegin_internal(cctx, NULL, 0, ZSTD_dct_auto, ZSTD_dtlm_fast, cdict, - params, pledgedSrcSize, + ¶ms, pledgedSrcSize, ZSTDb_not_buffered); } } @@ -3869,7 +3496,7 @@ static size_t ZSTD_resetCStream_internal(ZSTD_CStream* cctx, FORWARD_IF_ERROR( ZSTD_compressBegin_internal(cctx, dict, dictSize, dictContentType, ZSTD_dtlm_fast, cdict, - params, pledgedSrcSize, + ¶ms, pledgedSrcSize, ZSTDb_buffered) ); cctx->inToCompress = 0; @@ -3903,13 +3530,14 @@ size_t ZSTD_resetCStream(ZSTD_CStream* zcs, unsigned long long pss) * Assumption 2 : either dict, or cdict, is defined, not both */ size_t ZSTD_initCStream_internal(ZSTD_CStream* zcs, const void* dict, size_t dictSize, const ZSTD_CDict* cdict, - ZSTD_CCtx_params params, unsigned long long pledgedSrcSize) + const ZSTD_CCtx_params* params, + unsigned long long pledgedSrcSize) { DEBUGLOG(4, "ZSTD_initCStream_internal"); FORWARD_IF_ERROR( ZSTD_CCtx_reset(zcs, ZSTD_reset_session_only) ); FORWARD_IF_ERROR( ZSTD_CCtx_setPledgedSrcSize(zcs, pledgedSrcSize) ); - assert(!ZSTD_isError(ZSTD_checkCParams(params.cParams))); - zcs->requestedParams = params; + assert(!ZSTD_isError(ZSTD_checkCParams(params->cParams))); + zcs->requestedParams = *params; assert(!((dict) && (cdict))); /* either dict or cdict, not both */ if (dict) { FORWARD_IF_ERROR( ZSTD_CCtx_loadDictionary(zcs, dict, dictSize) ); @@ -3948,7 +3576,7 @@ size_t ZSTD_initCStream_usingCDict(ZSTD_CStream* zcs, const ZSTD_CDict* cdict) /* ZSTD_initCStream_advanced() : * pledgedSrcSize must be exact. * if srcSize is not known at init time, use value ZSTD_CONTENTSIZE_UNKNOWN. - * dict is loaded with default parameters ZSTD_dm_auto and ZSTD_dlm_byCopy. */ + * dict is loaded with default parameters ZSTD_dct_auto and ZSTD_dlm_byCopy. */ size_t ZSTD_initCStream_advanced(ZSTD_CStream* zcs, const void* dict, size_t dictSize, ZSTD_parameters params, unsigned long long pss) @@ -3962,7 +3590,7 @@ size_t ZSTD_initCStream_advanced(ZSTD_CStream* zcs, FORWARD_IF_ERROR( ZSTD_CCtx_reset(zcs, ZSTD_reset_session_only) ); FORWARD_IF_ERROR( ZSTD_CCtx_setPledgedSrcSize(zcs, pledgedSrcSize) ); FORWARD_IF_ERROR( ZSTD_checkCParams(params.cParams) ); - zcs->requestedParams = ZSTD_assignParamsToCCtxParams(zcs->requestedParams, params); + zcs->requestedParams = ZSTD_assignParamsToCCtxParams(&zcs->requestedParams, params); FORWARD_IF_ERROR( ZSTD_CCtx_loadDictionary(zcs, dict, dictSize) ); return 0; } @@ -4212,7 +3840,7 @@ size_t ZSTD_compressStream2( ZSTD_CCtx* cctx, if (cctx->mtctx == NULL) { DEBUGLOG(4, "ZSTD_compressStream2: creating new mtctx for nbWorkers=%u", params.nbWorkers); - cctx->mtctx = ZSTDMT_createCCtx_advanced(params.nbWorkers, cctx->customMem); + cctx->mtctx = ZSTDMT_createCCtx_advanced((U32)params.nbWorkers, cctx->customMem); RETURN_ERROR_IF(cctx->mtctx == NULL, memory_allocation); } /* mt compression */ @@ -4340,8 +3968,8 @@ static const ZSTD_compressionParameters ZSTD_defaultCParameters[4][ZSTD_MAX_CLEV { 19, 12, 13, 1, 6, 1, ZSTD_fast }, /* base for negative levels */ { 19, 13, 14, 1, 7, 0, ZSTD_fast }, /* level 1 */ { 20, 15, 16, 1, 6, 0, ZSTD_fast }, /* level 2 */ - { 21, 16, 17, 1, 5, 1, ZSTD_dfast }, /* level 3 */ - { 21, 18, 18, 1, 5, 1, ZSTD_dfast }, /* level 4 */ + { 21, 16, 17, 1, 5, 0, ZSTD_dfast }, /* level 3 */ + { 21, 18, 18, 1, 5, 0, ZSTD_dfast }, /* level 4 */ { 21, 18, 19, 2, 5, 2, ZSTD_greedy }, /* level 5 */ { 21, 19, 19, 3, 5, 4, ZSTD_greedy }, /* level 6 */ { 21, 19, 19, 3, 5, 8, ZSTD_lazy }, /* level 7 */ @@ -4365,8 +3993,8 @@ static const ZSTD_compressionParameters ZSTD_defaultCParameters[4][ZSTD_MAX_CLEV /* W, C, H, S, L, T, strat */ { 18, 12, 13, 1, 5, 1, ZSTD_fast }, /* base for negative levels */ { 18, 13, 14, 1, 6, 0, ZSTD_fast }, /* level 1 */ - { 18, 14, 14, 1, 5, 1, ZSTD_dfast }, /* level 2 */ - { 18, 16, 16, 1, 4, 1, ZSTD_dfast }, /* level 3 */ + { 18, 14, 14, 1, 5, 0, ZSTD_dfast }, /* level 2 */ + { 18, 16, 16, 1, 4, 0, ZSTD_dfast }, /* level 3 */ { 18, 16, 17, 2, 5, 2, ZSTD_greedy }, /* level 4.*/ { 18, 18, 18, 3, 5, 2, ZSTD_greedy }, /* level 5.*/ { 18, 18, 19, 3, 5, 4, ZSTD_lazy }, /* level 6.*/ @@ -4392,8 +4020,8 @@ static const ZSTD_compressionParameters ZSTD_defaultCParameters[4][ZSTD_MAX_CLEV { 17, 12, 12, 1, 5, 1, ZSTD_fast }, /* base for negative levels */ { 17, 12, 13, 1, 6, 0, ZSTD_fast }, /* level 1 */ { 17, 13, 15, 1, 5, 0, ZSTD_fast }, /* level 2 */ - { 17, 15, 16, 2, 5, 1, ZSTD_dfast }, /* level 3 */ - { 17, 17, 17, 2, 4, 1, ZSTD_dfast }, /* level 4 */ + { 17, 15, 16, 2, 5, 0, ZSTD_dfast }, /* level 3 */ + { 17, 17, 17, 2, 4, 0, ZSTD_dfast }, /* level 4 */ { 17, 16, 17, 3, 4, 2, ZSTD_greedy }, /* level 5 */ { 17, 17, 17, 3, 4, 4, ZSTD_lazy }, /* level 6 */ { 17, 17, 17, 3, 4, 8, ZSTD_lazy2 }, /* level 7 */ @@ -4418,7 +4046,7 @@ static const ZSTD_compressionParameters ZSTD_defaultCParameters[4][ZSTD_MAX_CLEV { 14, 12, 13, 1, 5, 1, ZSTD_fast }, /* base for negative levels */ { 14, 14, 15, 1, 5, 0, ZSTD_fast }, /* level 1 */ { 14, 14, 15, 1, 4, 0, ZSTD_fast }, /* level 2 */ - { 14, 14, 15, 2, 4, 1, ZSTD_dfast }, /* level 3 */ + { 14, 14, 15, 2, 4, 0, ZSTD_dfast }, /* level 3 */ { 14, 14, 14, 4, 4, 2, ZSTD_greedy }, /* level 4 */ { 14, 14, 14, 3, 4, 4, ZSTD_lazy }, /* level 5.*/ { 14, 14, 14, 4, 4, 8, ZSTD_lazy2 }, /* level 6 */ diff --git a/thirdparty/zstd/compress/zstd_compress_internal.h b/thirdparty/zstd/compress/zstd_compress_internal.h index 5495899be3..14036f873f 100644 --- a/thirdparty/zstd/compress/zstd_compress_internal.h +++ b/thirdparty/zstd/compress/zstd_compress_internal.h @@ -19,6 +19,7 @@ * Dependencies ***************************************/ #include "zstd_internal.h" +#include "zstd_cwksp.h" #ifdef ZSTD_MULTITHREAD # include "zstdmt_compress.h" #endif @@ -134,9 +135,15 @@ typedef struct { typedef struct ZSTD_matchState_t ZSTD_matchState_t; struct ZSTD_matchState_t { ZSTD_window_t window; /* State for window round buffer management */ - U32 loadedDictEnd; /* index of end of dictionary, within context's referential. When dict referential is copied into active context (i.e. not attached), effectively same value as dictSize, since referential starts from zero */ + U32 loadedDictEnd; /* index of end of dictionary, within context's referential. + * When loadedDictEnd != 0, a dictionary is in use, and still valid. + * This relies on a mechanism to set loadedDictEnd=0 when dictionary is no longer within distance. + * Such mechanism is provided within ZSTD_window_enforceMaxDist() and ZSTD_checkDictValidity(). + * When dict referential is copied into active context (i.e. not attached), + * loadedDictEnd == dictSize, since referential starts from zero. + */ U32 nextToUpdate; /* index from which to continue table update */ - U32 hashLog3; /* dispatch table : larger == faster, more memory */ + U32 hashLog3; /* dispatch table for matches of len==3 : larger == faster, more memory */ U32* hashTable; U32* hashTable3; U32* chainTable; @@ -186,6 +193,13 @@ typedef struct { size_t capacity; /* The capacity starting from `seq` pointer */ } rawSeqStore_t; +typedef struct { + int collectSequences; + ZSTD_Sequence* seqStart; + size_t seqIndex; + size_t maxSequences; +} SeqCollector; + struct ZSTD_CCtx_params_s { ZSTD_format_e format; ZSTD_compressionParameters cParams; @@ -197,6 +211,9 @@ struct ZSTD_CCtx_params_s { size_t targetCBlockSize; /* Tries to fit compressed block size to be around targetCBlockSize. * No target when targetCBlockSize == 0. * There is no guarantee on compressed block size */ + int srcSizeHint; /* User's best guess of source size. + * Hint is not valid when srcSizeHint == 0. + * There is no guarantee that hint is close to actual source size */ ZSTD_dictAttachPref_e attachDictPref; ZSTD_literalCompressionMode_e literalCompressionMode; @@ -222,9 +239,7 @@ struct ZSTD_CCtx_s { ZSTD_CCtx_params appliedParams; U32 dictID; - int workSpaceOversizedDuration; - void* workSpace; - size_t workSpaceSize; + ZSTD_cwksp workspace; /* manages buffer for dynamic allocations */ size_t blockSize; unsigned long long pledgedSrcSizePlusOne; /* this way, 0 (default) == unknown */ unsigned long long consumedSrcSize; @@ -232,6 +247,8 @@ struct ZSTD_CCtx_s { XXH64_state_t xxhState; ZSTD_customMem customMem; size_t staticSize; + SeqCollector seqCollector; + int isFirstBlock; seqStore_t seqStore; /* sequences storage ptrs */ ldmState_t ldmState; /* long distance matching state */ @@ -307,26 +324,81 @@ MEM_STATIC U32 ZSTD_MLcode(U32 mlBase) return (mlBase > 127) ? ZSTD_highbit32(mlBase) + ML_deltaCode : ML_Code[mlBase]; } +/* ZSTD_cParam_withinBounds: + * @return 1 if value is within cParam bounds, + * 0 otherwise */ +MEM_STATIC int ZSTD_cParam_withinBounds(ZSTD_cParameter cParam, int value) +{ + ZSTD_bounds const bounds = ZSTD_cParam_getBounds(cParam); + if (ZSTD_isError(bounds.error)) return 0; + if (value < bounds.lowerBound) return 0; + if (value > bounds.upperBound) return 0; + return 1; +} + +/* ZSTD_minGain() : + * minimum compression required + * to generate a compress block or a compressed literals section. + * note : use same formula for both situations */ +MEM_STATIC size_t ZSTD_minGain(size_t srcSize, ZSTD_strategy strat) +{ + U32 const minlog = (strat>=ZSTD_btultra) ? (U32)(strat) - 1 : 6; + ZSTD_STATIC_ASSERT(ZSTD_btultra == 8); + assert(ZSTD_cParam_withinBounds(ZSTD_c_strategy, strat)); + return (srcSize >> minlog) + 2; +} + +/*! ZSTD_safecopyLiterals() : + * memcpy() function that won't read beyond more than WILDCOPY_OVERLENGTH bytes past ilimit_w. + * Only called when the sequence ends past ilimit_w, so it only needs to be optimized for single + * large copies. + */ +static void ZSTD_safecopyLiterals(BYTE* op, BYTE const* ip, BYTE const* const iend, BYTE const* ilimit_w) { + assert(iend > ilimit_w); + if (ip <= ilimit_w) { + ZSTD_wildcopy(op, ip, ilimit_w - ip, ZSTD_no_overlap); + op += ilimit_w - ip; + ip = ilimit_w; + } + while (ip < iend) *op++ = *ip++; +} + /*! ZSTD_storeSeq() : - * Store a sequence (literal length, literals, offset code and match length code) into seqStore_t. - * `offsetCode` : distance to match + 3 (values 1-3 are repCodes). + * Store a sequence (litlen, litPtr, offCode and mlBase) into seqStore_t. + * `offCode` : distance to match + ZSTD_REP_MOVE (values <= ZSTD_REP_MOVE are repCodes). * `mlBase` : matchLength - MINMATCH + * Allowed to overread literals up to litLimit. */ -MEM_STATIC void ZSTD_storeSeq(seqStore_t* seqStorePtr, size_t litLength, const void* literals, U32 offsetCode, size_t mlBase) +HINT_INLINE UNUSED_ATTR +void ZSTD_storeSeq(seqStore_t* seqStorePtr, size_t litLength, const BYTE* literals, const BYTE* litLimit, U32 offCode, size_t mlBase) { + BYTE const* const litLimit_w = litLimit - WILDCOPY_OVERLENGTH; + BYTE const* const litEnd = literals + litLength; #if defined(DEBUGLEVEL) && (DEBUGLEVEL >= 6) static const BYTE* g_start = NULL; if (g_start==NULL) g_start = (const BYTE*)literals; /* note : index only works for compression within a single segment */ { U32 const pos = (U32)((const BYTE*)literals - g_start); DEBUGLOG(6, "Cpos%7u :%3u literals, match%4u bytes at offCode%7u", - pos, (U32)litLength, (U32)mlBase+MINMATCH, (U32)offsetCode); + pos, (U32)litLength, (U32)mlBase+MINMATCH, (U32)offCode); } #endif assert((size_t)(seqStorePtr->sequences - seqStorePtr->sequencesStart) < seqStorePtr->maxNbSeq); /* copy Literals */ assert(seqStorePtr->maxNbLit <= 128 KB); assert(seqStorePtr->lit + litLength <= seqStorePtr->litStart + seqStorePtr->maxNbLit); - ZSTD_wildcopy(seqStorePtr->lit, literals, litLength, ZSTD_no_overlap); + assert(literals + litLength <= litLimit); + if (litEnd <= litLimit_w) { + /* Common case we can use wildcopy. + * First copy 16 bytes, because literals are likely short. + */ + assert(WILDCOPY_OVERLENGTH >= 16); + ZSTD_copy16(seqStorePtr->lit, literals); + if (litLength > 16) { + ZSTD_wildcopy(seqStorePtr->lit+16, literals+16, (ptrdiff_t)litLength-16, ZSTD_no_overlap); + } + } else { + ZSTD_safecopyLiterals(seqStorePtr->lit, literals, litEnd, litLimit_w); + } seqStorePtr->lit += litLength; /* literal Length */ @@ -338,7 +410,7 @@ MEM_STATIC void ZSTD_storeSeq(seqStore_t* seqStorePtr, size_t litLength, const v seqStorePtr->sequences[0].litLength = (U16)litLength; /* match offset */ - seqStorePtr->sequences[0].offset = offsetCode + 1; + seqStorePtr->sequences[0].offset = offCode + 1; /* match Length */ if (mlBase>0xFFFF) { @@ -739,24 +811,37 @@ ZSTD_window_enforceMaxDist(ZSTD_window_t* window, /* Similar to ZSTD_window_enforceMaxDist(), * but only invalidates dictionary - * when input progresses beyond window size. */ + * when input progresses beyond window size. + * assumption : loadedDictEndPtr and dictMatchStatePtr are valid (non NULL) + * loadedDictEnd uses same referential as window->base + * maxDist is the window size */ MEM_STATIC void -ZSTD_checkDictValidity(ZSTD_window_t* window, +ZSTD_checkDictValidity(const ZSTD_window_t* window, const void* blockEnd, U32 maxDist, U32* loadedDictEndPtr, const ZSTD_matchState_t** dictMatchStatePtr) { - U32 const blockEndIdx = (U32)((BYTE const*)blockEnd - window->base); - U32 const loadedDictEnd = (loadedDictEndPtr != NULL) ? *loadedDictEndPtr : 0; - DEBUGLOG(5, "ZSTD_checkDictValidity: blockEndIdx=%u, maxDist=%u, loadedDictEnd=%u", - (unsigned)blockEndIdx, (unsigned)maxDist, (unsigned)loadedDictEnd); - - if (loadedDictEnd && (blockEndIdx > maxDist + loadedDictEnd)) { - /* On reaching window size, dictionaries are invalidated */ - if (loadedDictEndPtr) *loadedDictEndPtr = 0; - if (dictMatchStatePtr) *dictMatchStatePtr = NULL; - } + assert(loadedDictEndPtr != NULL); + assert(dictMatchStatePtr != NULL); + { U32 const blockEndIdx = (U32)((BYTE const*)blockEnd - window->base); + U32 const loadedDictEnd = *loadedDictEndPtr; + DEBUGLOG(5, "ZSTD_checkDictValidity: blockEndIdx=%u, maxDist=%u, loadedDictEnd=%u", + (unsigned)blockEndIdx, (unsigned)maxDist, (unsigned)loadedDictEnd); + assert(blockEndIdx >= loadedDictEnd); + + if (blockEndIdx > loadedDictEnd + maxDist) { + /* On reaching window size, dictionaries are invalidated. + * For simplification, if window size is reached anywhere within next block, + * the dictionary is invalidated for the full block. + */ + DEBUGLOG(6, "invalidating dictionary for current block (distance > windowSize)"); + *loadedDictEndPtr = 0; + *dictMatchStatePtr = NULL; + } else { + if (*loadedDictEndPtr != 0) { + DEBUGLOG(6, "dictionary considered valid for current block"); + } } } } /** @@ -798,6 +883,17 @@ MEM_STATIC U32 ZSTD_window_update(ZSTD_window_t* window, return contiguous; } +MEM_STATIC U32 ZSTD_getLowestMatchIndex(const ZSTD_matchState_t* ms, U32 current, unsigned windowLog) +{ + U32 const maxDistance = 1U << windowLog; + U32 const lowestValid = ms->window.lowLimit; + U32 const withinWindow = (current - lowestValid > maxDistance) ? current - maxDistance : lowestValid; + U32 const isDictionary = (ms->loadedDictEnd != 0); + U32 const matchLowest = isDictionary ? lowestValid : withinWindow; + return matchLowest; +} + + /* debug functions */ #if (DEBUGLEVEL>=2) @@ -856,7 +952,7 @@ ZSTD_compressionParameters ZSTD_getCParamsFromCCtxParams( size_t ZSTD_initCStream_internal(ZSTD_CStream* zcs, const void* dict, size_t dictSize, const ZSTD_CDict* cdict, - ZSTD_CCtx_params params, unsigned long long pledgedSrcSize); + const ZSTD_CCtx_params* params, unsigned long long pledgedSrcSize); void ZSTD_resetSeqStore(seqStore_t* ssPtr); @@ -871,7 +967,7 @@ size_t ZSTD_compressBegin_advanced_internal(ZSTD_CCtx* cctx, ZSTD_dictContentType_e dictContentType, ZSTD_dictTableLoadMethod_e dtlm, const ZSTD_CDict* cdict, - ZSTD_CCtx_params params, + const ZSTD_CCtx_params* params, unsigned long long pledgedSrcSize); /* ZSTD_compress_advanced_internal() : @@ -880,7 +976,7 @@ size_t ZSTD_compress_advanced_internal(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, const void* dict,size_t dictSize, - ZSTD_CCtx_params params); + const ZSTD_CCtx_params* params); /* ZSTD_writeLastEmptyBlock() : diff --git a/thirdparty/zstd/compress/zstd_compress_literals.c b/thirdparty/zstd/compress/zstd_compress_literals.c new file mode 100644 index 0000000000..6c13331182 --- /dev/null +++ b/thirdparty/zstd/compress/zstd_compress_literals.c @@ -0,0 +1,154 @@ +/* + * Copyright (c) 2016-present, Yann Collet, Facebook, Inc. + * All rights reserved. + * + * This source code is licensed under both the BSD-style license (found in the + * LICENSE file in the root directory of this source tree) and the GPLv2 (found + * in the COPYING file in the root directory of this source tree). + * You may select, at your option, one of the above-listed licenses. + */ + + /*-************************************* + * Dependencies + ***************************************/ +#include "zstd_compress_literals.h" + +size_t ZSTD_noCompressLiterals (void* dst, size_t dstCapacity, const void* src, size_t srcSize) +{ + BYTE* const ostart = (BYTE* const)dst; + U32 const flSize = 1 + (srcSize>31) + (srcSize>4095); + + RETURN_ERROR_IF(srcSize + flSize > dstCapacity, dstSize_tooSmall); + + switch(flSize) + { + case 1: /* 2 - 1 - 5 */ + ostart[0] = (BYTE)((U32)set_basic + (srcSize<<3)); + break; + case 2: /* 2 - 2 - 12 */ + MEM_writeLE16(ostart, (U16)((U32)set_basic + (1<<2) + (srcSize<<4))); + break; + case 3: /* 2 - 2 - 20 */ + MEM_writeLE32(ostart, (U32)((U32)set_basic + (3<<2) + (srcSize<<4))); + break; + default: /* not necessary : flSize is {1,2,3} */ + assert(0); + } + + memcpy(ostart + flSize, src, srcSize); + return srcSize + flSize; +} + +size_t ZSTD_compressRleLiteralsBlock (void* dst, size_t dstCapacity, const void* src, size_t srcSize) +{ + BYTE* const ostart = (BYTE* const)dst; + U32 const flSize = 1 + (srcSize>31) + (srcSize>4095); + + (void)dstCapacity; /* dstCapacity already guaranteed to be >=4, hence large enough */ + + switch(flSize) + { + case 1: /* 2 - 1 - 5 */ + ostart[0] = (BYTE)((U32)set_rle + (srcSize<<3)); + break; + case 2: /* 2 - 2 - 12 */ + MEM_writeLE16(ostart, (U16)((U32)set_rle + (1<<2) + (srcSize<<4))); + break; + case 3: /* 2 - 2 - 20 */ + MEM_writeLE32(ostart, (U32)((U32)set_rle + (3<<2) + (srcSize<<4))); + break; + default: /* not necessary : flSize is {1,2,3} */ + assert(0); + } + + ostart[flSize] = *(const BYTE*)src; + return flSize+1; +} + +size_t ZSTD_compressLiterals (ZSTD_hufCTables_t const* prevHuf, + ZSTD_hufCTables_t* nextHuf, + ZSTD_strategy strategy, int disableLiteralCompression, + void* dst, size_t dstCapacity, + const void* src, size_t srcSize, + void* entropyWorkspace, size_t entropyWorkspaceSize, + const int bmi2) +{ + size_t const minGain = ZSTD_minGain(srcSize, strategy); + size_t const lhSize = 3 + (srcSize >= 1 KB) + (srcSize >= 16 KB); + BYTE* const ostart = (BYTE*)dst; + U32 singleStream = srcSize < 256; + symbolEncodingType_e hType = set_compressed; + size_t cLitSize; + + DEBUGLOG(5,"ZSTD_compressLiterals (disableLiteralCompression=%i)", + disableLiteralCompression); + + /* Prepare nextEntropy assuming reusing the existing table */ + memcpy(nextHuf, prevHuf, sizeof(*prevHuf)); + + if (disableLiteralCompression) + return ZSTD_noCompressLiterals(dst, dstCapacity, src, srcSize); + + /* small ? don't even attempt compression (speed opt) */ +# define COMPRESS_LITERALS_SIZE_MIN 63 + { size_t const minLitSize = (prevHuf->repeatMode == HUF_repeat_valid) ? 6 : COMPRESS_LITERALS_SIZE_MIN; + if (srcSize <= minLitSize) return ZSTD_noCompressLiterals(dst, dstCapacity, src, srcSize); + } + + RETURN_ERROR_IF(dstCapacity < lhSize+1, dstSize_tooSmall, "not enough space for compression"); + { HUF_repeat repeat = prevHuf->repeatMode; + int const preferRepeat = strategy < ZSTD_lazy ? srcSize <= 1024 : 0; + if (repeat == HUF_repeat_valid && lhSize == 3) singleStream = 1; + cLitSize = singleStream ? + HUF_compress1X_repeat( + ostart+lhSize, dstCapacity-lhSize, src, srcSize, + 255, 11, entropyWorkspace, entropyWorkspaceSize, + (HUF_CElt*)nextHuf->CTable, &repeat, preferRepeat, bmi2) : + HUF_compress4X_repeat( + ostart+lhSize, dstCapacity-lhSize, src, srcSize, + 255, 11, entropyWorkspace, entropyWorkspaceSize, + (HUF_CElt*)nextHuf->CTable, &repeat, preferRepeat, bmi2); + if (repeat != HUF_repeat_none) { + /* reused the existing table */ + hType = set_repeat; + } + } + + if ((cLitSize==0) | (cLitSize >= srcSize - minGain) | ERR_isError(cLitSize)) { + memcpy(nextHuf, prevHuf, sizeof(*prevHuf)); + return ZSTD_noCompressLiterals(dst, dstCapacity, src, srcSize); + } + if (cLitSize==1) { + memcpy(nextHuf, prevHuf, sizeof(*prevHuf)); + return ZSTD_compressRleLiteralsBlock(dst, dstCapacity, src, srcSize); + } + + if (hType == set_compressed) { + /* using a newly constructed table */ + nextHuf->repeatMode = HUF_repeat_check; + } + + /* Build header */ + switch(lhSize) + { + case 3: /* 2 - 2 - 10 - 10 */ + { U32 const lhc = hType + ((!singleStream) << 2) + ((U32)srcSize<<4) + ((U32)cLitSize<<14); + MEM_writeLE24(ostart, lhc); + break; + } + case 4: /* 2 - 2 - 14 - 14 */ + { U32 const lhc = hType + (2 << 2) + ((U32)srcSize<<4) + ((U32)cLitSize<<18); + MEM_writeLE32(ostart, lhc); + break; + } + case 5: /* 2 - 2 - 18 - 18 */ + { U32 const lhc = hType + (3 << 2) + ((U32)srcSize<<4) + ((U32)cLitSize<<22); + MEM_writeLE32(ostart, lhc); + ostart[4] = (BYTE)(cLitSize >> 10); + break; + } + default: /* not possible : lhSize is {3,4,5} */ + assert(0); + } + return lhSize+cLitSize; +} diff --git a/thirdparty/zstd/compress/zstd_compress_literals.h b/thirdparty/zstd/compress/zstd_compress_literals.h new file mode 100644 index 0000000000..97273d7cfd --- /dev/null +++ b/thirdparty/zstd/compress/zstd_compress_literals.h @@ -0,0 +1,29 @@ +/* + * Copyright (c) 2016-present, Yann Collet, Facebook, Inc. + * All rights reserved. + * + * This source code is licensed under both the BSD-style license (found in the + * LICENSE file in the root directory of this source tree) and the GPLv2 (found + * in the COPYING file in the root directory of this source tree). + * You may select, at your option, one of the above-listed licenses. + */ + +#ifndef ZSTD_COMPRESS_LITERALS_H +#define ZSTD_COMPRESS_LITERALS_H + +#include "zstd_compress_internal.h" /* ZSTD_hufCTables_t, ZSTD_minGain() */ + + +size_t ZSTD_noCompressLiterals (void* dst, size_t dstCapacity, const void* src, size_t srcSize); + +size_t ZSTD_compressRleLiteralsBlock (void* dst, size_t dstCapacity, const void* src, size_t srcSize); + +size_t ZSTD_compressLiterals (ZSTD_hufCTables_t const* prevHuf, + ZSTD_hufCTables_t* nextHuf, + ZSTD_strategy strategy, int disableLiteralCompression, + void* dst, size_t dstCapacity, + const void* src, size_t srcSize, + void* entropyWorkspace, size_t entropyWorkspaceSize, + const int bmi2); + +#endif /* ZSTD_COMPRESS_LITERALS_H */ diff --git a/thirdparty/zstd/compress/zstd_compress_sequences.c b/thirdparty/zstd/compress/zstd_compress_sequences.c new file mode 100644 index 0000000000..0ff7a26823 --- /dev/null +++ b/thirdparty/zstd/compress/zstd_compress_sequences.c @@ -0,0 +1,415 @@ +/* + * Copyright (c) 2016-present, Yann Collet, Facebook, Inc. + * All rights reserved. + * + * This source code is licensed under both the BSD-style license (found in the + * LICENSE file in the root directory of this source tree) and the GPLv2 (found + * in the COPYING file in the root directory of this source tree). + * You may select, at your option, one of the above-listed licenses. + */ + + /*-************************************* + * Dependencies + ***************************************/ +#include "zstd_compress_sequences.h" + +/** + * -log2(x / 256) lookup table for x in [0, 256). + * If x == 0: Return 0 + * Else: Return floor(-log2(x / 256) * 256) + */ +static unsigned const kInverseProbabilityLog256[256] = { + 0, 2048, 1792, 1642, 1536, 1453, 1386, 1329, 1280, 1236, 1197, 1162, + 1130, 1100, 1073, 1047, 1024, 1001, 980, 960, 941, 923, 906, 889, + 874, 859, 844, 830, 817, 804, 791, 779, 768, 756, 745, 734, + 724, 714, 704, 694, 685, 676, 667, 658, 650, 642, 633, 626, + 618, 610, 603, 595, 588, 581, 574, 567, 561, 554, 548, 542, + 535, 529, 523, 517, 512, 506, 500, 495, 489, 484, 478, 473, + 468, 463, 458, 453, 448, 443, 438, 434, 429, 424, 420, 415, + 411, 407, 402, 398, 394, 390, 386, 382, 377, 373, 370, 366, + 362, 358, 354, 350, 347, 343, 339, 336, 332, 329, 325, 322, + 318, 315, 311, 308, 305, 302, 298, 295, 292, 289, 286, 282, + 279, 276, 273, 270, 267, 264, 261, 258, 256, 253, 250, 247, + 244, 241, 239, 236, 233, 230, 228, 225, 222, 220, 217, 215, + 212, 209, 207, 204, 202, 199, 197, 194, 192, 190, 187, 185, + 182, 180, 178, 175, 173, 171, 168, 166, 164, 162, 159, 157, + 155, 153, 151, 149, 146, 144, 142, 140, 138, 136, 134, 132, + 130, 128, 126, 123, 121, 119, 117, 115, 114, 112, 110, 108, + 106, 104, 102, 100, 98, 96, 94, 93, 91, 89, 87, 85, + 83, 82, 80, 78, 76, 74, 73, 71, 69, 67, 66, 64, + 62, 61, 59, 57, 55, 54, 52, 50, 49, 47, 46, 44, + 42, 41, 39, 37, 36, 34, 33, 31, 30, 28, 26, 25, + 23, 22, 20, 19, 17, 16, 14, 13, 11, 10, 8, 7, + 5, 4, 2, 1, +}; + +static unsigned ZSTD_getFSEMaxSymbolValue(FSE_CTable const* ctable) { + void const* ptr = ctable; + U16 const* u16ptr = (U16 const*)ptr; + U32 const maxSymbolValue = MEM_read16(u16ptr + 1); + return maxSymbolValue; +} + +/** + * Returns the cost in bytes of encoding the normalized count header. + * Returns an error if any of the helper functions return an error. + */ +static size_t ZSTD_NCountCost(unsigned const* count, unsigned const max, + size_t const nbSeq, unsigned const FSELog) +{ + BYTE wksp[FSE_NCOUNTBOUND]; + S16 norm[MaxSeq + 1]; + const U32 tableLog = FSE_optimalTableLog(FSELog, nbSeq, max); + FORWARD_IF_ERROR(FSE_normalizeCount(norm, tableLog, count, nbSeq, max)); + return FSE_writeNCount(wksp, sizeof(wksp), norm, max, tableLog); +} + +/** + * Returns the cost in bits of encoding the distribution described by count + * using the entropy bound. + */ +static size_t ZSTD_entropyCost(unsigned const* count, unsigned const max, size_t const total) +{ + unsigned cost = 0; + unsigned s; + for (s = 0; s <= max; ++s) { + unsigned norm = (unsigned)((256 * count[s]) / total); + if (count[s] != 0 && norm == 0) + norm = 1; + assert(count[s] < total); + cost += count[s] * kInverseProbabilityLog256[norm]; + } + return cost >> 8; +} + +/** + * Returns the cost in bits of encoding the distribution in count using ctable. + * Returns an error if ctable cannot represent all the symbols in count. + */ +static size_t ZSTD_fseBitCost( + FSE_CTable const* ctable, + unsigned const* count, + unsigned const max) +{ + unsigned const kAccuracyLog = 8; + size_t cost = 0; + unsigned s; + FSE_CState_t cstate; + FSE_initCState(&cstate, ctable); + RETURN_ERROR_IF(ZSTD_getFSEMaxSymbolValue(ctable) < max, GENERIC, + "Repeat FSE_CTable has maxSymbolValue %u < %u", + ZSTD_getFSEMaxSymbolValue(ctable), max); + for (s = 0; s <= max; ++s) { + unsigned const tableLog = cstate.stateLog; + unsigned const badCost = (tableLog + 1) << kAccuracyLog; + unsigned const bitCost = FSE_bitCost(cstate.symbolTT, tableLog, s, kAccuracyLog); + if (count[s] == 0) + continue; + RETURN_ERROR_IF(bitCost >= badCost, GENERIC, + "Repeat FSE_CTable has Prob[%u] == 0", s); + cost += count[s] * bitCost; + } + return cost >> kAccuracyLog; +} + +/** + * Returns the cost in bits of encoding the distribution in count using the + * table described by norm. The max symbol support by norm is assumed >= max. + * norm must be valid for every symbol with non-zero probability in count. + */ +static size_t ZSTD_crossEntropyCost(short const* norm, unsigned accuracyLog, + unsigned const* count, unsigned const max) +{ + unsigned const shift = 8 - accuracyLog; + size_t cost = 0; + unsigned s; + assert(accuracyLog <= 8); + for (s = 0; s <= max; ++s) { + unsigned const normAcc = norm[s] != -1 ? norm[s] : 1; + unsigned const norm256 = normAcc << shift; + assert(norm256 > 0); + assert(norm256 < 256); + cost += count[s] * kInverseProbabilityLog256[norm256]; + } + return cost >> 8; +} + +symbolEncodingType_e +ZSTD_selectEncodingType( + FSE_repeat* repeatMode, unsigned const* count, unsigned const max, + size_t const mostFrequent, size_t nbSeq, unsigned const FSELog, + FSE_CTable const* prevCTable, + short const* defaultNorm, U32 defaultNormLog, + ZSTD_defaultPolicy_e const isDefaultAllowed, + ZSTD_strategy const strategy) +{ + ZSTD_STATIC_ASSERT(ZSTD_defaultDisallowed == 0 && ZSTD_defaultAllowed != 0); + if (mostFrequent == nbSeq) { + *repeatMode = FSE_repeat_none; + if (isDefaultAllowed && nbSeq <= 2) { + /* Prefer set_basic over set_rle when there are 2 or less symbols, + * since RLE uses 1 byte, but set_basic uses 5-6 bits per symbol. + * If basic encoding isn't possible, always choose RLE. + */ + DEBUGLOG(5, "Selected set_basic"); + return set_basic; + } + DEBUGLOG(5, "Selected set_rle"); + return set_rle; + } + if (strategy < ZSTD_lazy) { + if (isDefaultAllowed) { + size_t const staticFse_nbSeq_max = 1000; + size_t const mult = 10 - strategy; + size_t const baseLog = 3; + size_t const dynamicFse_nbSeq_min = (((size_t)1 << defaultNormLog) * mult) >> baseLog; /* 28-36 for offset, 56-72 for lengths */ + assert(defaultNormLog >= 5 && defaultNormLog <= 6); /* xx_DEFAULTNORMLOG */ + assert(mult <= 9 && mult >= 7); + if ( (*repeatMode == FSE_repeat_valid) + && (nbSeq < staticFse_nbSeq_max) ) { + DEBUGLOG(5, "Selected set_repeat"); + return set_repeat; + } + if ( (nbSeq < dynamicFse_nbSeq_min) + || (mostFrequent < (nbSeq >> (defaultNormLog-1))) ) { + DEBUGLOG(5, "Selected set_basic"); + /* The format allows default tables to be repeated, but it isn't useful. + * When using simple heuristics to select encoding type, we don't want + * to confuse these tables with dictionaries. When running more careful + * analysis, we don't need to waste time checking both repeating tables + * and default tables. + */ + *repeatMode = FSE_repeat_none; + return set_basic; + } + } + } else { + size_t const basicCost = isDefaultAllowed ? ZSTD_crossEntropyCost(defaultNorm, defaultNormLog, count, max) : ERROR(GENERIC); + size_t const repeatCost = *repeatMode != FSE_repeat_none ? ZSTD_fseBitCost(prevCTable, count, max) : ERROR(GENERIC); + size_t const NCountCost = ZSTD_NCountCost(count, max, nbSeq, FSELog); + size_t const compressedCost = (NCountCost << 3) + ZSTD_entropyCost(count, max, nbSeq); + + if (isDefaultAllowed) { + assert(!ZSTD_isError(basicCost)); + assert(!(*repeatMode == FSE_repeat_valid && ZSTD_isError(repeatCost))); + } + assert(!ZSTD_isError(NCountCost)); + assert(compressedCost < ERROR(maxCode)); + DEBUGLOG(5, "Estimated bit costs: basic=%u\trepeat=%u\tcompressed=%u", + (unsigned)basicCost, (unsigned)repeatCost, (unsigned)compressedCost); + if (basicCost <= repeatCost && basicCost <= compressedCost) { + DEBUGLOG(5, "Selected set_basic"); + assert(isDefaultAllowed); + *repeatMode = FSE_repeat_none; + return set_basic; + } + if (repeatCost <= compressedCost) { + DEBUGLOG(5, "Selected set_repeat"); + assert(!ZSTD_isError(repeatCost)); + return set_repeat; + } + assert(compressedCost < basicCost && compressedCost < repeatCost); + } + DEBUGLOG(5, "Selected set_compressed"); + *repeatMode = FSE_repeat_check; + return set_compressed; +} + +size_t +ZSTD_buildCTable(void* dst, size_t dstCapacity, + FSE_CTable* nextCTable, U32 FSELog, symbolEncodingType_e type, + unsigned* count, U32 max, + const BYTE* codeTable, size_t nbSeq, + const S16* defaultNorm, U32 defaultNormLog, U32 defaultMax, + const FSE_CTable* prevCTable, size_t prevCTableSize, + void* entropyWorkspace, size_t entropyWorkspaceSize) +{ + BYTE* op = (BYTE*)dst; + const BYTE* const oend = op + dstCapacity; + DEBUGLOG(6, "ZSTD_buildCTable (dstCapacity=%u)", (unsigned)dstCapacity); + + switch (type) { + case set_rle: + FORWARD_IF_ERROR(FSE_buildCTable_rle(nextCTable, (BYTE)max)); + RETURN_ERROR_IF(dstCapacity==0, dstSize_tooSmall); + *op = codeTable[0]; + return 1; + case set_repeat: + memcpy(nextCTable, prevCTable, prevCTableSize); + return 0; + case set_basic: + FORWARD_IF_ERROR(FSE_buildCTable_wksp(nextCTable, defaultNorm, defaultMax, defaultNormLog, entropyWorkspace, entropyWorkspaceSize)); /* note : could be pre-calculated */ + return 0; + case set_compressed: { + S16 norm[MaxSeq + 1]; + size_t nbSeq_1 = nbSeq; + const U32 tableLog = FSE_optimalTableLog(FSELog, nbSeq, max); + if (count[codeTable[nbSeq-1]] > 1) { + count[codeTable[nbSeq-1]]--; + nbSeq_1--; + } + assert(nbSeq_1 > 1); + FORWARD_IF_ERROR(FSE_normalizeCount(norm, tableLog, count, nbSeq_1, max)); + { size_t const NCountSize = FSE_writeNCount(op, oend - op, norm, max, tableLog); /* overflow protected */ + FORWARD_IF_ERROR(NCountSize); + FORWARD_IF_ERROR(FSE_buildCTable_wksp(nextCTable, norm, max, tableLog, entropyWorkspace, entropyWorkspaceSize)); + return NCountSize; + } + } + default: assert(0); RETURN_ERROR(GENERIC); + } +} + +FORCE_INLINE_TEMPLATE size_t +ZSTD_encodeSequences_body( + void* dst, size_t dstCapacity, + FSE_CTable const* CTable_MatchLength, BYTE const* mlCodeTable, + FSE_CTable const* CTable_OffsetBits, BYTE const* ofCodeTable, + FSE_CTable const* CTable_LitLength, BYTE const* llCodeTable, + seqDef const* sequences, size_t nbSeq, int longOffsets) +{ + BIT_CStream_t blockStream; + FSE_CState_t stateMatchLength; + FSE_CState_t stateOffsetBits; + FSE_CState_t stateLitLength; + + RETURN_ERROR_IF( + ERR_isError(BIT_initCStream(&blockStream, dst, dstCapacity)), + dstSize_tooSmall, "not enough space remaining"); + DEBUGLOG(6, "available space for bitstream : %i (dstCapacity=%u)", + (int)(blockStream.endPtr - blockStream.startPtr), + (unsigned)dstCapacity); + + /* first symbols */ + FSE_initCState2(&stateMatchLength, CTable_MatchLength, mlCodeTable[nbSeq-1]); + FSE_initCState2(&stateOffsetBits, CTable_OffsetBits, ofCodeTable[nbSeq-1]); + FSE_initCState2(&stateLitLength, CTable_LitLength, llCodeTable[nbSeq-1]); + BIT_addBits(&blockStream, sequences[nbSeq-1].litLength, LL_bits[llCodeTable[nbSeq-1]]); + if (MEM_32bits()) BIT_flushBits(&blockStream); + BIT_addBits(&blockStream, sequences[nbSeq-1].matchLength, ML_bits[mlCodeTable[nbSeq-1]]); + if (MEM_32bits()) BIT_flushBits(&blockStream); + if (longOffsets) { + U32 const ofBits = ofCodeTable[nbSeq-1]; + int const extraBits = ofBits - MIN(ofBits, STREAM_ACCUMULATOR_MIN-1); + if (extraBits) { + BIT_addBits(&blockStream, sequences[nbSeq-1].offset, extraBits); + BIT_flushBits(&blockStream); + } + BIT_addBits(&blockStream, sequences[nbSeq-1].offset >> extraBits, + ofBits - extraBits); + } else { + BIT_addBits(&blockStream, sequences[nbSeq-1].offset, ofCodeTable[nbSeq-1]); + } + BIT_flushBits(&blockStream); + + { size_t n; + for (n=nbSeq-2 ; n<nbSeq ; n--) { /* intentional underflow */ + BYTE const llCode = llCodeTable[n]; + BYTE const ofCode = ofCodeTable[n]; + BYTE const mlCode = mlCodeTable[n]; + U32 const llBits = LL_bits[llCode]; + U32 const ofBits = ofCode; + U32 const mlBits = ML_bits[mlCode]; + DEBUGLOG(6, "encoding: litlen:%2u - matchlen:%2u - offCode:%7u", + (unsigned)sequences[n].litLength, + (unsigned)sequences[n].matchLength + MINMATCH, + (unsigned)sequences[n].offset); + /* 32b*/ /* 64b*/ + /* (7)*/ /* (7)*/ + FSE_encodeSymbol(&blockStream, &stateOffsetBits, ofCode); /* 15 */ /* 15 */ + FSE_encodeSymbol(&blockStream, &stateMatchLength, mlCode); /* 24 */ /* 24 */ + if (MEM_32bits()) BIT_flushBits(&blockStream); /* (7)*/ + FSE_encodeSymbol(&blockStream, &stateLitLength, llCode); /* 16 */ /* 33 */ + if (MEM_32bits() || (ofBits+mlBits+llBits >= 64-7-(LLFSELog+MLFSELog+OffFSELog))) + BIT_flushBits(&blockStream); /* (7)*/ + BIT_addBits(&blockStream, sequences[n].litLength, llBits); + if (MEM_32bits() && ((llBits+mlBits)>24)) BIT_flushBits(&blockStream); + BIT_addBits(&blockStream, sequences[n].matchLength, mlBits); + if (MEM_32bits() || (ofBits+mlBits+llBits > 56)) BIT_flushBits(&blockStream); + if (longOffsets) { + int const extraBits = ofBits - MIN(ofBits, STREAM_ACCUMULATOR_MIN-1); + if (extraBits) { + BIT_addBits(&blockStream, sequences[n].offset, extraBits); + BIT_flushBits(&blockStream); /* (7)*/ + } + BIT_addBits(&blockStream, sequences[n].offset >> extraBits, + ofBits - extraBits); /* 31 */ + } else { + BIT_addBits(&blockStream, sequences[n].offset, ofBits); /* 31 */ + } + BIT_flushBits(&blockStream); /* (7)*/ + DEBUGLOG(7, "remaining space : %i", (int)(blockStream.endPtr - blockStream.ptr)); + } } + + DEBUGLOG(6, "ZSTD_encodeSequences: flushing ML state with %u bits", stateMatchLength.stateLog); + FSE_flushCState(&blockStream, &stateMatchLength); + DEBUGLOG(6, "ZSTD_encodeSequences: flushing Off state with %u bits", stateOffsetBits.stateLog); + FSE_flushCState(&blockStream, &stateOffsetBits); + DEBUGLOG(6, "ZSTD_encodeSequences: flushing LL state with %u bits", stateLitLength.stateLog); + FSE_flushCState(&blockStream, &stateLitLength); + + { size_t const streamSize = BIT_closeCStream(&blockStream); + RETURN_ERROR_IF(streamSize==0, dstSize_tooSmall, "not enough space"); + return streamSize; + } +} + +static size_t +ZSTD_encodeSequences_default( + void* dst, size_t dstCapacity, + FSE_CTable const* CTable_MatchLength, BYTE const* mlCodeTable, + FSE_CTable const* CTable_OffsetBits, BYTE const* ofCodeTable, + FSE_CTable const* CTable_LitLength, BYTE const* llCodeTable, + seqDef const* sequences, size_t nbSeq, int longOffsets) +{ + return ZSTD_encodeSequences_body(dst, dstCapacity, + CTable_MatchLength, mlCodeTable, + CTable_OffsetBits, ofCodeTable, + CTable_LitLength, llCodeTable, + sequences, nbSeq, longOffsets); +} + + +#if DYNAMIC_BMI2 + +static TARGET_ATTRIBUTE("bmi2") size_t +ZSTD_encodeSequences_bmi2( + void* dst, size_t dstCapacity, + FSE_CTable const* CTable_MatchLength, BYTE const* mlCodeTable, + FSE_CTable const* CTable_OffsetBits, BYTE const* ofCodeTable, + FSE_CTable const* CTable_LitLength, BYTE const* llCodeTable, + seqDef const* sequences, size_t nbSeq, int longOffsets) +{ + return ZSTD_encodeSequences_body(dst, dstCapacity, + CTable_MatchLength, mlCodeTable, + CTable_OffsetBits, ofCodeTable, + CTable_LitLength, llCodeTable, + sequences, nbSeq, longOffsets); +} + +#endif + +size_t ZSTD_encodeSequences( + void* dst, size_t dstCapacity, + FSE_CTable const* CTable_MatchLength, BYTE const* mlCodeTable, + FSE_CTable const* CTable_OffsetBits, BYTE const* ofCodeTable, + FSE_CTable const* CTable_LitLength, BYTE const* llCodeTable, + seqDef const* sequences, size_t nbSeq, int longOffsets, int bmi2) +{ + DEBUGLOG(5, "ZSTD_encodeSequences: dstCapacity = %u", (unsigned)dstCapacity); +#if DYNAMIC_BMI2 + if (bmi2) { + return ZSTD_encodeSequences_bmi2(dst, dstCapacity, + CTable_MatchLength, mlCodeTable, + CTable_OffsetBits, ofCodeTable, + CTable_LitLength, llCodeTable, + sequences, nbSeq, longOffsets); + } +#endif + (void)bmi2; + return ZSTD_encodeSequences_default(dst, dstCapacity, + CTable_MatchLength, mlCodeTable, + CTable_OffsetBits, ofCodeTable, + CTable_LitLength, llCodeTable, + sequences, nbSeq, longOffsets); +} diff --git a/thirdparty/zstd/compress/zstd_compress_sequences.h b/thirdparty/zstd/compress/zstd_compress_sequences.h new file mode 100644 index 0000000000..57e8e367b0 --- /dev/null +++ b/thirdparty/zstd/compress/zstd_compress_sequences.h @@ -0,0 +1,47 @@ +/* + * Copyright (c) 2016-present, Yann Collet, Facebook, Inc. + * All rights reserved. + * + * This source code is licensed under both the BSD-style license (found in the + * LICENSE file in the root directory of this source tree) and the GPLv2 (found + * in the COPYING file in the root directory of this source tree). + * You may select, at your option, one of the above-listed licenses. + */ + +#ifndef ZSTD_COMPRESS_SEQUENCES_H +#define ZSTD_COMPRESS_SEQUENCES_H + +#include "fse.h" /* FSE_repeat, FSE_CTable */ +#include "zstd_internal.h" /* symbolEncodingType_e, ZSTD_strategy */ + +typedef enum { + ZSTD_defaultDisallowed = 0, + ZSTD_defaultAllowed = 1 +} ZSTD_defaultPolicy_e; + +symbolEncodingType_e +ZSTD_selectEncodingType( + FSE_repeat* repeatMode, unsigned const* count, unsigned const max, + size_t const mostFrequent, size_t nbSeq, unsigned const FSELog, + FSE_CTable const* prevCTable, + short const* defaultNorm, U32 defaultNormLog, + ZSTD_defaultPolicy_e const isDefaultAllowed, + ZSTD_strategy const strategy); + +size_t +ZSTD_buildCTable(void* dst, size_t dstCapacity, + FSE_CTable* nextCTable, U32 FSELog, symbolEncodingType_e type, + unsigned* count, U32 max, + const BYTE* codeTable, size_t nbSeq, + const S16* defaultNorm, U32 defaultNormLog, U32 defaultMax, + const FSE_CTable* prevCTable, size_t prevCTableSize, + void* entropyWorkspace, size_t entropyWorkspaceSize); + +size_t ZSTD_encodeSequences( + void* dst, size_t dstCapacity, + FSE_CTable const* CTable_MatchLength, BYTE const* mlCodeTable, + FSE_CTable const* CTable_OffsetBits, BYTE const* ofCodeTable, + FSE_CTable const* CTable_LitLength, BYTE const* llCodeTable, + seqDef const* sequences, size_t nbSeq, int longOffsets, int bmi2); + +#endif /* ZSTD_COMPRESS_SEQUENCES_H */ diff --git a/thirdparty/zstd/compress/zstd_cwksp.h b/thirdparty/zstd/compress/zstd_cwksp.h new file mode 100644 index 0000000000..fc9765bd3f --- /dev/null +++ b/thirdparty/zstd/compress/zstd_cwksp.h @@ -0,0 +1,535 @@ +/* + * Copyright (c) 2016-present, Yann Collet, Facebook, Inc. + * All rights reserved. + * + * This source code is licensed under both the BSD-style license (found in the + * LICENSE file in the root directory of this source tree) and the GPLv2 (found + * in the COPYING file in the root directory of this source tree). + * You may select, at your option, one of the above-listed licenses. + */ + +#ifndef ZSTD_CWKSP_H +#define ZSTD_CWKSP_H + +/*-************************************* +* Dependencies +***************************************/ +#include "zstd_internal.h" + +#if defined (__cplusplus) +extern "C" { +#endif + +/*-************************************* +* Constants +***************************************/ + +/* define "workspace is too large" as this number of times larger than needed */ +#define ZSTD_WORKSPACETOOLARGE_FACTOR 3 + +/* when workspace is continuously too large + * during at least this number of times, + * context's memory usage is considered wasteful, + * because it's sized to handle a worst case scenario which rarely happens. + * In which case, resize it down to free some memory */ +#define ZSTD_WORKSPACETOOLARGE_MAXDURATION 128 + +/* Since the workspace is effectively its own little malloc implementation / + * arena, when we run under ASAN, we should similarly insert redzones between + * each internal element of the workspace, so ASAN will catch overruns that + * reach outside an object but that stay inside the workspace. + * + * This defines the size of that redzone. + */ +#ifndef ZSTD_CWKSP_ASAN_REDZONE_SIZE +#define ZSTD_CWKSP_ASAN_REDZONE_SIZE 128 +#endif + +/*-************************************* +* Structures +***************************************/ +typedef enum { + ZSTD_cwksp_alloc_objects, + ZSTD_cwksp_alloc_buffers, + ZSTD_cwksp_alloc_aligned +} ZSTD_cwksp_alloc_phase_e; + +/** + * Zstd fits all its internal datastructures into a single continuous buffer, + * so that it only needs to perform a single OS allocation (or so that a buffer + * can be provided to it and it can perform no allocations at all). This buffer + * is called the workspace. + * + * Several optimizations complicate that process of allocating memory ranges + * from this workspace for each internal datastructure: + * + * - These different internal datastructures have different setup requirements: + * + * - The static objects need to be cleared once and can then be trivially + * reused for each compression. + * + * - Various buffers don't need to be initialized at all--they are always + * written into before they're read. + * + * - The matchstate tables have a unique requirement that they don't need + * their memory to be totally cleared, but they do need the memory to have + * some bound, i.e., a guarantee that all values in the memory they've been + * allocated is less than some maximum value (which is the starting value + * for the indices that they will then use for compression). When this + * guarantee is provided to them, they can use the memory without any setup + * work. When it can't, they have to clear the area. + * + * - These buffers also have different alignment requirements. + * + * - We would like to reuse the objects in the workspace for multiple + * compressions without having to perform any expensive reallocation or + * reinitialization work. + * + * - We would like to be able to efficiently reuse the workspace across + * multiple compressions **even when the compression parameters change** and + * we need to resize some of the objects (where possible). + * + * To attempt to manage this buffer, given these constraints, the ZSTD_cwksp + * abstraction was created. It works as follows: + * + * Workspace Layout: + * + * [ ... workspace ... ] + * [objects][tables ... ->] free space [<- ... aligned][<- ... buffers] + * + * The various objects that live in the workspace are divided into the + * following categories, and are allocated separately: + * + * - Static objects: this is optionally the enclosing ZSTD_CCtx or ZSTD_CDict, + * so that literally everything fits in a single buffer. Note: if present, + * this must be the first object in the workspace, since ZSTD_free{CCtx, + * CDict}() rely on a pointer comparison to see whether one or two frees are + * required. + * + * - Fixed size objects: these are fixed-size, fixed-count objects that are + * nonetheless "dynamically" allocated in the workspace so that we can + * control how they're initialized separately from the broader ZSTD_CCtx. + * Examples: + * - Entropy Workspace + * - 2 x ZSTD_compressedBlockState_t + * - CDict dictionary contents + * + * - Tables: these are any of several different datastructures (hash tables, + * chain tables, binary trees) that all respect a common format: they are + * uint32_t arrays, all of whose values are between 0 and (nextSrc - base). + * Their sizes depend on the cparams. + * + * - Aligned: these buffers are used for various purposes that require 4 byte + * alignment, but don't require any initialization before they're used. + * + * - Buffers: these buffers are used for various purposes that don't require + * any alignment or initialization before they're used. This means they can + * be moved around at no cost for a new compression. + * + * Allocating Memory: + * + * The various types of objects must be allocated in order, so they can be + * correctly packed into the workspace buffer. That order is: + * + * 1. Objects + * 2. Buffers + * 3. Aligned + * 4. Tables + * + * Attempts to reserve objects of different types out of order will fail. + */ +typedef struct { + void* workspace; + void* workspaceEnd; + + void* objectEnd; + void* tableEnd; + void* tableValidEnd; + void* allocStart; + + int allocFailed; + int workspaceOversizedDuration; + ZSTD_cwksp_alloc_phase_e phase; +} ZSTD_cwksp; + +/*-************************************* +* Functions +***************************************/ + +MEM_STATIC size_t ZSTD_cwksp_available_space(ZSTD_cwksp* ws); + +MEM_STATIC void ZSTD_cwksp_assert_internal_consistency(ZSTD_cwksp* ws) { + (void)ws; + assert(ws->workspace <= ws->objectEnd); + assert(ws->objectEnd <= ws->tableEnd); + assert(ws->objectEnd <= ws->tableValidEnd); + assert(ws->tableEnd <= ws->allocStart); + assert(ws->tableValidEnd <= ws->allocStart); + assert(ws->allocStart <= ws->workspaceEnd); +} + +/** + * Align must be a power of 2. + */ +MEM_STATIC size_t ZSTD_cwksp_align(size_t size, size_t const align) { + size_t const mask = align - 1; + assert((align & mask) == 0); + return (size + mask) & ~mask; +} + +/** + * Use this to determine how much space in the workspace we will consume to + * allocate this object. (Normally it should be exactly the size of the object, + * but under special conditions, like ASAN, where we pad each object, it might + * be larger.) + * + * Since tables aren't currently redzoned, you don't need to call through this + * to figure out how much space you need for the matchState tables. Everything + * else is though. + */ +MEM_STATIC size_t ZSTD_cwksp_alloc_size(size_t size) { +#if defined (ADDRESS_SANITIZER) && !defined (ZSTD_ASAN_DONT_POISON_WORKSPACE) + return size + 2 * ZSTD_CWKSP_ASAN_REDZONE_SIZE; +#else + return size; +#endif +} + +MEM_STATIC void ZSTD_cwksp_internal_advance_phase( + ZSTD_cwksp* ws, ZSTD_cwksp_alloc_phase_e phase) { + assert(phase >= ws->phase); + if (phase > ws->phase) { + if (ws->phase < ZSTD_cwksp_alloc_buffers && + phase >= ZSTD_cwksp_alloc_buffers) { + ws->tableValidEnd = ws->objectEnd; + } + if (ws->phase < ZSTD_cwksp_alloc_aligned && + phase >= ZSTD_cwksp_alloc_aligned) { + /* If unaligned allocations down from a too-large top have left us + * unaligned, we need to realign our alloc ptr. Technically, this + * can consume space that is unaccounted for in the neededSpace + * calculation. However, I believe this can only happen when the + * workspace is too large, and specifically when it is too large + * by a larger margin than the space that will be consumed. */ + /* TODO: cleaner, compiler warning friendly way to do this??? */ + ws->allocStart = (BYTE*)ws->allocStart - ((size_t)ws->allocStart & (sizeof(U32)-1)); + if (ws->allocStart < ws->tableValidEnd) { + ws->tableValidEnd = ws->allocStart; + } + } + ws->phase = phase; + } +} + +/** + * Returns whether this object/buffer/etc was allocated in this workspace. + */ +MEM_STATIC int ZSTD_cwksp_owns_buffer(const ZSTD_cwksp* ws, const void* ptr) { + return (ptr != NULL) && (ws->workspace <= ptr) && (ptr <= ws->workspaceEnd); +} + +/** + * Internal function. Do not use directly. + */ +MEM_STATIC void* ZSTD_cwksp_reserve_internal( + ZSTD_cwksp* ws, size_t bytes, ZSTD_cwksp_alloc_phase_e phase) { + void* alloc; + void* bottom = ws->tableEnd; + ZSTD_cwksp_internal_advance_phase(ws, phase); + alloc = (BYTE *)ws->allocStart - bytes; + +#if defined (ADDRESS_SANITIZER) && !defined (ZSTD_ASAN_DONT_POISON_WORKSPACE) + /* over-reserve space */ + alloc = (BYTE *)alloc - 2 * ZSTD_CWKSP_ASAN_REDZONE_SIZE; +#endif + + DEBUGLOG(5, "cwksp: reserving %p %zd bytes, %zd bytes remaining", + alloc, bytes, ZSTD_cwksp_available_space(ws) - bytes); + ZSTD_cwksp_assert_internal_consistency(ws); + assert(alloc >= bottom); + if (alloc < bottom) { + DEBUGLOG(4, "cwksp: alloc failed!"); + ws->allocFailed = 1; + return NULL; + } + if (alloc < ws->tableValidEnd) { + ws->tableValidEnd = alloc; + } + ws->allocStart = alloc; + +#if defined (ADDRESS_SANITIZER) && !defined (ZSTD_ASAN_DONT_POISON_WORKSPACE) + /* Move alloc so there's ZSTD_CWKSP_ASAN_REDZONE_SIZE unused space on + * either size. */ + alloc = (BYTE *)alloc + ZSTD_CWKSP_ASAN_REDZONE_SIZE; + __asan_unpoison_memory_region(alloc, bytes); +#endif + + return alloc; +} + +/** + * Reserves and returns unaligned memory. + */ +MEM_STATIC BYTE* ZSTD_cwksp_reserve_buffer(ZSTD_cwksp* ws, size_t bytes) { + return (BYTE*)ZSTD_cwksp_reserve_internal(ws, bytes, ZSTD_cwksp_alloc_buffers); +} + +/** + * Reserves and returns memory sized on and aligned on sizeof(unsigned). + */ +MEM_STATIC void* ZSTD_cwksp_reserve_aligned(ZSTD_cwksp* ws, size_t bytes) { + assert((bytes & (sizeof(U32)-1)) == 0); + return ZSTD_cwksp_reserve_internal(ws, ZSTD_cwksp_align(bytes, sizeof(U32)), ZSTD_cwksp_alloc_aligned); +} + +/** + * Aligned on sizeof(unsigned). These buffers have the special property that + * their values remain constrained, allowing us to re-use them without + * memset()-ing them. + */ +MEM_STATIC void* ZSTD_cwksp_reserve_table(ZSTD_cwksp* ws, size_t bytes) { + const ZSTD_cwksp_alloc_phase_e phase = ZSTD_cwksp_alloc_aligned; + void* alloc = ws->tableEnd; + void* end = (BYTE *)alloc + bytes; + void* top = ws->allocStart; + + DEBUGLOG(5, "cwksp: reserving %p table %zd bytes, %zd bytes remaining", + alloc, bytes, ZSTD_cwksp_available_space(ws) - bytes); + assert((bytes & (sizeof(U32)-1)) == 0); + ZSTD_cwksp_internal_advance_phase(ws, phase); + ZSTD_cwksp_assert_internal_consistency(ws); + assert(end <= top); + if (end > top) { + DEBUGLOG(4, "cwksp: table alloc failed!"); + ws->allocFailed = 1; + return NULL; + } + ws->tableEnd = end; + +#if defined (ADDRESS_SANITIZER) && !defined (ZSTD_ASAN_DONT_POISON_WORKSPACE) + __asan_unpoison_memory_region(alloc, bytes); +#endif + + return alloc; +} + +/** + * Aligned on sizeof(void*). + */ +MEM_STATIC void* ZSTD_cwksp_reserve_object(ZSTD_cwksp* ws, size_t bytes) { + size_t roundedBytes = ZSTD_cwksp_align(bytes, sizeof(void*)); + void* alloc = ws->objectEnd; + void* end = (BYTE*)alloc + roundedBytes; + +#if defined (ADDRESS_SANITIZER) && !defined (ZSTD_ASAN_DONT_POISON_WORKSPACE) + /* over-reserve space */ + end = (BYTE *)end + 2 * ZSTD_CWKSP_ASAN_REDZONE_SIZE; +#endif + + DEBUGLOG(5, + "cwksp: reserving %p object %zd bytes (rounded to %zd), %zd bytes remaining", + alloc, bytes, roundedBytes, ZSTD_cwksp_available_space(ws) - roundedBytes); + assert(((size_t)alloc & (sizeof(void*)-1)) == 0); + assert((bytes & (sizeof(void*)-1)) == 0); + ZSTD_cwksp_assert_internal_consistency(ws); + /* we must be in the first phase, no advance is possible */ + if (ws->phase != ZSTD_cwksp_alloc_objects || end > ws->workspaceEnd) { + DEBUGLOG(4, "cwksp: object alloc failed!"); + ws->allocFailed = 1; + return NULL; + } + ws->objectEnd = end; + ws->tableEnd = end; + ws->tableValidEnd = end; + +#if defined (ADDRESS_SANITIZER) && !defined (ZSTD_ASAN_DONT_POISON_WORKSPACE) + /* Move alloc so there's ZSTD_CWKSP_ASAN_REDZONE_SIZE unused space on + * either size. */ + alloc = (BYTE *)alloc + ZSTD_CWKSP_ASAN_REDZONE_SIZE; + __asan_unpoison_memory_region(alloc, bytes); +#endif + + return alloc; +} + +MEM_STATIC void ZSTD_cwksp_mark_tables_dirty(ZSTD_cwksp* ws) { + DEBUGLOG(4, "cwksp: ZSTD_cwksp_mark_tables_dirty"); + +#if defined (MEMORY_SANITIZER) && !defined (ZSTD_MSAN_DONT_POISON_WORKSPACE) + /* To validate that the table re-use logic is sound, and that we don't + * access table space that we haven't cleaned, we re-"poison" the table + * space every time we mark it dirty. */ + { + size_t size = (BYTE*)ws->tableValidEnd - (BYTE*)ws->objectEnd; + assert(__msan_test_shadow(ws->objectEnd, size) == -1); + __msan_poison(ws->objectEnd, size); + } +#endif + + assert(ws->tableValidEnd >= ws->objectEnd); + assert(ws->tableValidEnd <= ws->allocStart); + ws->tableValidEnd = ws->objectEnd; + ZSTD_cwksp_assert_internal_consistency(ws); +} + +MEM_STATIC void ZSTD_cwksp_mark_tables_clean(ZSTD_cwksp* ws) { + DEBUGLOG(4, "cwksp: ZSTD_cwksp_mark_tables_clean"); + assert(ws->tableValidEnd >= ws->objectEnd); + assert(ws->tableValidEnd <= ws->allocStart); + if (ws->tableValidEnd < ws->tableEnd) { + ws->tableValidEnd = ws->tableEnd; + } + ZSTD_cwksp_assert_internal_consistency(ws); +} + +/** + * Zero the part of the allocated tables not already marked clean. + */ +MEM_STATIC void ZSTD_cwksp_clean_tables(ZSTD_cwksp* ws) { + DEBUGLOG(4, "cwksp: ZSTD_cwksp_clean_tables"); + assert(ws->tableValidEnd >= ws->objectEnd); + assert(ws->tableValidEnd <= ws->allocStart); + if (ws->tableValidEnd < ws->tableEnd) { + memset(ws->tableValidEnd, 0, (BYTE*)ws->tableEnd - (BYTE*)ws->tableValidEnd); + } + ZSTD_cwksp_mark_tables_clean(ws); +} + +/** + * Invalidates table allocations. + * All other allocations remain valid. + */ +MEM_STATIC void ZSTD_cwksp_clear_tables(ZSTD_cwksp* ws) { + DEBUGLOG(4, "cwksp: clearing tables!"); + +#if defined (ADDRESS_SANITIZER) && !defined (ZSTD_ASAN_DONT_POISON_WORKSPACE) + { + size_t size = (BYTE*)ws->tableValidEnd - (BYTE*)ws->objectEnd; + __asan_poison_memory_region(ws->objectEnd, size); + } +#endif + + ws->tableEnd = ws->objectEnd; + ZSTD_cwksp_assert_internal_consistency(ws); +} + +/** + * Invalidates all buffer, aligned, and table allocations. + * Object allocations remain valid. + */ +MEM_STATIC void ZSTD_cwksp_clear(ZSTD_cwksp* ws) { + DEBUGLOG(4, "cwksp: clearing!"); + +#if defined (MEMORY_SANITIZER) && !defined (ZSTD_MSAN_DONT_POISON_WORKSPACE) + /* To validate that the context re-use logic is sound, and that we don't + * access stuff that this compression hasn't initialized, we re-"poison" + * the workspace (or at least the non-static, non-table parts of it) + * every time we start a new compression. */ + { + size_t size = (BYTE*)ws->workspaceEnd - (BYTE*)ws->tableValidEnd; + __msan_poison(ws->tableValidEnd, size); + } +#endif + +#if defined (ADDRESS_SANITIZER) && !defined (ZSTD_ASAN_DONT_POISON_WORKSPACE) + { + size_t size = (BYTE*)ws->workspaceEnd - (BYTE*)ws->objectEnd; + __asan_poison_memory_region(ws->objectEnd, size); + } +#endif + + ws->tableEnd = ws->objectEnd; + ws->allocStart = ws->workspaceEnd; + ws->allocFailed = 0; + if (ws->phase > ZSTD_cwksp_alloc_buffers) { + ws->phase = ZSTD_cwksp_alloc_buffers; + } + ZSTD_cwksp_assert_internal_consistency(ws); +} + +/** + * The provided workspace takes ownership of the buffer [start, start+size). + * Any existing values in the workspace are ignored (the previously managed + * buffer, if present, must be separately freed). + */ +MEM_STATIC void ZSTD_cwksp_init(ZSTD_cwksp* ws, void* start, size_t size) { + DEBUGLOG(4, "cwksp: init'ing workspace with %zd bytes", size); + assert(((size_t)start & (sizeof(void*)-1)) == 0); /* ensure correct alignment */ + ws->workspace = start; + ws->workspaceEnd = (BYTE*)start + size; + ws->objectEnd = ws->workspace; + ws->tableValidEnd = ws->objectEnd; + ws->phase = ZSTD_cwksp_alloc_objects; + ZSTD_cwksp_clear(ws); + ws->workspaceOversizedDuration = 0; + ZSTD_cwksp_assert_internal_consistency(ws); +} + +MEM_STATIC size_t ZSTD_cwksp_create(ZSTD_cwksp* ws, size_t size, ZSTD_customMem customMem) { + void* workspace = ZSTD_malloc(size, customMem); + DEBUGLOG(4, "cwksp: creating new workspace with %zd bytes", size); + RETURN_ERROR_IF(workspace == NULL, memory_allocation); + ZSTD_cwksp_init(ws, workspace, size); + return 0; +} + +MEM_STATIC void ZSTD_cwksp_free(ZSTD_cwksp* ws, ZSTD_customMem customMem) { + void *ptr = ws->workspace; + DEBUGLOG(4, "cwksp: freeing workspace"); + memset(ws, 0, sizeof(ZSTD_cwksp)); + ZSTD_free(ptr, customMem); +} + +/** + * Moves the management of a workspace from one cwksp to another. The src cwksp + * is left in an invalid state (src must be re-init()'ed before its used again). + */ +MEM_STATIC void ZSTD_cwksp_move(ZSTD_cwksp* dst, ZSTD_cwksp* src) { + *dst = *src; + memset(src, 0, sizeof(ZSTD_cwksp)); +} + +MEM_STATIC size_t ZSTD_cwksp_sizeof(const ZSTD_cwksp* ws) { + return (size_t)((BYTE*)ws->workspaceEnd - (BYTE*)ws->workspace); +} + +MEM_STATIC int ZSTD_cwksp_reserve_failed(const ZSTD_cwksp* ws) { + return ws->allocFailed; +} + +/*-************************************* +* Functions Checking Free Space +***************************************/ + +MEM_STATIC size_t ZSTD_cwksp_available_space(ZSTD_cwksp* ws) { + return (size_t)((BYTE*)ws->allocStart - (BYTE*)ws->tableEnd); +} + +MEM_STATIC int ZSTD_cwksp_check_available(ZSTD_cwksp* ws, size_t additionalNeededSpace) { + return ZSTD_cwksp_available_space(ws) >= additionalNeededSpace; +} + +MEM_STATIC int ZSTD_cwksp_check_too_large(ZSTD_cwksp* ws, size_t additionalNeededSpace) { + return ZSTD_cwksp_check_available( + ws, additionalNeededSpace * ZSTD_WORKSPACETOOLARGE_FACTOR); +} + +MEM_STATIC int ZSTD_cwksp_check_wasteful(ZSTD_cwksp* ws, size_t additionalNeededSpace) { + return ZSTD_cwksp_check_too_large(ws, additionalNeededSpace) + && ws->workspaceOversizedDuration > ZSTD_WORKSPACETOOLARGE_MAXDURATION; +} + +MEM_STATIC void ZSTD_cwksp_bump_oversized_duration( + ZSTD_cwksp* ws, size_t additionalNeededSpace) { + if (ZSTD_cwksp_check_too_large(ws, additionalNeededSpace)) { + ws->workspaceOversizedDuration++; + } else { + ws->workspaceOversizedDuration = 0; + } +} + +#if defined (__cplusplus) +} +#endif + +#endif /* ZSTD_CWKSP_H */ diff --git a/thirdparty/zstd/compress/zstd_double_fast.c b/thirdparty/zstd/compress/zstd_double_fast.c index 5957255d90..a661a48534 100644 --- a/thirdparty/zstd/compress/zstd_double_fast.c +++ b/thirdparty/zstd/compress/zstd_double_fast.c @@ -65,6 +65,7 @@ size_t ZSTD_compressBlock_doubleFast_generic( const U32 endIndex = (U32)((size_t)(istart - base) + srcSize); const U32 lowestValid = ms->window.dictLimit; const U32 maxDistance = 1U << cParams->windowLog; + /* presumes that, if there is a dictionary, it must be using Attach mode */ const U32 prefixLowestIndex = (endIndex - lowestValid > maxDistance) ? endIndex - maxDistance : lowestValid; const BYTE* const prefixLowest = base + prefixLowestIndex; const BYTE* const iend = istart + srcSize; @@ -147,7 +148,7 @@ size_t ZSTD_compressBlock_doubleFast_generic( const BYTE* repMatchEnd = repIndex < prefixLowestIndex ? dictEnd : iend; mLength = ZSTD_count_2segments(ip+1+4, repMatch+4, iend, repMatchEnd, prefixLowest) + 4; ip++; - ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, 0, mLength-MINMATCH); + ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, iend, 0, mLength-MINMATCH); goto _match_stored; } @@ -156,7 +157,7 @@ size_t ZSTD_compressBlock_doubleFast_generic( && ((offset_1 > 0) & (MEM_read32(ip+1-offset_1) == MEM_read32(ip+1)))) { mLength = ZSTD_count(ip+1+4, ip+1+4-offset_1, iend) + 4; ip++; - ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, 0, mLength-MINMATCH); + ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, iend, 0, mLength-MINMATCH); goto _match_stored; } @@ -246,7 +247,7 @@ _match_found: offset_2 = offset_1; offset_1 = offset; - ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, offset + ZSTD_REP_MOVE, mLength-MINMATCH); + ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, iend, offset + ZSTD_REP_MOVE, mLength-MINMATCH); _match_stored: /* match found */ @@ -277,7 +278,7 @@ _match_stored: const BYTE* const repEnd2 = repIndex2 < prefixLowestIndex ? dictEnd : iend; size_t const repLength2 = ZSTD_count_2segments(ip+4, repMatch2+4, iend, repEnd2, prefixLowest) + 4; U32 tmpOffset = offset_2; offset_2 = offset_1; offset_1 = tmpOffset; /* swap offset_2 <=> offset_1 */ - ZSTD_storeSeq(seqStore, 0, anchor, 0, repLength2-MINMATCH); + ZSTD_storeSeq(seqStore, 0, anchor, iend, 0, repLength2-MINMATCH); hashSmall[ZSTD_hashPtr(ip, hBitsS, mls)] = current2; hashLong[ZSTD_hashPtr(ip, hBitsL, 8)] = current2; ip += repLength2; @@ -296,7 +297,7 @@ _match_stored: U32 const tmpOff = offset_2; offset_2 = offset_1; offset_1 = tmpOff; /* swap offset_2 <=> offset_1 */ hashSmall[ZSTD_hashPtr(ip, hBitsS, mls)] = (U32)(ip-base); hashLong[ZSTD_hashPtr(ip, hBitsL, 8)] = (U32)(ip-base); - ZSTD_storeSeq(seqStore, 0, anchor, 0, rLength-MINMATCH); + ZSTD_storeSeq(seqStore, 0, anchor, iend, 0, rLength-MINMATCH); ip += rLength; anchor = ip; continue; /* faster when present ... (?) */ @@ -369,9 +370,7 @@ static size_t ZSTD_compressBlock_doubleFast_extDict_generic( const BYTE* const ilimit = iend - 8; const BYTE* const base = ms->window.base; const U32 endIndex = (U32)((size_t)(istart - base) + srcSize); - const U32 maxDistance = 1U << cParams->windowLog; - const U32 lowestValid = ms->window.lowLimit; - const U32 lowLimit = (endIndex - lowestValid > maxDistance) ? endIndex - maxDistance : lowestValid; + const U32 lowLimit = ZSTD_getLowestMatchIndex(ms, endIndex, cParams->windowLog); const U32 dictStartIndex = lowLimit; const U32 dictLimit = ms->window.dictLimit; const U32 prefixStartIndex = (dictLimit > lowLimit) ? dictLimit : lowLimit; @@ -412,7 +411,7 @@ static size_t ZSTD_compressBlock_doubleFast_extDict_generic( const BYTE* repMatchEnd = repIndex < prefixStartIndex ? dictEnd : iend; mLength = ZSTD_count_2segments(ip+1+4, repMatch+4, iend, repMatchEnd, prefixStart) + 4; ip++; - ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, 0, mLength-MINMATCH); + ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, iend, 0, mLength-MINMATCH); } else { if ((matchLongIndex > dictStartIndex) && (MEM_read64(matchLong) == MEM_read64(ip))) { const BYTE* const matchEnd = matchLongIndex < prefixStartIndex ? dictEnd : iend; @@ -423,7 +422,7 @@ static size_t ZSTD_compressBlock_doubleFast_extDict_generic( while (((ip>anchor) & (matchLong>lowMatchPtr)) && (ip[-1] == matchLong[-1])) { ip--; matchLong--; mLength++; } /* catch up */ offset_2 = offset_1; offset_1 = offset; - ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, offset + ZSTD_REP_MOVE, mLength-MINMATCH); + ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, iend, offset + ZSTD_REP_MOVE, mLength-MINMATCH); } else if ((matchIndex > dictStartIndex) && (MEM_read32(match) == MEM_read32(ip))) { size_t const h3 = ZSTD_hashPtr(ip+1, hBitsL, 8); @@ -448,7 +447,7 @@ static size_t ZSTD_compressBlock_doubleFast_extDict_generic( } offset_2 = offset_1; offset_1 = offset; - ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, offset + ZSTD_REP_MOVE, mLength-MINMATCH); + ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, iend, offset + ZSTD_REP_MOVE, mLength-MINMATCH); } else { ip += ((ip-anchor) >> kSearchStrength) + 1; @@ -480,7 +479,7 @@ static size_t ZSTD_compressBlock_doubleFast_extDict_generic( const BYTE* const repEnd2 = repIndex2 < prefixStartIndex ? dictEnd : iend; size_t const repLength2 = ZSTD_count_2segments(ip+4, repMatch2+4, iend, repEnd2, prefixStart) + 4; U32 const tmpOffset = offset_2; offset_2 = offset_1; offset_1 = tmpOffset; /* swap offset_2 <=> offset_1 */ - ZSTD_storeSeq(seqStore, 0, anchor, 0, repLength2-MINMATCH); + ZSTD_storeSeq(seqStore, 0, anchor, iend, 0, repLength2-MINMATCH); hashSmall[ZSTD_hashPtr(ip, hBitsS, mls)] = current2; hashLong[ZSTD_hashPtr(ip, hBitsL, 8)] = current2; ip += repLength2; diff --git a/thirdparty/zstd/compress/zstd_fast.c b/thirdparty/zstd/compress/zstd_fast.c index a05b8a47f1..6dbefee6b7 100644 --- a/thirdparty/zstd/compress/zstd_fast.c +++ b/thirdparty/zstd/compress/zstd_fast.c @@ -8,7 +8,7 @@ * You may select, at your option, one of the above-listed licenses. */ -#include "zstd_compress_internal.h" +#include "zstd_compress_internal.h" /* ZSTD_hashPtr, ZSTD_count, ZSTD_storeSeq */ #include "zstd_fast.h" @@ -43,8 +43,8 @@ void ZSTD_fillHashTable(ZSTD_matchState_t* ms, } -FORCE_INLINE_TEMPLATE -size_t ZSTD_compressBlock_fast_generic( +FORCE_INLINE_TEMPLATE size_t +ZSTD_compressBlock_fast_generic( ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], void const* src, size_t srcSize, U32 const mls) @@ -71,10 +71,10 @@ size_t ZSTD_compressBlock_fast_generic( U32 offsetSaved = 0; /* init */ + DEBUGLOG(5, "ZSTD_compressBlock_fast_generic"); ip0 += (ip0 == prefixStart); ip1 = ip0 + 1; - { - U32 const maxRep = (U32)(ip0 - prefixStart); + { U32 const maxRep = (U32)(ip0 - prefixStart); if (offset_2 > maxRep) offsetSaved = offset_2, offset_2 = 0; if (offset_1 > maxRep) offsetSaved = offset_1, offset_1 = 0; } @@ -117,8 +117,7 @@ size_t ZSTD_compressBlock_fast_generic( match0 = match1; goto _offset; } - { - size_t const step = ((ip0-anchor) >> (kSearchStrength - 1)) + stepSize; + { size_t const step = ((size_t)(ip0-anchor) >> (kSearchStrength - 1)) + stepSize; assert(step >= 2); ip0 += step; ip1 += step; @@ -137,7 +136,7 @@ _offset: /* Requires: ip0, match0 */ _match: /* Requires: ip0, match0, offcode */ /* Count the forward length */ mLength += ZSTD_count(ip0+mLength+4, match0+mLength+4, iend) + 4; - ZSTD_storeSeq(seqStore, ip0-anchor, anchor, offcode, mLength-MINMATCH); + ZSTD_storeSeq(seqStore, (size_t)(ip0-anchor), anchor, iend, offcode, mLength-MINMATCH); /* match found */ ip0 += mLength; anchor = ip0; @@ -149,16 +148,15 @@ _match: /* Requires: ip0, match0, offcode */ hashTable[ZSTD_hashPtr(base+current0+2, hlog, mls)] = current0+2; /* here because current+2 could be > iend-8 */ hashTable[ZSTD_hashPtr(ip0-2, hlog, mls)] = (U32)(ip0-2-base); - while ( (ip0 <= ilimit) - && ( (offset_2>0) - & (MEM_read32(ip0) == MEM_read32(ip0 - offset_2)) )) { + while ( ((ip0 <= ilimit) & (offset_2>0)) /* offset_2==0 means offset_2 is invalidated */ + && (MEM_read32(ip0) == MEM_read32(ip0 - offset_2)) ) { /* store sequence */ size_t const rLength = ZSTD_count(ip0+4, ip0+4-offset_2, iend) + 4; - U32 const tmpOff = offset_2; offset_2 = offset_1; offset_1 = tmpOff; /* swap offset_2 <=> offset_1 */ + { U32 const tmpOff = offset_2; offset_2 = offset_1; offset_1 = tmpOff; } /* swap offset_2 <=> offset_1 */ hashTable[ZSTD_hashPtr(ip0, hlog, mls)] = (U32)(ip0-base); ip0 += rLength; ip1 = ip0 + 1; - ZSTD_storeSeq(seqStore, 0, anchor, 0, rLength-MINMATCH); + ZSTD_storeSeq(seqStore, 0 /*litLen*/, anchor, iend, 0 /*offCode*/, rLength-MINMATCH); anchor = ip0; continue; /* faster when present (confirmed on gcc-8) ... (?) */ } @@ -178,8 +176,7 @@ size_t ZSTD_compressBlock_fast( ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], void const* src, size_t srcSize) { - ZSTD_compressionParameters const* cParams = &ms->cParams; - U32 const mls = cParams->minMatch; + U32 const mls = ms->cParams.minMatch; assert(ms->dictMatchState == NULL); switch(mls) { @@ -239,6 +236,7 @@ size_t ZSTD_compressBlock_fast_dictMatchState_generic( assert(prefixStartIndex >= (U32)(dictEnd - dictBase)); /* init */ + DEBUGLOG(5, "ZSTD_compressBlock_fast_dictMatchState_generic"); ip += (dictAndPrefixLength == 0); /* dictMatchState repCode checks don't currently handle repCode == 0 * disabling. */ @@ -263,7 +261,7 @@ size_t ZSTD_compressBlock_fast_dictMatchState_generic( const BYTE* const repMatchEnd = repIndex < prefixStartIndex ? dictEnd : iend; mLength = ZSTD_count_2segments(ip+1+4, repMatch+4, iend, repMatchEnd, prefixStart) + 4; ip++; - ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, 0, mLength-MINMATCH); + ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, iend, 0, mLength-MINMATCH); } else if ( (matchIndex <= prefixStartIndex) ) { size_t const dictHash = ZSTD_hashPtr(ip, dictHLog, mls); U32 const dictMatchIndex = dictHashTable[dictHash]; @@ -283,7 +281,7 @@ size_t ZSTD_compressBlock_fast_dictMatchState_generic( } /* catch up */ offset_2 = offset_1; offset_1 = offset; - ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, offset + ZSTD_REP_MOVE, mLength-MINMATCH); + ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, iend, offset + ZSTD_REP_MOVE, mLength-MINMATCH); } } else if (MEM_read32(match) != MEM_read32(ip)) { /* it's not a match, and we're not going to check the dictionary */ @@ -298,7 +296,7 @@ size_t ZSTD_compressBlock_fast_dictMatchState_generic( && (ip[-1] == match[-1])) { ip--; match--; mLength++; } /* catch up */ offset_2 = offset_1; offset_1 = offset; - ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, offset + ZSTD_REP_MOVE, mLength-MINMATCH); + ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, iend, offset + ZSTD_REP_MOVE, mLength-MINMATCH); } /* match found */ @@ -323,7 +321,7 @@ size_t ZSTD_compressBlock_fast_dictMatchState_generic( const BYTE* const repEnd2 = repIndex2 < prefixStartIndex ? dictEnd : iend; size_t const repLength2 = ZSTD_count_2segments(ip+4, repMatch2+4, iend, repEnd2, prefixStart) + 4; U32 tmpOffset = offset_2; offset_2 = offset_1; offset_1 = tmpOffset; /* swap offset_2 <=> offset_1 */ - ZSTD_storeSeq(seqStore, 0, anchor, 0, repLength2-MINMATCH); + ZSTD_storeSeq(seqStore, 0, anchor, iend, 0, repLength2-MINMATCH); hashTable[ZSTD_hashPtr(ip, hlog, mls)] = current2; ip += repLength2; anchor = ip; @@ -346,8 +344,7 @@ size_t ZSTD_compressBlock_fast_dictMatchState( ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], void const* src, size_t srcSize) { - ZSTD_compressionParameters const* cParams = &ms->cParams; - U32 const mls = cParams->minMatch; + U32 const mls = ms->cParams.minMatch; assert(ms->dictMatchState != NULL); switch(mls) { @@ -379,9 +376,7 @@ static size_t ZSTD_compressBlock_fast_extDict_generic( const BYTE* ip = istart; const BYTE* anchor = istart; const U32 endIndex = (U32)((size_t)(istart - base) + srcSize); - const U32 maxDistance = 1U << cParams->windowLog; - const U32 validLow = ms->window.lowLimit; - const U32 lowLimit = (endIndex - validLow > maxDistance) ? endIndex - maxDistance : validLow; + const U32 lowLimit = ZSTD_getLowestMatchIndex(ms, endIndex, cParams->windowLog); const U32 dictStartIndex = lowLimit; const BYTE* const dictStart = dictBase + dictStartIndex; const U32 dictLimit = ms->window.dictLimit; @@ -392,6 +387,8 @@ static size_t ZSTD_compressBlock_fast_extDict_generic( const BYTE* const ilimit = iend - 8; U32 offset_1=rep[0], offset_2=rep[1]; + DEBUGLOG(5, "ZSTD_compressBlock_fast_extDict_generic"); + /* switch to "regular" variant if extDict is invalidated due to maxDistance */ if (prefixStartIndex == dictStartIndex) return ZSTD_compressBlock_fast_generic(ms, seqStore, rep, src, srcSize, mls); @@ -406,16 +403,17 @@ static size_t ZSTD_compressBlock_fast_extDict_generic( const U32 repIndex = current + 1 - offset_1; const BYTE* const repBase = repIndex < prefixStartIndex ? dictBase : base; const BYTE* const repMatch = repBase + repIndex; - size_t mLength; hashTable[h] = current; /* update hash table */ assert(offset_1 <= current +1); /* check repIndex */ if ( (((U32)((prefixStartIndex-1) - repIndex) >= 3) /* intentional underflow */ & (repIndex > dictStartIndex)) && (MEM_read32(repMatch) == MEM_read32(ip+1)) ) { - const BYTE* repMatchEnd = repIndex < prefixStartIndex ? dictEnd : iend; - mLength = ZSTD_count_2segments(ip+1+4, repMatch+4, iend, repMatchEnd, prefixStart) + 4; + const BYTE* const repMatchEnd = repIndex < prefixStartIndex ? dictEnd : iend; + size_t const rLength = ZSTD_count_2segments(ip+1 +4, repMatch +4, iend, repMatchEnd, prefixStart) + 4; ip++; - ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, 0, mLength-MINMATCH); + ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, iend, 0, rLength-MINMATCH); + ip += rLength; + anchor = ip; } else { if ( (matchIndex < dictStartIndex) || (MEM_read32(match) != MEM_read32(ip)) ) { @@ -423,21 +421,17 @@ static size_t ZSTD_compressBlock_fast_extDict_generic( ip += ((ip-anchor) >> kSearchStrength) + stepSize; continue; } - { const BYTE* matchEnd = matchIndex < prefixStartIndex ? dictEnd : iend; - const BYTE* lowMatchPtr = matchIndex < prefixStartIndex ? dictStart : prefixStart; - U32 offset; - mLength = ZSTD_count_2segments(ip+4, match+4, iend, matchEnd, prefixStart) + 4; + { const BYTE* const matchEnd = matchIndex < prefixStartIndex ? dictEnd : iend; + const BYTE* const lowMatchPtr = matchIndex < prefixStartIndex ? dictStart : prefixStart; + U32 const offset = current - matchIndex; + size_t mLength = ZSTD_count_2segments(ip+4, match+4, iend, matchEnd, prefixStart) + 4; while (((ip>anchor) & (match>lowMatchPtr)) && (ip[-1] == match[-1])) { ip--; match--; mLength++; } /* catch up */ - offset = current - matchIndex; - offset_2 = offset_1; - offset_1 = offset; - ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, offset + ZSTD_REP_MOVE, mLength-MINMATCH); + offset_2 = offset_1; offset_1 = offset; /* update offset history */ + ZSTD_storeSeq(seqStore, (size_t)(ip-anchor), anchor, iend, offset + ZSTD_REP_MOVE, mLength-MINMATCH); + ip += mLength; + anchor = ip; } } - /* found a match : store it */ - ip += mLength; - anchor = ip; - if (ip <= ilimit) { /* Fill Table */ hashTable[ZSTD_hashPtr(base+current+2, hlog, mls)] = current+2; @@ -446,13 +440,13 @@ static size_t ZSTD_compressBlock_fast_extDict_generic( while (ip <= ilimit) { U32 const current2 = (U32)(ip-base); U32 const repIndex2 = current2 - offset_2; - const BYTE* repMatch2 = repIndex2 < prefixStartIndex ? dictBase + repIndex2 : base + repIndex2; + const BYTE* const repMatch2 = repIndex2 < prefixStartIndex ? dictBase + repIndex2 : base + repIndex2; if ( (((U32)((prefixStartIndex-1) - repIndex2) >= 3) & (repIndex2 > dictStartIndex)) /* intentional overflow */ && (MEM_read32(repMatch2) == MEM_read32(ip)) ) { const BYTE* const repEnd2 = repIndex2 < prefixStartIndex ? dictEnd : iend; size_t const repLength2 = ZSTD_count_2segments(ip+4, repMatch2+4, iend, repEnd2, prefixStart) + 4; - U32 tmpOffset = offset_2; offset_2 = offset_1; offset_1 = tmpOffset; /* swap offset_2 <=> offset_1 */ - ZSTD_storeSeq(seqStore, 0, anchor, 0, repLength2-MINMATCH); + { U32 const tmpOffset = offset_2; offset_2 = offset_1; offset_1 = tmpOffset; } /* swap offset_2 <=> offset_1 */ + ZSTD_storeSeq(seqStore, 0 /*litlen*/, anchor, iend, 0 /*offcode*/, repLength2-MINMATCH); hashTable[ZSTD_hashPtr(ip, hlog, mls)] = current2; ip += repLength2; anchor = ip; @@ -474,8 +468,7 @@ size_t ZSTD_compressBlock_fast_extDict( ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], void const* src, size_t srcSize) { - ZSTD_compressionParameters const* cParams = &ms->cParams; - U32 const mls = cParams->minMatch; + U32 const mls = ms->cParams.minMatch; switch(mls) { default: /* includes case 3 */ diff --git a/thirdparty/zstd/compress/zstd_lazy.c b/thirdparty/zstd/compress/zstd_lazy.c index 94d906c01f..9ad7e03b54 100644 --- a/thirdparty/zstd/compress/zstd_lazy.c +++ b/thirdparty/zstd/compress/zstd_lazy.c @@ -242,9 +242,7 @@ ZSTD_DUBT_findBestMatch(ZSTD_matchState_t* ms, const BYTE* const base = ms->window.base; U32 const current = (U32)(ip-base); - U32 const maxDistance = 1U << cParams->windowLog; - U32 const windowValid = ms->window.lowLimit; - U32 const windowLow = (current - windowValid > maxDistance) ? current - maxDistance : windowValid; + U32 const windowLow = ZSTD_getLowestMatchIndex(ms, current, cParams->windowLog); U32* const bt = ms->chainTable; U32 const btLog = cParams->chainLog - 1; @@ -497,8 +495,10 @@ size_t ZSTD_HcFindBestMatch_generic ( const BYTE* const dictEnd = dictBase + dictLimit; const U32 current = (U32)(ip-base); const U32 maxDistance = 1U << cParams->windowLog; - const U32 lowValid = ms->window.lowLimit; - const U32 lowLimit = (current - lowValid > maxDistance) ? current - maxDistance : lowValid; + const U32 lowestValid = ms->window.lowLimit; + const U32 withinMaxDistance = (current - lowestValid > maxDistance) ? current - maxDistance : lowestValid; + const U32 isDictionary = (ms->loadedDictEnd != 0); + const U32 lowLimit = isDictionary ? lowestValid : withinMaxDistance; const U32 minChain = current > chainSize ? current - chainSize : 0; U32 nbAttempts = 1U << cParams->searchLog; size_t ml=4-1; @@ -619,12 +619,14 @@ FORCE_INLINE_TEMPLATE size_t ZSTD_HcFindBestMatch_extDict_selectMLS ( /* ******************************* * Common parser - lazy strategy *********************************/ -FORCE_INLINE_TEMPLATE -size_t ZSTD_compressBlock_lazy_generic( +typedef enum { search_hashChain, search_binaryTree } searchMethod_e; + +FORCE_INLINE_TEMPLATE size_t +ZSTD_compressBlock_lazy_generic( ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], const void* src, size_t srcSize, - const U32 searchMethod, const U32 depth, + const searchMethod_e searchMethod, const U32 depth, ZSTD_dictMode_e const dictMode) { const BYTE* const istart = (const BYTE*)src; @@ -640,8 +642,10 @@ size_t ZSTD_compressBlock_lazy_generic( ZSTD_matchState_t* ms, const BYTE* ip, const BYTE* iLimit, size_t* offsetPtr); searchMax_f const searchMax = dictMode == ZSTD_dictMatchState ? - (searchMethod ? ZSTD_BtFindBestMatch_dictMatchState_selectMLS : ZSTD_HcFindBestMatch_dictMatchState_selectMLS) : - (searchMethod ? ZSTD_BtFindBestMatch_selectMLS : ZSTD_HcFindBestMatch_selectMLS); + (searchMethod==search_binaryTree ? ZSTD_BtFindBestMatch_dictMatchState_selectMLS + : ZSTD_HcFindBestMatch_dictMatchState_selectMLS) : + (searchMethod==search_binaryTree ? ZSTD_BtFindBestMatch_selectMLS + : ZSTD_HcFindBestMatch_selectMLS); U32 offset_1 = rep[0], offset_2 = rep[1], savedOffset=0; const ZSTD_matchState_t* const dms = ms->dictMatchState; @@ -806,7 +810,7 @@ size_t ZSTD_compressBlock_lazy_generic( /* store sequence */ _storeSequence: { size_t const litLength = start - anchor; - ZSTD_storeSeq(seqStore, litLength, anchor, (U32)offset, matchLength-MINMATCH); + ZSTD_storeSeq(seqStore, litLength, anchor, iend, (U32)offset, matchLength-MINMATCH); anchor = ip = start + matchLength; } @@ -824,7 +828,7 @@ _storeSequence: const BYTE* const repEnd2 = repIndex < prefixLowestIndex ? dictEnd : iend; matchLength = ZSTD_count_2segments(ip+4, repMatch+4, iend, repEnd2, prefixLowest) + 4; offset = offset_2; offset_2 = offset_1; offset_1 = (U32)offset; /* swap offset_2 <=> offset_1 */ - ZSTD_storeSeq(seqStore, 0, anchor, 0, matchLength-MINMATCH); + ZSTD_storeSeq(seqStore, 0, anchor, iend, 0, matchLength-MINMATCH); ip += matchLength; anchor = ip; continue; @@ -839,7 +843,7 @@ _storeSequence: /* store sequence */ matchLength = ZSTD_count(ip+4, ip+4-offset_2, iend) + 4; offset = offset_2; offset_2 = offset_1; offset_1 = (U32)offset; /* swap repcodes */ - ZSTD_storeSeq(seqStore, 0, anchor, 0, matchLength-MINMATCH); + ZSTD_storeSeq(seqStore, 0, anchor, iend, 0, matchLength-MINMATCH); ip += matchLength; anchor = ip; continue; /* faster when present ... (?) */ @@ -850,7 +854,7 @@ _storeSequence: rep[1] = offset_2 ? offset_2 : savedOffset; /* Return the last literals size */ - return iend - anchor; + return (size_t)(iend - anchor); } @@ -858,56 +862,56 @@ size_t ZSTD_compressBlock_btlazy2( ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], void const* src, size_t srcSize) { - return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, 1, 2, ZSTD_noDict); + return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, search_binaryTree, 2, ZSTD_noDict); } size_t ZSTD_compressBlock_lazy2( ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], void const* src, size_t srcSize) { - return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, 0, 2, ZSTD_noDict); + return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, search_hashChain, 2, ZSTD_noDict); } size_t ZSTD_compressBlock_lazy( ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], void const* src, size_t srcSize) { - return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, 0, 1, ZSTD_noDict); + return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, search_hashChain, 1, ZSTD_noDict); } size_t ZSTD_compressBlock_greedy( ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], void const* src, size_t srcSize) { - return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, 0, 0, ZSTD_noDict); + return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, search_hashChain, 0, ZSTD_noDict); } size_t ZSTD_compressBlock_btlazy2_dictMatchState( ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], void const* src, size_t srcSize) { - return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, 1, 2, ZSTD_dictMatchState); + return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, search_binaryTree, 2, ZSTD_dictMatchState); } size_t ZSTD_compressBlock_lazy2_dictMatchState( ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], void const* src, size_t srcSize) { - return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, 0, 2, ZSTD_dictMatchState); + return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, search_hashChain, 2, ZSTD_dictMatchState); } size_t ZSTD_compressBlock_lazy_dictMatchState( ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], void const* src, size_t srcSize) { - return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, 0, 1, ZSTD_dictMatchState); + return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, search_hashChain, 1, ZSTD_dictMatchState); } size_t ZSTD_compressBlock_greedy_dictMatchState( ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], void const* src, size_t srcSize) { - return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, 0, 0, ZSTD_dictMatchState); + return ZSTD_compressBlock_lazy_generic(ms, seqStore, rep, src, srcSize, search_hashChain, 0, ZSTD_dictMatchState); } @@ -916,7 +920,7 @@ size_t ZSTD_compressBlock_lazy_extDict_generic( ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], const void* src, size_t srcSize, - const U32 searchMethod, const U32 depth) + const searchMethod_e searchMethod, const U32 depth) { const BYTE* const istart = (const BYTE*)src; const BYTE* ip = istart; @@ -934,7 +938,7 @@ size_t ZSTD_compressBlock_lazy_extDict_generic( typedef size_t (*searchMax_f)( ZSTD_matchState_t* ms, const BYTE* ip, const BYTE* iLimit, size_t* offsetPtr); - searchMax_f searchMax = searchMethod ? ZSTD_BtFindBestMatch_extDict_selectMLS : ZSTD_HcFindBestMatch_extDict_selectMLS; + searchMax_f searchMax = searchMethod==search_binaryTree ? ZSTD_BtFindBestMatch_extDict_selectMLS : ZSTD_HcFindBestMatch_extDict_selectMLS; U32 offset_1 = rep[0], offset_2 = rep[1]; @@ -1047,7 +1051,7 @@ size_t ZSTD_compressBlock_lazy_extDict_generic( /* store sequence */ _storeSequence: { size_t const litLength = start - anchor; - ZSTD_storeSeq(seqStore, litLength, anchor, (U32)offset, matchLength-MINMATCH); + ZSTD_storeSeq(seqStore, litLength, anchor, iend, (U32)offset, matchLength-MINMATCH); anchor = ip = start + matchLength; } @@ -1062,7 +1066,7 @@ _storeSequence: const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend; matchLength = ZSTD_count_2segments(ip+4, repMatch+4, iend, repEnd, prefixStart) + 4; offset = offset_2; offset_2 = offset_1; offset_1 = (U32)offset; /* swap offset history */ - ZSTD_storeSeq(seqStore, 0, anchor, 0, matchLength-MINMATCH); + ZSTD_storeSeq(seqStore, 0, anchor, iend, 0, matchLength-MINMATCH); ip += matchLength; anchor = ip; continue; /* faster when present ... (?) */ @@ -1075,7 +1079,7 @@ _storeSequence: rep[1] = offset_2; /* Return the last literals size */ - return iend - anchor; + return (size_t)(iend - anchor); } @@ -1083,7 +1087,7 @@ size_t ZSTD_compressBlock_greedy_extDict( ZSTD_matchState_t* ms, seqStore_t* seqStore, U32 rep[ZSTD_REP_NUM], void const* src, size_t srcSize) { - return ZSTD_compressBlock_lazy_extDict_generic(ms, seqStore, rep, src, srcSize, 0, 0); + return ZSTD_compressBlock_lazy_extDict_generic(ms, seqStore, rep, src, srcSize, search_hashChain, 0); } size_t ZSTD_compressBlock_lazy_extDict( @@ -1091,7 +1095,7 @@ size_t ZSTD_compressBlock_lazy_extDict( void const* src, size_t srcSize) { - return ZSTD_compressBlock_lazy_extDict_generic(ms, seqStore, rep, src, srcSize, 0, 1); + return ZSTD_compressBlock_lazy_extDict_generic(ms, seqStore, rep, src, srcSize, search_hashChain, 1); } size_t ZSTD_compressBlock_lazy2_extDict( @@ -1099,7 +1103,7 @@ size_t ZSTD_compressBlock_lazy2_extDict( void const* src, size_t srcSize) { - return ZSTD_compressBlock_lazy_extDict_generic(ms, seqStore, rep, src, srcSize, 0, 2); + return ZSTD_compressBlock_lazy_extDict_generic(ms, seqStore, rep, src, srcSize, search_hashChain, 2); } size_t ZSTD_compressBlock_btlazy2_extDict( @@ -1107,5 +1111,5 @@ size_t ZSTD_compressBlock_btlazy2_extDict( void const* src, size_t srcSize) { - return ZSTD_compressBlock_lazy_extDict_generic(ms, seqStore, rep, src, srcSize, 1, 2); + return ZSTD_compressBlock_lazy_extDict_generic(ms, seqStore, rep, src, srcSize, search_binaryTree, 2); } diff --git a/thirdparty/zstd/compress/zstd_ldm.c b/thirdparty/zstd/compress/zstd_ldm.c index 3dcf86e6e8..c3312ad3e3 100644 --- a/thirdparty/zstd/compress/zstd_ldm.c +++ b/thirdparty/zstd/compress/zstd_ldm.c @@ -49,9 +49,9 @@ size_t ZSTD_ldm_getTableSize(ldmParams_t params) { size_t const ldmHSize = ((size_t)1) << params.hashLog; size_t const ldmBucketSizeLog = MIN(params.bucketSizeLog, params.hashLog); - size_t const ldmBucketSize = - ((size_t)1) << (params.hashLog - ldmBucketSizeLog); - size_t const totalSize = ldmBucketSize + ldmHSize * sizeof(ldmEntry_t); + size_t const ldmBucketSize = ((size_t)1) << (params.hashLog - ldmBucketSizeLog); + size_t const totalSize = ZSTD_cwksp_alloc_size(ldmBucketSize) + + ZSTD_cwksp_alloc_size(ldmHSize * sizeof(ldmEntry_t)); return params.enableLdm ? totalSize : 0; } @@ -583,7 +583,7 @@ size_t ZSTD_ldm_blockCompress(rawSeqStore_t* rawSeqStore, rep[i] = rep[i-1]; rep[0] = sequence.offset; /* Store the sequence */ - ZSTD_storeSeq(seqStore, newLitLength, ip - newLitLength, + ZSTD_storeSeq(seqStore, newLitLength, ip - newLitLength, iend, sequence.offset + ZSTD_REP_MOVE, sequence.matchLength - MINMATCH); ip += sequence.matchLength; diff --git a/thirdparty/zstd/compress/zstd_opt.c b/thirdparty/zstd/compress/zstd_opt.c index e32e542e02..2e50fca6ff 100644 --- a/thirdparty/zstd/compress/zstd_opt.c +++ b/thirdparty/zstd/compress/zstd_opt.c @@ -552,7 +552,6 @@ U32 ZSTD_insertBtAndGetAllMatches ( { const ZSTD_compressionParameters* const cParams = &ms->cParams; U32 const sufficient_len = MIN(cParams->targetLength, ZSTD_OPT_NUM -1); - U32 const maxDistance = 1U << cParams->windowLog; const BYTE* const base = ms->window.base; U32 const current = (U32)(ip-base); U32 const hashLog = cParams->hashLog; @@ -569,8 +568,7 @@ U32 ZSTD_insertBtAndGetAllMatches ( const BYTE* const dictEnd = dictBase + dictLimit; const BYTE* const prefixStart = base + dictLimit; U32 const btLow = (btMask >= current) ? 0 : current - btMask; - U32 const windowValid = ms->window.lowLimit; - U32 const windowLow = ((current - windowValid) > maxDistance) ? current - maxDistance : windowValid; + U32 const windowLow = ZSTD_getLowestMatchIndex(ms, current, cParams->windowLog); U32 const matchLow = windowLow ? windowLow : 1; U32* smallerPtr = bt + 2*(current&btMask); U32* largerPtr = bt + 2*(current&btMask) + 1; @@ -674,19 +672,21 @@ U32 ZSTD_insertBtAndGetAllMatches ( while (nbCompares-- && (matchIndex >= matchLow)) { U32* const nextPtr = bt + 2*(matchIndex & btMask); - size_t matchLength = MIN(commonLengthSmaller, commonLengthLarger); /* guaranteed minimum nb of common bytes */ const BYTE* match; + size_t matchLength = MIN(commonLengthSmaller, commonLengthLarger); /* guaranteed minimum nb of common bytes */ assert(current > matchIndex); if ((dictMode == ZSTD_noDict) || (dictMode == ZSTD_dictMatchState) || (matchIndex+matchLength >= dictLimit)) { assert(matchIndex+matchLength >= dictLimit); /* ensure the condition is correct when !extDict */ match = base + matchIndex; + if (matchIndex >= dictLimit) assert(memcmp(match, ip, matchLength) == 0); /* ensure early section of match is equal as expected */ matchLength += ZSTD_count(ip+matchLength, match+matchLength, iLimit); } else { match = dictBase + matchIndex; + assert(memcmp(match, ip, matchLength) == 0); /* ensure early section of match is equal as expected */ matchLength += ZSTD_count_2segments(ip+matchLength, match+matchLength, iLimit, dictEnd, prefixStart); if (matchIndex+matchLength >= dictLimit) - match = base + matchIndex; /* prepare for match[matchLength] */ + match = base + matchIndex; /* prepare for match[matchLength] read */ } if (matchLength > bestLength) { @@ -1098,7 +1098,7 @@ _shortestPath: /* cur, last_pos, best_mlen, best_off have to be set */ assert(anchor + llen <= iend); ZSTD_updateStats(optStatePtr, llen, anchor, offCode, mlen); - ZSTD_storeSeq(seqStore, llen, anchor, offCode, mlen-MINMATCH); + ZSTD_storeSeq(seqStore, llen, anchor, iend, offCode, mlen-MINMATCH); anchor += advance; ip = anchor; } } diff --git a/thirdparty/zstd/compress/zstdmt_compress.c b/thirdparty/zstd/compress/zstdmt_compress.c index 9e537b8848..bc3062b530 100644 --- a/thirdparty/zstd/compress/zstdmt_compress.c +++ b/thirdparty/zstd/compress/zstdmt_compress.c @@ -668,7 +668,7 @@ static void ZSTDMT_compressionJob(void* jobDescription) /* init */ if (job->cdict) { - size_t const initError = ZSTD_compressBegin_advanced_internal(cctx, NULL, 0, ZSTD_dct_auto, ZSTD_dtlm_fast, job->cdict, jobParams, job->fullFrameSize); + size_t const initError = ZSTD_compressBegin_advanced_internal(cctx, NULL, 0, ZSTD_dct_auto, ZSTD_dtlm_fast, job->cdict, &jobParams, job->fullFrameSize); assert(job->firstJob); /* only allowed for first job */ if (ZSTD_isError(initError)) JOB_ERROR(initError); } else { /* srcStart points at reloaded section */ @@ -680,7 +680,7 @@ static void ZSTDMT_compressionJob(void* jobDescription) job->prefix.start, job->prefix.size, ZSTD_dct_rawContent, /* load dictionary in "content-only" mode (no header analysis) */ ZSTD_dtlm_fast, NULL, /*cdict*/ - jobParams, pledgedSrcSize); + &jobParams, pledgedSrcSize); if (ZSTD_isError(initError)) JOB_ERROR(initError); } } @@ -927,12 +927,18 @@ static void ZSTDMT_releaseAllJobResources(ZSTDMT_CCtx* mtctx) unsigned jobID; DEBUGLOG(3, "ZSTDMT_releaseAllJobResources"); for (jobID=0; jobID <= mtctx->jobIDMask; jobID++) { + /* Copy the mutex/cond out */ + ZSTD_pthread_mutex_t const mutex = mtctx->jobs[jobID].job_mutex; + ZSTD_pthread_cond_t const cond = mtctx->jobs[jobID].job_cond; + DEBUGLOG(4, "job%02u: release dst address %08X", jobID, (U32)(size_t)mtctx->jobs[jobID].dstBuff.start); ZSTDMT_releaseBuffer(mtctx->bufPool, mtctx->jobs[jobID].dstBuff); - mtctx->jobs[jobID].dstBuff = g_nullBuffer; - mtctx->jobs[jobID].cSize = 0; + + /* Clear the job description, but keep the mutex/cond */ + memset(&mtctx->jobs[jobID], 0, sizeof(mtctx->jobs[jobID])); + mtctx->jobs[jobID].job_mutex = mutex; + mtctx->jobs[jobID].job_cond = cond; } - memset(mtctx->jobs, 0, (mtctx->jobIDMask+1)*sizeof(ZSTDMT_jobDescription)); mtctx->inBuff.buffer = g_nullBuffer; mtctx->inBuff.filled = 0; mtctx->allJobsCompleted = 1; @@ -1028,9 +1034,9 @@ size_t ZSTDMT_getMTCtxParameter(ZSTDMT_CCtx* mtctx, ZSTDMT_parameter parameter, /* Sets parameters relevant to the compression job, * initializing others to default values. */ -static ZSTD_CCtx_params ZSTDMT_initJobCCtxParams(ZSTD_CCtx_params const params) +static ZSTD_CCtx_params ZSTDMT_initJobCCtxParams(const ZSTD_CCtx_params* params) { - ZSTD_CCtx_params jobParams = params; + ZSTD_CCtx_params jobParams = *params; /* Clear parameters related to multithreading */ jobParams.forceWindow = 0; jobParams.nbWorkers = 0; @@ -1151,16 +1157,16 @@ size_t ZSTDMT_toFlushNow(ZSTDMT_CCtx* mtctx) /* ===== Multi-threaded compression ===== */ /* ------------------------------------------ */ -static unsigned ZSTDMT_computeTargetJobLog(ZSTD_CCtx_params const params) +static unsigned ZSTDMT_computeTargetJobLog(const ZSTD_CCtx_params* params) { unsigned jobLog; - if (params.ldmParams.enableLdm) { + if (params->ldmParams.enableLdm) { /* In Long Range Mode, the windowLog is typically oversized. * In which case, it's preferable to determine the jobSize * based on chainLog instead. */ - jobLog = MAX(21, params.cParams.chainLog + 4); + jobLog = MAX(21, params->cParams.chainLog + 4); } else { - jobLog = MAX(20, params.cParams.windowLog + 2); + jobLog = MAX(20, params->cParams.windowLog + 2); } return MIN(jobLog, (unsigned)ZSTDMT_JOBLOG_MAX); } @@ -1193,27 +1199,27 @@ static int ZSTDMT_overlapLog(int ovlog, ZSTD_strategy strat) return ovlog; } -static size_t ZSTDMT_computeOverlapSize(ZSTD_CCtx_params const params) +static size_t ZSTDMT_computeOverlapSize(const ZSTD_CCtx_params* params) { - int const overlapRLog = 9 - ZSTDMT_overlapLog(params.overlapLog, params.cParams.strategy); - int ovLog = (overlapRLog >= 8) ? 0 : (params.cParams.windowLog - overlapRLog); + int const overlapRLog = 9 - ZSTDMT_overlapLog(params->overlapLog, params->cParams.strategy); + int ovLog = (overlapRLog >= 8) ? 0 : (params->cParams.windowLog - overlapRLog); assert(0 <= overlapRLog && overlapRLog <= 8); - if (params.ldmParams.enableLdm) { + if (params->ldmParams.enableLdm) { /* In Long Range Mode, the windowLog is typically oversized. * In which case, it's preferable to determine the jobSize * based on chainLog instead. * Then, ovLog becomes a fraction of the jobSize, rather than windowSize */ - ovLog = MIN(params.cParams.windowLog, ZSTDMT_computeTargetJobLog(params) - 2) + ovLog = MIN(params->cParams.windowLog, ZSTDMT_computeTargetJobLog(params) - 2) - overlapRLog; } assert(0 <= ovLog && ovLog <= ZSTD_WINDOWLOG_MAX); - DEBUGLOG(4, "overlapLog : %i", params.overlapLog); + DEBUGLOG(4, "overlapLog : %i", params->overlapLog); DEBUGLOG(4, "overlap size : %i", 1 << ovLog); return (ovLog==0) ? 0 : (size_t)1 << ovLog; } static unsigned -ZSTDMT_computeNbJobs(ZSTD_CCtx_params params, size_t srcSize, unsigned nbWorkers) +ZSTDMT_computeNbJobs(const ZSTD_CCtx_params* params, size_t srcSize, unsigned nbWorkers) { assert(nbWorkers>0); { size_t const jobSizeTarget = (size_t)1 << ZSTDMT_computeTargetJobLog(params); @@ -1236,9 +1242,9 @@ static size_t ZSTDMT_compress_advanced_internal( const ZSTD_CDict* cdict, ZSTD_CCtx_params params) { - ZSTD_CCtx_params const jobParams = ZSTDMT_initJobCCtxParams(params); - size_t const overlapSize = ZSTDMT_computeOverlapSize(params); - unsigned const nbJobs = ZSTDMT_computeNbJobs(params, srcSize, params.nbWorkers); + ZSTD_CCtx_params const jobParams = ZSTDMT_initJobCCtxParams(¶ms); + size_t const overlapSize = ZSTDMT_computeOverlapSize(¶ms); + unsigned const nbJobs = ZSTDMT_computeNbJobs(¶ms, srcSize, params.nbWorkers); size_t const proposedJobSize = (srcSize + (nbJobs-1)) / nbJobs; size_t const avgJobSize = (((proposedJobSize-1) & 0x1FFFF) < 0x7FFF) ? proposedJobSize + 0xFFFF : proposedJobSize; /* avoid too small last block */ const char* const srcStart = (const char*)src; @@ -1256,7 +1262,7 @@ static size_t ZSTDMT_compress_advanced_internal( ZSTD_CCtx* const cctx = mtctx->cctxPool->cctx[0]; DEBUGLOG(4, "ZSTDMT_compress_advanced_internal: fallback to single-thread mode"); if (cdict) return ZSTD_compress_usingCDict_advanced(cctx, dst, dstCapacity, src, srcSize, cdict, jobParams.fParams); - return ZSTD_compress_advanced_internal(cctx, dst, dstCapacity, src, srcSize, NULL, 0, jobParams); + return ZSTD_compress_advanced_internal(cctx, dst, dstCapacity, src, srcSize, NULL, 0, &jobParams); } assert(avgJobSize >= 256 KB); /* condition for ZSTD_compressBound(A) + ZSTD_compressBound(B) <= ZSTD_compressBound(A+B), required to compress directly into Dst (no additional buffer) */ @@ -1404,12 +1410,12 @@ size_t ZSTDMT_initCStream_internal( mtctx->singleBlockingThread = (pledgedSrcSize <= ZSTDMT_JOBSIZE_MIN); /* do not trigger multi-threading when srcSize is too small */ if (mtctx->singleBlockingThread) { - ZSTD_CCtx_params const singleThreadParams = ZSTDMT_initJobCCtxParams(params); + ZSTD_CCtx_params const singleThreadParams = ZSTDMT_initJobCCtxParams(¶ms); DEBUGLOG(5, "ZSTDMT_initCStream_internal: switch to single blocking thread mode"); assert(singleThreadParams.nbWorkers == 0); return ZSTD_initCStream_internal(mtctx->cctxPool->cctx[0], dict, dictSize, cdict, - singleThreadParams, pledgedSrcSize); + &singleThreadParams, pledgedSrcSize); } DEBUGLOG(4, "ZSTDMT_initCStream_internal: %u workers", params.nbWorkers); @@ -1435,11 +1441,11 @@ size_t ZSTDMT_initCStream_internal( mtctx->cdict = cdict; } - mtctx->targetPrefixSize = ZSTDMT_computeOverlapSize(params); + mtctx->targetPrefixSize = ZSTDMT_computeOverlapSize(¶ms); DEBUGLOG(4, "overlapLog=%i => %u KB", params.overlapLog, (U32)(mtctx->targetPrefixSize>>10)); mtctx->targetSectionSize = params.jobSize; if (mtctx->targetSectionSize == 0) { - mtctx->targetSectionSize = 1ULL << ZSTDMT_computeTargetJobLog(params); + mtctx->targetSectionSize = 1ULL << ZSTDMT_computeTargetJobLog(¶ms); } assert(mtctx->targetSectionSize <= (size_t)ZSTDMT_JOBSIZE_MAX); diff --git a/thirdparty/zstd/decompress/huf_decompress.c b/thirdparty/zstd/decompress/huf_decompress.c index 3f8bd29732..bb2d0a96bc 100644 --- a/thirdparty/zstd/decompress/huf_decompress.c +++ b/thirdparty/zstd/decompress/huf_decompress.c @@ -61,7 +61,9 @@ * Error Management ****************************************************************/ #define HUF_isError ERR_isError +#ifndef CHECK_F #define CHECK_F(f) { size_t const err_ = (f); if (HUF_isError(err_)) return err_; } +#endif /* ************************************************************** diff --git a/thirdparty/zstd/decompress/zstd_decompress.c b/thirdparty/zstd/decompress/zstd_decompress.c index e42872ad96..dd4591b7be 100644 --- a/thirdparty/zstd/decompress/zstd_decompress.c +++ b/thirdparty/zstd/decompress/zstd_decompress.c @@ -88,10 +88,7 @@ size_t ZSTD_estimateDCtxSize(void) { return sizeof(ZSTD_DCtx); } static size_t ZSTD_startingInputLength(ZSTD_format_e format) { - size_t const startingInputLength = (format==ZSTD_f_zstd1_magicless) ? - ZSTD_FRAMEHEADERSIZE_PREFIX - ZSTD_FRAMEIDSIZE : - ZSTD_FRAMEHEADERSIZE_PREFIX; - ZSTD_STATIC_ASSERT(ZSTD_FRAMEHEADERSIZE_PREFIX >= ZSTD_FRAMEIDSIZE); + size_t const startingInputLength = ZSTD_FRAMEHEADERSIZE_PREFIX(format); /* only supports formats ZSTD_f_zstd1 and ZSTD_f_zstd1_magicless */ assert( (format == ZSTD_f_zstd1) || (format == ZSTD_f_zstd1_magicless) ); return startingInputLength; @@ -376,7 +373,7 @@ unsigned long long ZSTD_findDecompressedSize(const void* src, size_t srcSize) { unsigned long long totalDstSize = 0; - while (srcSize >= ZSTD_FRAMEHEADERSIZE_PREFIX) { + while (srcSize >= ZSTD_startingInputLength(ZSTD_f_zstd1)) { U32 const magicNumber = MEM_readLE32(src); if ((magicNumber & ZSTD_MAGIC_SKIPPABLE_MASK) == ZSTD_MAGIC_SKIPPABLE_START) { @@ -574,9 +571,10 @@ void ZSTD_checkContinuity(ZSTD_DCtx* dctx, const void* dst) } /** ZSTD_insertBlock() : - insert `src` block into `dctx` history. Useful to track uncompressed blocks. */ + * insert `src` block into `dctx` history. Useful to track uncompressed blocks. */ size_t ZSTD_insertBlock(ZSTD_DCtx* dctx, const void* blockStart, size_t blockSize) { + DEBUGLOG(5, "ZSTD_insertBlock: %u bytes", (unsigned)blockSize); ZSTD_checkContinuity(dctx, blockStart); dctx->previousDstEnd = (const char*)blockStart + blockSize; return blockSize; @@ -628,11 +626,12 @@ static size_t ZSTD_decompressFrame(ZSTD_DCtx* dctx, /* check */ RETURN_ERROR_IF( - remainingSrcSize < ZSTD_FRAMEHEADERSIZE_MIN+ZSTD_blockHeaderSize, + remainingSrcSize < ZSTD_FRAMEHEADERSIZE_MIN(dctx->format)+ZSTD_blockHeaderSize, srcSize_wrong); /* Frame Header */ - { size_t const frameHeaderSize = ZSTD_frameHeaderSize(ip, ZSTD_FRAMEHEADERSIZE_PREFIX); + { size_t const frameHeaderSize = ZSTD_frameHeaderSize_internal( + ip, ZSTD_FRAMEHEADERSIZE_PREFIX(dctx->format), dctx->format); if (ZSTD_isError(frameHeaderSize)) return frameHeaderSize; RETURN_ERROR_IF(remainingSrcSize < frameHeaderSize+ZSTD_blockHeaderSize, srcSize_wrong); @@ -713,7 +712,7 @@ static size_t ZSTD_decompressMultiFrame(ZSTD_DCtx* dctx, dictSize = ZSTD_DDict_dictSize(ddict); } - while (srcSize >= ZSTD_FRAMEHEADERSIZE_PREFIX) { + while (srcSize >= ZSTD_startingInputLength(dctx->format)) { #if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT >= 1) if (ZSTD_isLegacy(src, srcSize)) { @@ -909,6 +908,7 @@ size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, c { blockProperties_t bp; size_t const cBlockSize = ZSTD_getcBlockSize(src, ZSTD_blockHeaderSize, &bp); if (ZSTD_isError(cBlockSize)) return cBlockSize; + RETURN_ERROR_IF(cBlockSize > dctx->fParams.blockSizeMax, corruption_detected, "Block Size Exceeds Maximum"); dctx->expected = cBlockSize; dctx->bType = bp.blockType; dctx->rleSize = bp.origSize; @@ -953,6 +953,7 @@ size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, c RETURN_ERROR(corruption_detected); } if (ZSTD_isError(rSize)) return rSize; + RETURN_ERROR_IF(rSize > dctx->fParams.blockSizeMax, corruption_detected, "Decompressed Block Size Exceeds Maximum"); DEBUGLOG(5, "ZSTD_decompressContinue: decoded size from block : %u", (unsigned)rSize); dctx->decodedSize += rSize; if (dctx->fParams.checksumFlag) XXH64_update(&dctx->xxhState, dst, rSize); @@ -1095,7 +1096,7 @@ ZSTD_loadDEntropy(ZSTD_entropyDTables_t* entropy, size_t const dictContentSize = (size_t)(dictEnd - (dictPtr+12)); for (i=0; i<3; i++) { U32 const rep = MEM_readLE32(dictPtr); dictPtr += 4; - RETURN_ERROR_IF(rep==0 || rep >= dictContentSize, + RETURN_ERROR_IF(rep==0 || rep > dictContentSize, dictionary_corrupted); entropy->rep[i] = rep; } } @@ -1264,7 +1265,7 @@ size_t ZSTD_DCtx_loadDictionary_advanced(ZSTD_DCtx* dctx, { RETURN_ERROR_IF(dctx->streamStage != zdss_init, stage_wrong); ZSTD_clearDict(dctx); - if (dict && dictSize >= 8) { + if (dict && dictSize != 0) { dctx->ddictLocal = ZSTD_createDDict_advanced(dict, dictSize, dictLoadMethod, dictContentType, dctx->customMem); RETURN_ERROR_IF(dctx->ddictLocal == NULL, memory_allocation); dctx->ddict = dctx->ddictLocal; @@ -1297,14 +1298,14 @@ size_t ZSTD_DCtx_refPrefix(ZSTD_DCtx* dctx, const void* prefix, size_t prefixSiz /* ZSTD_initDStream_usingDict() : - * return : expected size, aka ZSTD_FRAMEHEADERSIZE_PREFIX. + * return : expected size, aka ZSTD_startingInputLength(). * this function cannot fail */ size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dict, size_t dictSize) { DEBUGLOG(4, "ZSTD_initDStream_usingDict"); FORWARD_IF_ERROR( ZSTD_DCtx_reset(zds, ZSTD_reset_session_only) ); FORWARD_IF_ERROR( ZSTD_DCtx_loadDictionary(zds, dict, dictSize) ); - return ZSTD_FRAMEHEADERSIZE_PREFIX; + return ZSTD_startingInputLength(zds->format); } /* note : this variant can't fail */ @@ -1321,16 +1322,16 @@ size_t ZSTD_initDStream_usingDDict(ZSTD_DStream* dctx, const ZSTD_DDict* ddict) { FORWARD_IF_ERROR( ZSTD_DCtx_reset(dctx, ZSTD_reset_session_only) ); FORWARD_IF_ERROR( ZSTD_DCtx_refDDict(dctx, ddict) ); - return ZSTD_FRAMEHEADERSIZE_PREFIX; + return ZSTD_startingInputLength(dctx->format); } /* ZSTD_resetDStream() : - * return : expected size, aka ZSTD_FRAMEHEADERSIZE_PREFIX. + * return : expected size, aka ZSTD_startingInputLength(). * this function cannot fail */ size_t ZSTD_resetDStream(ZSTD_DStream* dctx) { FORWARD_IF_ERROR(ZSTD_DCtx_reset(dctx, ZSTD_reset_session_only)); - return ZSTD_FRAMEHEADERSIZE_PREFIX; + return ZSTD_startingInputLength(dctx->format); } @@ -1561,7 +1562,7 @@ size_t ZSTD_decompressStream(ZSTD_DStream* zds, ZSTD_outBuffer* output, ZSTD_inB zds->lhSize += remainingInput; } input->pos = input->size; - return (MAX(ZSTD_FRAMEHEADERSIZE_MIN, hSize) - zds->lhSize) + ZSTD_blockHeaderSize; /* remaining header bytes + next block header */ + return (MAX((size_t)ZSTD_FRAMEHEADERSIZE_MIN(zds->format), hSize) - zds->lhSize) + ZSTD_blockHeaderSize; /* remaining header bytes + next block header */ } assert(ip != NULL); memcpy(zds->headerBuffer + zds->lhSize, ip, toLoad); zds->lhSize = hSize; ip += toLoad; diff --git a/thirdparty/zstd/decompress/zstd_decompress_block.c b/thirdparty/zstd/decompress/zstd_decompress_block.c index 24f4859c56..767e5f9a0b 100644 --- a/thirdparty/zstd/decompress/zstd_decompress_block.c +++ b/thirdparty/zstd/decompress/zstd_decompress_block.c @@ -79,6 +79,7 @@ size_t ZSTD_decodeLiteralsBlock(ZSTD_DCtx* dctx, size_t ZSTD_decodeLiteralsBlock(ZSTD_DCtx* dctx, const void* src, size_t srcSize) /* note : srcSize < BLOCKSIZE */ { + DEBUGLOG(5, "ZSTD_decodeLiteralsBlock"); RETURN_ERROR_IF(srcSize < MIN_CBLOCK_SIZE, corruption_detected); { const BYTE* const istart = (const BYTE*) src; @@ -87,6 +88,7 @@ size_t ZSTD_decodeLiteralsBlock(ZSTD_DCtx* dctx, switch(litEncType) { case set_repeat: + DEBUGLOG(5, "set_repeat flag : re-using stats from previous compressed literals block"); RETURN_ERROR_IF(dctx->litEntropy==0, dictionary_corrupted); /* fall-through */ @@ -116,7 +118,7 @@ size_t ZSTD_decodeLiteralsBlock(ZSTD_DCtx* dctx, /* 2 - 2 - 18 - 18 */ lhSize = 5; litSize = (lhc >> 4) & 0x3FFFF; - litCSize = (lhc >> 22) + (istart[4] << 10); + litCSize = (lhc >> 22) + ((size_t)istart[4] << 10); break; } RETURN_ERROR_IF(litSize > ZSTD_BLOCKSIZE_MAX, corruption_detected); @@ -391,7 +393,8 @@ ZSTD_buildFSETable(ZSTD_seqSymbol* dt, symbolNext[s] = 1; } else { if (normalizedCounter[s] >= largeLimit) DTableH.fastMode=0; - symbolNext[s] = normalizedCounter[s]; + assert(normalizedCounter[s]>=0); + symbolNext[s] = (U16)normalizedCounter[s]; } } } memcpy(dt, &DTableH, sizeof(DTableH)); } @@ -570,38 +573,118 @@ typedef struct { size_t pos; } seqState_t; +/*! ZSTD_overlapCopy8() : + * Copies 8 bytes from ip to op and updates op and ip where ip <= op. + * If the offset is < 8 then the offset is spread to at least 8 bytes. + * + * Precondition: *ip <= *op + * Postcondition: *op - *op >= 8 + */ +static void ZSTD_overlapCopy8(BYTE** op, BYTE const** ip, size_t offset) { + assert(*ip <= *op); + if (offset < 8) { + /* close range match, overlap */ + static const U32 dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 }; /* added */ + static const int dec64table[] = { 8, 8, 8, 7, 8, 9,10,11 }; /* subtracted */ + int const sub2 = dec64table[offset]; + (*op)[0] = (*ip)[0]; + (*op)[1] = (*ip)[1]; + (*op)[2] = (*ip)[2]; + (*op)[3] = (*ip)[3]; + *ip += dec32table[offset]; + ZSTD_copy4(*op+4, *ip); + *ip -= sub2; + } else { + ZSTD_copy8(*op, *ip); + } + *ip += 8; + *op += 8; + assert(*op - *ip >= 8); +} + +/*! ZSTD_safecopy() : + * Specialized version of memcpy() that is allowed to READ up to WILDCOPY_OVERLENGTH past the input buffer + * and write up to 16 bytes past oend_w (op >= oend_w is allowed). + * This function is only called in the uncommon case where the sequence is near the end of the block. It + * should be fast for a single long sequence, but can be slow for several short sequences. + * + * @param ovtype controls the overlap detection + * - ZSTD_no_overlap: The source and destination are guaranteed to be at least WILDCOPY_VECLEN bytes apart. + * - ZSTD_overlap_src_before_dst: The src and dst may overlap and may be any distance apart. + * The src buffer must be before the dst buffer. + */ +static void ZSTD_safecopy(BYTE* op, BYTE* const oend_w, BYTE const* ip, ptrdiff_t length, ZSTD_overlap_e ovtype) { + ptrdiff_t const diff = op - ip; + BYTE* const oend = op + length; -/* ZSTD_execSequenceLast7(): - * exceptional case : decompress a match starting within last 7 bytes of output buffer. - * requires more careful checks, to ensure there is no overflow. - * performance does not matter though. - * note : this case is supposed to be never generated "naturally" by reference encoder, - * since in most cases it needs at least 8 bytes to look for a match. - * but it's allowed by the specification. */ + assert((ovtype == ZSTD_no_overlap && (diff <= -8 || diff >= 8 || op >= oend_w)) || + (ovtype == ZSTD_overlap_src_before_dst && diff >= 0)); + + if (length < 8) { + /* Handle short lengths. */ + while (op < oend) *op++ = *ip++; + return; + } + if (ovtype == ZSTD_overlap_src_before_dst) { + /* Copy 8 bytes and ensure the offset >= 8 when there can be overlap. */ + assert(length >= 8); + ZSTD_overlapCopy8(&op, &ip, diff); + assert(op - ip >= 8); + assert(op <= oend); + } + + if (oend <= oend_w) { + /* No risk of overwrite. */ + ZSTD_wildcopy(op, ip, length, ovtype); + return; + } + if (op <= oend_w) { + /* Wildcopy until we get close to the end. */ + assert(oend > oend_w); + ZSTD_wildcopy(op, ip, oend_w - op, ovtype); + ip += oend_w - op; + op = oend_w; + } + /* Handle the leftovers. */ + while (op < oend) *op++ = *ip++; +} + +/* ZSTD_execSequenceEnd(): + * This version handles cases that are near the end of the output buffer. It requires + * more careful checks to make sure there is no overflow. By separating out these hard + * and unlikely cases, we can speed up the common cases. + * + * NOTE: This function needs to be fast for a single long sequence, but doesn't need + * to be optimized for many small sequences, since those fall into ZSTD_execSequence(). + */ FORCE_NOINLINE -size_t ZSTD_execSequenceLast7(BYTE* op, - BYTE* const oend, seq_t sequence, - const BYTE** litPtr, const BYTE* const litLimit, - const BYTE* const base, const BYTE* const vBase, const BYTE* const dictEnd) +size_t ZSTD_execSequenceEnd(BYTE* op, + BYTE* const oend, seq_t sequence, + const BYTE** litPtr, const BYTE* const litLimit, + const BYTE* const prefixStart, const BYTE* const virtualStart, const BYTE* const dictEnd) { BYTE* const oLitEnd = op + sequence.litLength; size_t const sequenceLength = sequence.litLength + sequence.matchLength; BYTE* const oMatchEnd = op + sequenceLength; /* risk : address space overflow (32-bits) */ const BYTE* const iLitEnd = *litPtr + sequence.litLength; const BYTE* match = oLitEnd - sequence.offset; + BYTE* const oend_w = oend - WILDCOPY_OVERLENGTH; - /* check */ - RETURN_ERROR_IF(oMatchEnd>oend, dstSize_tooSmall, "last match must fit within dstBuffer"); + /* bounds checks */ + assert(oLitEnd < oMatchEnd); + RETURN_ERROR_IF(oMatchEnd > oend, dstSize_tooSmall, "last match must fit within dstBuffer"); RETURN_ERROR_IF(iLitEnd > litLimit, corruption_detected, "try to read beyond literal buffer"); /* copy literals */ - while (op < oLitEnd) *op++ = *(*litPtr)++; + ZSTD_safecopy(op, oend_w, *litPtr, sequence.litLength, ZSTD_no_overlap); + op = oLitEnd; + *litPtr = iLitEnd; /* copy Match */ - if (sequence.offset > (size_t)(oLitEnd - base)) { + if (sequence.offset > (size_t)(oLitEnd - prefixStart)) { /* offset beyond prefix */ - RETURN_ERROR_IF(sequence.offset > (size_t)(oLitEnd - vBase),corruption_detected); - match = dictEnd - (base-match); + RETURN_ERROR_IF(sequence.offset > (size_t)(oLitEnd - virtualStart), corruption_detected); + match = dictEnd - (prefixStart-match); if (match + sequence.matchLength <= dictEnd) { memmove(oLitEnd, match, sequence.matchLength); return sequenceLength; @@ -611,13 +694,12 @@ size_t ZSTD_execSequenceLast7(BYTE* op, memmove(oLitEnd, match, length1); op = oLitEnd + length1; sequence.matchLength -= length1; - match = base; + match = prefixStart; } } - while (op < oMatchEnd) *op++ = *match++; + ZSTD_safecopy(op, oend_w, match, sequence.matchLength, ZSTD_overlap_src_before_dst); return sequenceLength; } - HINT_INLINE size_t ZSTD_execSequence(BYTE* op, BYTE* const oend, seq_t sequence, @@ -631,20 +713,29 @@ size_t ZSTD_execSequence(BYTE* op, const BYTE* const iLitEnd = *litPtr + sequence.litLength; const BYTE* match = oLitEnd - sequence.offset; - /* check */ - RETURN_ERROR_IF(oMatchEnd>oend, dstSize_tooSmall, "last match must start at a minimum distance of WILDCOPY_OVERLENGTH from oend"); - RETURN_ERROR_IF(iLitEnd > litLimit, corruption_detected, "over-read beyond lit buffer"); - if (oLitEnd>oend_w) return ZSTD_execSequenceLast7(op, oend, sequence, litPtr, litLimit, prefixStart, virtualStart, dictEnd); - - /* copy Literals */ - if (sequence.litLength > 8) - ZSTD_wildcopy_16min(op, (*litPtr), sequence.litLength, ZSTD_no_overlap); /* note : since oLitEnd <= oend-WILDCOPY_OVERLENGTH, no risk of overwrite beyond oend */ - else - ZSTD_copy8(op, *litPtr); + /* Errors and uncommon cases handled here. */ + assert(oLitEnd < oMatchEnd); + if (iLitEnd > litLimit || oMatchEnd > oend_w) + return ZSTD_execSequenceEnd(op, oend, sequence, litPtr, litLimit, prefixStart, virtualStart, dictEnd); + + /* Assumptions (everything else goes into ZSTD_execSequenceEnd()) */ + assert(iLitEnd <= litLimit /* Literal length is in bounds */); + assert(oLitEnd <= oend_w /* Can wildcopy literals */); + assert(oMatchEnd <= oend_w /* Can wildcopy matches */); + + /* Copy Literals: + * Split out litLength <= 16 since it is nearly always true. +1.6% on gcc-9. + * We likely don't need the full 32-byte wildcopy. + */ + assert(WILDCOPY_OVERLENGTH >= 16); + ZSTD_copy16(op, (*litPtr)); + if (sequence.litLength > 16) { + ZSTD_wildcopy(op+16, (*litPtr)+16, sequence.litLength-16, ZSTD_no_overlap); + } op = oLitEnd; *litPtr = iLitEnd; /* update for next sequence */ - /* copy Match */ + /* Copy Match */ if (sequence.offset > (size_t)(oLitEnd - prefixStart)) { /* offset beyond prefix -> go into extDict */ RETURN_ERROR_IF(sequence.offset > (size_t)(oLitEnd - virtualStart), corruption_detected); @@ -659,123 +750,33 @@ size_t ZSTD_execSequence(BYTE* op, op = oLitEnd + length1; sequence.matchLength -= length1; match = prefixStart; - if (op > oend_w || sequence.matchLength < MINMATCH) { - U32 i; - for (i = 0; i < sequence.matchLength; ++i) op[i] = match[i]; - return sequenceLength; - } } } - /* Requirement: op <= oend_w && sequence.matchLength >= MINMATCH */ - - /* match within prefix */ - if (sequence.offset < 8) { - /* close range match, overlap */ - static const U32 dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 }; /* added */ - static const int dec64table[] = { 8, 8, 8, 7, 8, 9,10,11 }; /* subtracted */ - int const sub2 = dec64table[sequence.offset]; - op[0] = match[0]; - op[1] = match[1]; - op[2] = match[2]; - op[3] = match[3]; - match += dec32table[sequence.offset]; - ZSTD_copy4(op+4, match); - match -= sub2; - } else { - ZSTD_copy8(op, match); - } - op += 8; match += 8; - - if (oMatchEnd > oend-(16-MINMATCH)) { - if (op < oend_w) { - ZSTD_wildcopy(op, match, oend_w - op, ZSTD_overlap_src_before_dst); - match += oend_w - op; - op = oend_w; - } - while (op < oMatchEnd) *op++ = *match++; - } else { - ZSTD_wildcopy(op, match, (ptrdiff_t)sequence.matchLength-8, ZSTD_overlap_src_before_dst); /* works even if matchLength < 8 */ + /* Match within prefix of 1 or more bytes */ + assert(op <= oMatchEnd); + assert(oMatchEnd <= oend_w); + assert(match >= prefixStart); + assert(sequence.matchLength >= 1); + + /* Nearly all offsets are >= WILDCOPY_VECLEN bytes, which means we can use wildcopy + * without overlap checking. + */ + if (sequence.offset >= WILDCOPY_VECLEN) { + /* We bet on a full wildcopy for matches, since we expect matches to be + * longer than literals (in general). In silesia, ~10% of matches are longer + * than 16 bytes. + */ + ZSTD_wildcopy(op, match, (ptrdiff_t)sequence.matchLength, ZSTD_no_overlap); + return sequenceLength; } - return sequenceLength; -} - - -HINT_INLINE -size_t ZSTD_execSequenceLong(BYTE* op, - BYTE* const oend, seq_t sequence, - const BYTE** litPtr, const BYTE* const litLimit, - const BYTE* const prefixStart, const BYTE* const dictStart, const BYTE* const dictEnd) -{ - BYTE* const oLitEnd = op + sequence.litLength; - size_t const sequenceLength = sequence.litLength + sequence.matchLength; - BYTE* const oMatchEnd = op + sequenceLength; /* risk : address space overflow (32-bits) */ - BYTE* const oend_w = oend - WILDCOPY_OVERLENGTH; - const BYTE* const iLitEnd = *litPtr + sequence.litLength; - const BYTE* match = sequence.match; + assert(sequence.offset < WILDCOPY_VECLEN); - /* check */ - RETURN_ERROR_IF(oMatchEnd > oend, dstSize_tooSmall, "last match must start at a minimum distance of WILDCOPY_OVERLENGTH from oend"); - RETURN_ERROR_IF(iLitEnd > litLimit, corruption_detected, "over-read beyond lit buffer"); - if (oLitEnd > oend_w) return ZSTD_execSequenceLast7(op, oend, sequence, litPtr, litLimit, prefixStart, dictStart, dictEnd); - - /* copy Literals */ - if (sequence.litLength > 8) - ZSTD_wildcopy_16min(op, *litPtr, sequence.litLength, ZSTD_no_overlap); /* note : since oLitEnd <= oend-WILDCOPY_OVERLENGTH, no risk of overwrite beyond oend */ - else - ZSTD_copy8(op, *litPtr); /* note : op <= oLitEnd <= oend_w == oend - 8 */ + /* Copy 8 bytes and spread the offset to be >= 8. */ + ZSTD_overlapCopy8(&op, &match, sequence.offset); - op = oLitEnd; - *litPtr = iLitEnd; /* update for next sequence */ - - /* copy Match */ - if (sequence.offset > (size_t)(oLitEnd - prefixStart)) { - /* offset beyond prefix */ - RETURN_ERROR_IF(sequence.offset > (size_t)(oLitEnd - dictStart), corruption_detected); - if (match + sequence.matchLength <= dictEnd) { - memmove(oLitEnd, match, sequence.matchLength); - return sequenceLength; - } - /* span extDict & currentPrefixSegment */ - { size_t const length1 = dictEnd - match; - memmove(oLitEnd, match, length1); - op = oLitEnd + length1; - sequence.matchLength -= length1; - match = prefixStart; - if (op > oend_w || sequence.matchLength < MINMATCH) { - U32 i; - for (i = 0; i < sequence.matchLength; ++i) op[i] = match[i]; - return sequenceLength; - } - } } - assert(op <= oend_w); - assert(sequence.matchLength >= MINMATCH); - - /* match within prefix */ - if (sequence.offset < 8) { - /* close range match, overlap */ - static const U32 dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 }; /* added */ - static const int dec64table[] = { 8, 8, 8, 7, 8, 9,10,11 }; /* subtracted */ - int const sub2 = dec64table[sequence.offset]; - op[0] = match[0]; - op[1] = match[1]; - op[2] = match[2]; - op[3] = match[3]; - match += dec32table[sequence.offset]; - ZSTD_copy4(op+4, match); - match -= sub2; - } else { - ZSTD_copy8(op, match); - } - op += 8; match += 8; - - if (oMatchEnd > oend-(16-MINMATCH)) { - if (op < oend_w) { - ZSTD_wildcopy(op, match, oend_w - op, ZSTD_overlap_src_before_dst); - match += oend_w - op; - op = oend_w; - } - while (op < oMatchEnd) *op++ = *match++; - } else { - ZSTD_wildcopy(op, match, (ptrdiff_t)sequence.matchLength-8, ZSTD_overlap_src_before_dst); /* works even if matchLength < 8 */ + /* If the match length is > 8 bytes, then continue with the wildcopy. */ + if (sequence.matchLength > 8) { + assert(op < oMatchEnd); + ZSTD_wildcopy(op, match, (ptrdiff_t)sequence.matchLength-8, ZSTD_overlap_src_before_dst); } return sequenceLength; } @@ -1095,7 +1096,7 @@ ZSTD_decompressSequencesLong_body( /* decode and decompress */ for ( ; (BIT_reloadDStream(&(seqState.DStream)) <= BIT_DStream_completed) && (seqNb<nbSeq) ; seqNb++) { seq_t const sequence = ZSTD_decodeSequenceLong(&seqState, isLongOffset); - size_t const oneSeqSize = ZSTD_execSequenceLong(op, oend, sequences[(seqNb-ADVANCED_SEQS) & STORED_SEQS_MASK], &litPtr, litEnd, prefixStart, dictStart, dictEnd); + size_t const oneSeqSize = ZSTD_execSequence(op, oend, sequences[(seqNb-ADVANCED_SEQS) & STORED_SEQS_MASK], &litPtr, litEnd, prefixStart, dictStart, dictEnd); if (ZSTD_isError(oneSeqSize)) return oneSeqSize; PREFETCH_L1(sequence.match); PREFETCH_L1(sequence.match + sequence.matchLength - 1); /* note : it's safe to invoke PREFETCH() on any memory address, including invalid ones */ sequences[seqNb & STORED_SEQS_MASK] = sequence; @@ -1106,7 +1107,7 @@ ZSTD_decompressSequencesLong_body( /* finish queue */ seqNb -= seqAdvance; for ( ; seqNb<nbSeq ; seqNb++) { - size_t const oneSeqSize = ZSTD_execSequenceLong(op, oend, sequences[seqNb&STORED_SEQS_MASK], &litPtr, litEnd, prefixStart, dictStart, dictEnd); + size_t const oneSeqSize = ZSTD_execSequence(op, oend, sequences[seqNb&STORED_SEQS_MASK], &litPtr, litEnd, prefixStart, dictStart, dictEnd); if (ZSTD_isError(oneSeqSize)) return oneSeqSize; op += oneSeqSize; } diff --git a/thirdparty/zstd/zstd.h b/thirdparty/zstd/zstd.h index a1910ee223..72080ea87e 100644 --- a/thirdparty/zstd/zstd.h +++ b/thirdparty/zstd/zstd.h @@ -15,6 +15,7 @@ extern "C" { #define ZSTD_H_235446 /* ====== Dependency ======*/ +#include <limits.h> /* INT_MAX */ #include <stddef.h> /* size_t */ @@ -71,7 +72,7 @@ extern "C" { /*------ Version ------*/ #define ZSTD_VERSION_MAJOR 1 #define ZSTD_VERSION_MINOR 4 -#define ZSTD_VERSION_RELEASE 1 +#define ZSTD_VERSION_RELEASE 4 #define ZSTD_VERSION_NUMBER (ZSTD_VERSION_MAJOR *100*100 + ZSTD_VERSION_MINOR *100 + ZSTD_VERSION_RELEASE) ZSTDLIB_API unsigned ZSTD_versionNumber(void); /**< to check runtime library version */ @@ -196,9 +197,13 @@ ZSTDLIB_API ZSTD_CCtx* ZSTD_createCCtx(void); ZSTDLIB_API size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx); /*! ZSTD_compressCCtx() : - * Same as ZSTD_compress(), using an explicit ZSTD_CCtx - * The function will compress at requested compression level, - * ignoring any other parameter */ + * Same as ZSTD_compress(), using an explicit ZSTD_CCtx. + * Important : in order to behave similarly to `ZSTD_compress()`, + * this function compresses at requested compression level, + * __ignoring any other parameter__ . + * If any advanced parameter was set using the advanced API, + * they will all be reset. Only `compressionLevel` remains. + */ ZSTDLIB_API size_t ZSTD_compressCCtx(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, @@ -233,7 +238,7 @@ ZSTDLIB_API size_t ZSTD_decompressDCtx(ZSTD_DCtx* dctx, * using ZSTD_CCtx_set*() functions. * Pushed parameters are sticky : they are valid for next compressed frame, and any subsequent frame. * "sticky" parameters are applicable to `ZSTD_compress2()` and `ZSTD_compressStream*()` ! - * They do not apply to "simple" one-shot variants such as ZSTD_compressCCtx() + * __They do not apply to "simple" one-shot variants such as ZSTD_compressCCtx()__ . * * It's possible to reset all parameters to "default" using ZSTD_CCtx_reset(). * @@ -261,18 +266,26 @@ typedef enum { /* compression parameters * Note: When compressing with a ZSTD_CDict these parameters are superseded - * by the parameters used to construct the ZSTD_CDict. See ZSTD_CCtx_refCDict() - * for more info (superseded-by-cdict). */ - ZSTD_c_compressionLevel=100, /* Update all compression parameters according to pre-defined cLevel table + * by the parameters used to construct the ZSTD_CDict. + * See ZSTD_CCtx_refCDict() for more info (superseded-by-cdict). */ + ZSTD_c_compressionLevel=100, /* Set compression parameters according to pre-defined cLevel table. + * Note that exact compression parameters are dynamically determined, + * depending on both compression level and srcSize (when known). * Default level is ZSTD_CLEVEL_DEFAULT==3. * Special: value 0 means default, which is controlled by ZSTD_CLEVEL_DEFAULT. * Note 1 : it's possible to pass a negative compression level. - * Note 2 : setting a level sets all default values of other compression parameters */ + * Note 2 : setting a level resets all other compression parameters to default */ + /* Advanced compression parameters : + * It's possible to pin down compression parameters to some specific values. + * In which case, these values are no longer dynamically selected by the compressor */ ZSTD_c_windowLog=101, /* Maximum allowed back-reference distance, expressed as power of 2. + * This will set a memory budget for streaming decompression, + * with larger values requiring more memory + * and typically compressing more. * Must be clamped between ZSTD_WINDOWLOG_MIN and ZSTD_WINDOWLOG_MAX. * Special: value 0 means "use default windowLog". * Note: Using a windowLog greater than ZSTD_WINDOWLOG_LIMIT_DEFAULT - * requires explicitly allowing such window size at decompression stage if using streaming. */ + * requires explicitly allowing such size at streaming decompression stage. */ ZSTD_c_hashLog=102, /* Size of the initial probe table, as a power of 2. * Resulting memory usage is (1 << (hashLog+2)). * Must be clamped between ZSTD_HASHLOG_MIN and ZSTD_HASHLOG_MAX. @@ -283,13 +296,13 @@ typedef enum { * Resulting memory usage is (1 << (chainLog+2)). * Must be clamped between ZSTD_CHAINLOG_MIN and ZSTD_CHAINLOG_MAX. * Larger tables result in better and slower compression. - * This parameter is useless when using "fast" strategy. + * This parameter is useless for "fast" strategy. * It's still useful when using "dfast" strategy, * in which case it defines a secondary probe table. * Special: value 0 means "use default chainLog". */ ZSTD_c_searchLog=104, /* Number of search attempts, as a power of 2. * More attempts result in better and slower compression. - * This parameter is useless when using "fast" and "dFast" strategies. + * This parameter is useless for "fast" and "dFast" strategies. * Special: value 0 means "use default searchLog". */ ZSTD_c_minMatch=105, /* Minimum size of searched matches. * Note that Zstandard can still find matches of smaller size, @@ -344,7 +357,7 @@ typedef enum { ZSTD_c_contentSizeFlag=200, /* Content size will be written into frame header _whenever known_ (default:1) * Content size must be known at the beginning of compression. * This is automatically the case when using ZSTD_compress2(), - * For streaming variants, content size must be provided with ZSTD_CCtx_setPledgedSrcSize() */ + * For streaming scenarios, content size must be provided with ZSTD_CCtx_setPledgedSrcSize() */ ZSTD_c_checksumFlag=201, /* A 32-bits checksum of content is written at end of frame (default:0) */ ZSTD_c_dictIDFlag=202, /* When applicable, dictionary's ID is written into frame header (default:1) */ @@ -363,7 +376,7 @@ typedef enum { * Each compression job is completed in parallel, so this value can indirectly impact the nb of active threads. * 0 means default, which is dynamically determined based on compression parameters. * Job size must be a minimum of overlap size, or 1 MB, whichever is largest. - * The minimum size is automatically and transparently enforced */ + * The minimum size is automatically and transparently enforced. */ ZSTD_c_overlapLog=402, /* Control the overlap size, as a fraction of window size. * The overlap size is an amount of data reloaded from previous job at the beginning of a new job. * It helps preserve compression ratio, while each job is compressed in parallel. @@ -386,6 +399,7 @@ typedef enum { * ZSTD_c_forceAttachDict * ZSTD_c_literalCompressionMode * ZSTD_c_targetCBlockSize + * ZSTD_c_srcSizeHint * Because they are not stable, it's necessary to define ZSTD_STATIC_LINKING_ONLY to access them. * note : never ever use experimentalParam? names directly; * also, the enums values themselves are unstable and can still change. @@ -396,6 +410,7 @@ typedef enum { ZSTD_c_experimentalParam4=1001, ZSTD_c_experimentalParam5=1002, ZSTD_c_experimentalParam6=1003, + ZSTD_c_experimentalParam7=1004 } ZSTD_cParameter; typedef struct { @@ -793,12 +808,17 @@ ZSTDLIB_API size_t ZSTD_decompress_usingDict(ZSTD_DCtx* dctx, typedef struct ZSTD_CDict_s ZSTD_CDict; /*! ZSTD_createCDict() : - * When compressing multiple messages / blocks using the same dictionary, it's recommended to load it only once. - * ZSTD_createCDict() will create a digested dictionary, ready to start future compression operations without startup cost. + * When compressing multiple messages or blocks using the same dictionary, + * it's recommended to digest the dictionary only once, since it's a costly operation. + * ZSTD_createCDict() will create a state from digesting a dictionary. + * The resulting state can be used for future compression operations with very limited startup cost. * ZSTD_CDict can be created once and shared by multiple threads concurrently, since its usage is read-only. - * `dictBuffer` can be released after ZSTD_CDict creation, because its content is copied within CDict. - * Consider experimental function `ZSTD_createCDict_byReference()` if you prefer to not duplicate `dictBuffer` content. - * Note : A ZSTD_CDict can be created from an empty dictBuffer, but it is inefficient when used to compress small data. */ + * @dictBuffer can be released after ZSTD_CDict creation, because its content is copied within CDict. + * Note 1 : Consider experimental function `ZSTD_createCDict_byReference()` if you prefer to not duplicate @dictBuffer content. + * Note 2 : A ZSTD_CDict can be created from an empty @dictBuffer, + * in which case the only thing that it transports is the @compressionLevel. + * This can be useful in a pipeline featuring ZSTD_compress_usingCDict() exclusively, + * expecting a ZSTD_CDict parameter with any data, including those without a known dictionary. */ ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict(const void* dictBuffer, size_t dictSize, int compressionLevel); @@ -925,7 +945,7 @@ ZSTDLIB_API size_t ZSTD_CCtx_refCDict(ZSTD_CCtx* cctx, const ZSTD_CDict* cdict); * Note 3 : Referencing a prefix involves building tables, which are dependent on compression parameters. * It's a CPU consuming operation, with non-negligible impact on latency. * If there is a need to use the same prefix multiple times, consider loadDictionary instead. - * Note 4 : By default, the prefix is interpreted as raw content (ZSTD_dm_rawContent). + * Note 4 : By default, the prefix is interpreted as raw content (ZSTD_dct_rawContent). * Use experimental ZSTD_CCtx_refPrefix_advanced() to alter dictionary interpretation. */ ZSTDLIB_API size_t ZSTD_CCtx_refPrefix(ZSTD_CCtx* cctx, const void* prefix, size_t prefixSize); @@ -969,7 +989,7 @@ ZSTDLIB_API size_t ZSTD_DCtx_refDDict(ZSTD_DCtx* dctx, const ZSTD_DDict* ddict); * Note 2 : Prefix buffer is referenced. It **must** outlive decompression. * Prefix buffer must remain unmodified up to the end of frame, * reached when ZSTD_decompressStream() returns 0. - * Note 3 : By default, the prefix is treated as raw content (ZSTD_dm_rawContent). + * Note 3 : By default, the prefix is treated as raw content (ZSTD_dct_rawContent). * Use ZSTD_CCtx_refPrefix_advanced() to alter dictMode (Experimental section) * Note 4 : Referencing a raw content prefix has almost no cpu nor memory cost. * A full dictionary is more costly, as it requires building tables. @@ -1014,8 +1034,8 @@ ZSTDLIB_API size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict); * Some of them might be removed in the future (especially when redundant with existing stable functions) * ***************************************************************************************/ -#define ZSTD_FRAMEHEADERSIZE_PREFIX 5 /* minimum input size required to query frame header size */ -#define ZSTD_FRAMEHEADERSIZE_MIN 6 +#define ZSTD_FRAMEHEADERSIZE_PREFIX(format) ((format) == ZSTD_f_zstd1 ? 5 : 1) /* minimum input size required to query frame header size */ +#define ZSTD_FRAMEHEADERSIZE_MIN(format) ((format) == ZSTD_f_zstd1 ? 6 : 2) #define ZSTD_FRAMEHEADERSIZE_MAX 18 /* can be useful for static allocation */ #define ZSTD_SKIPPABLEHEADERSIZE 8 @@ -1063,6 +1083,8 @@ ZSTDLIB_API size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict); /* Advanced parameter bounds */ #define ZSTD_TARGETCBLOCKSIZE_MIN 64 #define ZSTD_TARGETCBLOCKSIZE_MAX ZSTD_BLOCKSIZE_MAX +#define ZSTD_SRCSIZEHINT_MIN 0 +#define ZSTD_SRCSIZEHINT_MAX INT_MAX /* internal */ #define ZSTD_HASHLOG3_MAX 17 @@ -1073,6 +1095,24 @@ ZSTDLIB_API size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict); typedef struct ZSTD_CCtx_params_s ZSTD_CCtx_params; typedef struct { + unsigned int matchPos; /* Match pos in dst */ + /* If seqDef.offset > 3, then this is seqDef.offset - 3 + * If seqDef.offset < 3, then this is the corresponding repeat offset + * But if seqDef.offset < 3 and litLength == 0, this is the + * repeat offset before the corresponding repeat offset + * And if seqDef.offset == 3 and litLength == 0, this is the + * most recent repeat offset - 1 + */ + unsigned int offset; + unsigned int litLength; /* Literal length */ + unsigned int matchLength; /* Match length */ + /* 0 when seq not rep and seqDef.offset otherwise + * when litLength == 0 this will be <= 4, otherwise <= 3 like normal + */ + unsigned int rep; +} ZSTD_Sequence; + +typedef struct { unsigned windowLog; /**< largest match distance : larger == more compression, more memory needed during decompression */ unsigned chainLog; /**< fully searched segment : larger == more compression, slower, more memory (useless for fast) */ unsigned hashLog; /**< dispatch table : larger == faster, more memory */ @@ -1101,21 +1141,12 @@ typedef enum { typedef enum { ZSTD_dlm_byCopy = 0, /**< Copy dictionary content internally */ - ZSTD_dlm_byRef = 1, /**< Reference dictionary content -- the dictionary buffer must outlive its users. */ + ZSTD_dlm_byRef = 1 /**< Reference dictionary content -- the dictionary buffer must outlive its users. */ } ZSTD_dictLoadMethod_e; typedef enum { - /* Opened question : should we have a format ZSTD_f_auto ? - * Today, it would mean exactly the same as ZSTD_f_zstd1. - * But, in the future, should several formats become supported, - * on the compression side, it would mean "default format". - * On the decompression side, it would mean "automatic format detection", - * so that ZSTD_f_zstd1 would mean "accept *only* zstd frames". - * Since meaning is a little different, another option could be to define different enums for compression and decompression. - * This question could be kept for later, when there are actually multiple formats to support, - * but there is also the question of pinning enum values, and pinning value `0` is especially important */ ZSTD_f_zstd1 = 0, /* zstd frame format, specified in zstd_compression_format.md (default) */ - ZSTD_f_zstd1_magicless = 1, /* Variant of zstd frame format, without initial 4-bytes magic number. + ZSTD_f_zstd1_magicless = 1 /* Variant of zstd frame format, without initial 4-bytes magic number. * Useful to save 4 bytes per generated frame. * Decoder cannot recognise automatically this format, requiring this instruction. */ } ZSTD_format_e; @@ -1126,7 +1157,7 @@ typedef enum { * to evolve and should be considered only in the context of extremely * advanced performance tuning. * - * Zstd currently supports the use of a CDict in two ways: + * Zstd currently supports the use of a CDict in three ways: * * - The contents of the CDict can be copied into the working context. This * means that the compression can search both the dictionary and input @@ -1142,6 +1173,12 @@ typedef enum { * working context's tables can be reused). For small inputs, this can be * faster than copying the CDict's tables. * + * - The CDict's tables are not used at all, and instead we use the working + * context alone to reload the dictionary and use params based on the source + * size. See ZSTD_compress_insertDictionary() and ZSTD_compress_usingDict(). + * This method is effective when the dictionary sizes are very small relative + * to the input size, and the input size is fairly large to begin with. + * * Zstd has a simple internal heuristic that selects which strategy to use * at the beginning of a compression. However, if experimentation shows that * Zstd is making poor choices, it is possible to override that choice with @@ -1150,6 +1187,7 @@ typedef enum { ZSTD_dictDefaultAttach = 0, /* Use the default heuristic. */ ZSTD_dictForceAttach = 1, /* Never copy the dictionary. */ ZSTD_dictForceCopy = 2, /* Always copy the dictionary. */ + ZSTD_dictForceLoad = 3 /* Always reload the dictionary */ } ZSTD_dictAttachPref_e; typedef enum { @@ -1158,7 +1196,7 @@ typedef enum { * levels will be compressed. */ ZSTD_lcm_huffman = 1, /**< Always attempt Huffman compression. Uncompressed literals will still be * emitted if Huffman compression is not profitable. */ - ZSTD_lcm_uncompressed = 2, /**< Always emit uncompressed literals. */ + ZSTD_lcm_uncompressed = 2 /**< Always emit uncompressed literals. */ } ZSTD_literalCompressionMode_e; @@ -1210,20 +1248,38 @@ ZSTDLIB_API unsigned long long ZSTD_decompressBound(const void* src, size_t srcS * or an error code (if srcSize is too small) */ ZSTDLIB_API size_t ZSTD_frameHeaderSize(const void* src, size_t srcSize); +/*! ZSTD_getSequences() : + * Extract sequences from the sequence store + * zc can be used to insert custom compression params. + * This function invokes ZSTD_compress2 + * @return : number of sequences extracted + */ +ZSTDLIB_API size_t ZSTD_getSequences(ZSTD_CCtx* zc, ZSTD_Sequence* outSeqs, + size_t outSeqsSize, const void* src, size_t srcSize); + /*************************************** * Memory management ***************************************/ /*! ZSTD_estimate*() : - * These functions make it possible to estimate memory usage - * of a future {D,C}Ctx, before its creation. - * ZSTD_estimateCCtxSize() will provide a budget large enough for any compression level up to selected one. - * It will also consider src size to be arbitrarily "large", which is worst case. - * If srcSize is known to always be small, ZSTD_estimateCCtxSize_usingCParams() can provide a tighter estimation. - * ZSTD_estimateCCtxSize_usingCParams() can be used in tandem with ZSTD_getCParams() to create cParams from compressionLevel. - * ZSTD_estimateCCtxSize_usingCCtxParams() can be used in tandem with ZSTD_CCtxParams_setParameter(). Only single-threaded compression is supported. This function will return an error code if ZSTD_c_nbWorkers is >= 1. - * Note : CCtx size estimation is only correct for single-threaded compression. */ + * These functions make it possible to estimate memory usage of a future + * {D,C}Ctx, before its creation. + * + * ZSTD_estimateCCtxSize() will provide a budget large enough for any + * compression level up to selected one. Unlike ZSTD_estimateCStreamSize*(), + * this estimate does not include space for a window buffer, so this estimate + * is guaranteed to be enough for single-shot compressions, but not streaming + * compressions. It will however assume the input may be arbitrarily large, + * which is the worst case. If srcSize is known to always be small, + * ZSTD_estimateCCtxSize_usingCParams() can provide a tighter estimation. + * ZSTD_estimateCCtxSize_usingCParams() can be used in tandem with + * ZSTD_getCParams() to create cParams from compressionLevel. + * ZSTD_estimateCCtxSize_usingCCtxParams() can be used in tandem with + * ZSTD_CCtxParams_setParameter(). + * + * Note: only single-threaded compression is supported. This function will + * return an error code if ZSTD_c_nbWorkers is >= 1. */ ZSTDLIB_API size_t ZSTD_estimateCCtxSize(int compressionLevel); ZSTDLIB_API size_t ZSTD_estimateCCtxSize_usingCParams(ZSTD_compressionParameters cParams); ZSTDLIB_API size_t ZSTD_estimateCCtxSize_usingCCtxParams(const ZSTD_CCtx_params* params); @@ -1334,7 +1390,8 @@ ZSTDLIB_API ZSTD_DDict* ZSTD_createDDict_advanced(const void* dict, size_t dictS * Create a digested dictionary for compression * Dictionary content is just referenced, not duplicated. * As a consequence, `dictBuffer` **must** outlive CDict, - * and its content must remain unmodified throughout the lifetime of CDict. */ + * and its content must remain unmodified throughout the lifetime of CDict. + * note: equivalent to ZSTD_createCDict_advanced(), with dictLoadMethod==ZSTD_dlm_byRef */ ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict_byReference(const void* dictBuffer, size_t dictSize, int compressionLevel); /*! ZSTD_getCParams() : @@ -1361,7 +1418,9 @@ ZSTDLIB_API size_t ZSTD_checkCParams(ZSTD_compressionParameters params); ZSTDLIB_API ZSTD_compressionParameters ZSTD_adjustCParams(ZSTD_compressionParameters cPar, unsigned long long srcSize, size_t dictSize); /*! ZSTD_compress_advanced() : - * Same as ZSTD_compress_usingDict(), with fine-tune control over compression parameters (by structure) */ + * Note : this function is now DEPRECATED. + * It can be replaced by ZSTD_compress2(), in combination with ZSTD_CCtx_setParameter() and other parameter setters. + * This prototype will be marked as deprecated and generate compilation warning on reaching v1.5.x */ ZSTDLIB_API size_t ZSTD_compress_advanced(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, @@ -1369,7 +1428,9 @@ ZSTDLIB_API size_t ZSTD_compress_advanced(ZSTD_CCtx* cctx, ZSTD_parameters params); /*! ZSTD_compress_usingCDict_advanced() : - * Same as ZSTD_compress_usingCDict(), with fine-tune control over frame parameters */ + * Note : this function is now REDUNDANT. + * It can be replaced by ZSTD_compress2(), in combination with ZSTD_CCtx_loadDictionary() and other parameter setters. + * This prototype will be marked as deprecated and generate compilation warning in some future version */ ZSTDLIB_API size_t ZSTD_compress_usingCDict_advanced(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, @@ -1441,6 +1502,12 @@ ZSTDLIB_API size_t ZSTD_CCtx_refPrefix_advanced(ZSTD_CCtx* cctx, const void* pre * There is no guarantee on compressed block size (default:0) */ #define ZSTD_c_targetCBlockSize ZSTD_c_experimentalParam6 +/* User's best guess of source size. + * Hint is not valid when srcSizeHint == 0. + * There is no guarantee that hint is close to actual source size, + * but compression ratio may regress significantly if guess considerably underestimates */ +#define ZSTD_c_srcSizeHint ZSTD_c_experimentalParam7 + /*! ZSTD_CCtx_getParameter() : * Get the requested compression parameter value, selected by enum ZSTD_cParameter, * and store it into int* value. @@ -1613,8 +1680,13 @@ ZSTDLIB_API size_t ZSTD_decompressStream_simpleArgs ( * pledgedSrcSize must be correct. If it is not known at init time, use * ZSTD_CONTENTSIZE_UNKNOWN. Note that, for compatibility with older programs, * "0" also disables frame content size field. It may be enabled in the future. + * Note : this prototype will be marked as deprecated and generate compilation warnings on reaching v1.5.x */ -ZSTDLIB_API size_t ZSTD_initCStream_srcSize(ZSTD_CStream* zcs, int compressionLevel, unsigned long long pledgedSrcSize); +ZSTDLIB_API size_t +ZSTD_initCStream_srcSize(ZSTD_CStream* zcs, + int compressionLevel, + unsigned long long pledgedSrcSize); + /**! ZSTD_initCStream_usingDict() : * This function is deprecated, and is equivalent to: * ZSTD_CCtx_reset(zcs, ZSTD_reset_session_only); @@ -1623,42 +1695,66 @@ ZSTDLIB_API size_t ZSTD_initCStream_srcSize(ZSTD_CStream* zcs, int compressionLe * * Creates of an internal CDict (incompatible with static CCtx), except if * dict == NULL or dictSize < 8, in which case no dict is used. - * Note: dict is loaded with ZSTD_dm_auto (treated as a full zstd dictionary if + * Note: dict is loaded with ZSTD_dct_auto (treated as a full zstd dictionary if * it begins with ZSTD_MAGIC_DICTIONARY, else as raw content) and ZSTD_dlm_byCopy. + * Note : this prototype will be marked as deprecated and generate compilation warnings on reaching v1.5.x */ -ZSTDLIB_API size_t ZSTD_initCStream_usingDict(ZSTD_CStream* zcs, const void* dict, size_t dictSize, int compressionLevel); +ZSTDLIB_API size_t +ZSTD_initCStream_usingDict(ZSTD_CStream* zcs, + const void* dict, size_t dictSize, + int compressionLevel); + /**! ZSTD_initCStream_advanced() : * This function is deprecated, and is approximately equivalent to: * ZSTD_CCtx_reset(zcs, ZSTD_reset_session_only); - * ZSTD_CCtx_setZstdParams(zcs, params); // Set the zstd params and leave the rest as-is + * // Pseudocode: Set each zstd parameter and leave the rest as-is. + * for ((param, value) : params) { + * ZSTD_CCtx_setParameter(zcs, param, value); + * } * ZSTD_CCtx_setPledgedSrcSize(zcs, pledgedSrcSize); * ZSTD_CCtx_loadDictionary(zcs, dict, dictSize); * - * pledgedSrcSize must be correct. If srcSize is not known at init time, use - * value ZSTD_CONTENTSIZE_UNKNOWN. dict is loaded with ZSTD_dm_auto and ZSTD_dlm_byCopy. + * dict is loaded with ZSTD_dct_auto and ZSTD_dlm_byCopy. + * pledgedSrcSize must be correct. + * If srcSize is not known at init time, use value ZSTD_CONTENTSIZE_UNKNOWN. + * Note : this prototype will be marked as deprecated and generate compilation warnings on reaching v1.5.x */ -ZSTDLIB_API size_t ZSTD_initCStream_advanced(ZSTD_CStream* zcs, const void* dict, size_t dictSize, - ZSTD_parameters params, unsigned long long pledgedSrcSize); +ZSTDLIB_API size_t +ZSTD_initCStream_advanced(ZSTD_CStream* zcs, + const void* dict, size_t dictSize, + ZSTD_parameters params, + unsigned long long pledgedSrcSize); + /**! ZSTD_initCStream_usingCDict() : * This function is deprecated, and equivalent to: * ZSTD_CCtx_reset(zcs, ZSTD_reset_session_only); * ZSTD_CCtx_refCDict(zcs, cdict); * * note : cdict will just be referenced, and must outlive compression session + * Note : this prototype will be marked as deprecated and generate compilation warnings on reaching v1.5.x */ ZSTDLIB_API size_t ZSTD_initCStream_usingCDict(ZSTD_CStream* zcs, const ZSTD_CDict* cdict); + /**! ZSTD_initCStream_usingCDict_advanced() : - * This function is deprecated, and is approximately equivalent to: + * This function is DEPRECATED, and is approximately equivalent to: * ZSTD_CCtx_reset(zcs, ZSTD_reset_session_only); - * ZSTD_CCtx_setZstdFrameParams(zcs, fParams); // Set the zstd frame params and leave the rest as-is + * // Pseudocode: Set each zstd frame parameter and leave the rest as-is. + * for ((fParam, value) : fParams) { + * ZSTD_CCtx_setParameter(zcs, fParam, value); + * } * ZSTD_CCtx_setPledgedSrcSize(zcs, pledgedSrcSize); * ZSTD_CCtx_refCDict(zcs, cdict); * * same as ZSTD_initCStream_usingCDict(), with control over frame parameters. * pledgedSrcSize must be correct. If srcSize is not known at init time, use * value ZSTD_CONTENTSIZE_UNKNOWN. + * Note : this prototype will be marked as deprecated and generate compilation warnings on reaching v1.5.x */ -ZSTDLIB_API size_t ZSTD_initCStream_usingCDict_advanced(ZSTD_CStream* zcs, const ZSTD_CDict* cdict, ZSTD_frameParameters fParams, unsigned long long pledgedSrcSize); +ZSTDLIB_API size_t +ZSTD_initCStream_usingCDict_advanced(ZSTD_CStream* zcs, + const ZSTD_CDict* cdict, + ZSTD_frameParameters fParams, + unsigned long long pledgedSrcSize); /*! ZSTD_resetCStream() : * This function is deprecated, and is equivalent to: @@ -1673,6 +1769,7 @@ ZSTDLIB_API size_t ZSTD_initCStream_usingCDict_advanced(ZSTD_CStream* zcs, const * For the time being, pledgedSrcSize==0 is interpreted as "srcSize unknown" for compatibility with older programs, * but it will change to mean "empty" in future version, so use macro ZSTD_CONTENTSIZE_UNKNOWN instead. * @return : 0, or an error code (which can be tested using ZSTD_isError()) + * Note : this prototype will be marked as deprecated and generate compilation warnings on reaching v1.5.x */ ZSTDLIB_API size_t ZSTD_resetCStream(ZSTD_CStream* zcs, unsigned long long pledgedSrcSize); @@ -1718,8 +1815,10 @@ ZSTDLIB_API size_t ZSTD_toFlushNow(ZSTD_CCtx* cctx); * ZSTD_DCtx_loadDictionary(zds, dict, dictSize); * * note: no dictionary will be used if dict == NULL or dictSize < 8 + * Note : this prototype will be marked as deprecated and generate compilation warnings on reaching v1.5.x */ ZSTDLIB_API size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dict, size_t dictSize); + /** * This function is deprecated, and is equivalent to: * @@ -1727,14 +1826,17 @@ ZSTDLIB_API size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dic * ZSTD_DCtx_refDDict(zds, ddict); * * note : ddict is referenced, it must outlive decompression session + * Note : this prototype will be marked as deprecated and generate compilation warnings on reaching v1.5.x */ ZSTDLIB_API size_t ZSTD_initDStream_usingDDict(ZSTD_DStream* zds, const ZSTD_DDict* ddict); + /** * This function is deprecated, and is equivalent to: * * ZSTD_DCtx_reset(zds, ZSTD_reset_session_only); * * re-use decompression parameters from previous init; saves dictionary loading + * Note : this prototype will be marked as deprecated and generate compilation warnings on reaching v1.5.x */ ZSTDLIB_API size_t ZSTD_resetDStream(ZSTD_DStream* zds); @@ -1908,8 +2010,8 @@ ZSTDLIB_API ZSTD_nextInputType_e ZSTD_nextInputType(ZSTD_DCtx* dctx); /*! Block functions produce and decode raw zstd blocks, without frame metadata. - Frame metadata cost is typically ~18 bytes, which can be non-negligible for very small blocks (< 100 bytes). - User will have to take in charge required information to regenerate data, such as compressed and content sizes. + Frame metadata cost is typically ~12 bytes, which can be non-negligible for very small blocks (< 100 bytes). + But users will have to take in charge needed metadata to regenerate data, such as compressed and content sizes. A few rules to respect : - Compressing and decompressing require a context structure @@ -1920,12 +2022,14 @@ ZSTDLIB_API ZSTD_nextInputType_e ZSTD_nextInputType(ZSTD_DCtx* dctx); + copyCCtx() and copyDCtx() can be used too - Block size is limited, it must be <= ZSTD_getBlockSize() <= ZSTD_BLOCKSIZE_MAX == 128 KB + If input is larger than a block size, it's necessary to split input data into multiple blocks - + For inputs larger than a single block, really consider using regular ZSTD_compress() instead. - Frame metadata is not that costly, and quickly becomes negligible as source size grows larger. - - When a block is considered not compressible enough, ZSTD_compressBlock() result will be zero. - In which case, nothing is produced into `dst` ! - + User must test for such outcome and deal directly with uncompressed data - + ZSTD_decompressBlock() doesn't accept uncompressed data as input !!! + + For inputs larger than a single block, consider using regular ZSTD_compress() instead. + Frame metadata is not that costly, and quickly becomes negligible as source size grows larger than a block. + - When a block is considered not compressible enough, ZSTD_compressBlock() result will be 0 (zero) ! + ===> In which case, nothing is produced into `dst` ! + + User __must__ test for such outcome and deal directly with uncompressed data + + A block cannot be declared incompressible if ZSTD_compressBlock() return value was != 0. + Doing so would mess up with statistics history, leading to potential data corruption. + + ZSTD_decompressBlock() _doesn't accept uncompressed data as input_ !! + In case of multiple successive blocks, should some of them be uncompressed, decoder must be informed of their existence in order to follow proper history. Use ZSTD_insertBlock() for such a case. |