1. Building libyuv for Android

Date: 2023-06-06 01:07:00

libyuv is an open source project that includes YUV scaling and conversion functionality.

Scale YUV to prepare content for compression, with point, bilinear or box filter.
Convert to YUV from webcam formats.
Convert from YUV to formats for rendering/effects.
Rotate by 90/180/270 degrees to adjust for mobile devices in portrait mode.
Optimized for SSE2/SSSE3/AVX2 on x86/x64.
Optimized for Neon on Arm.
Optimized for DSP R2 on Mips.



Running ndk-build directly is enough.

References:

I. Getting Started
How to get and build the libyuv code.
Pre-requisites
Getting the Code
Create a working directory, enter it, and run:
gclient config https://chromium.googlesource.com/libyuv/libyuv
gclient sync
Then you'll get a .gclient file like:
solutions = [
  { "name"        : "libyuv",
    "url"         : "https://chromium.googlesource.com/libyuv/libyuv",
    "deps_file"   : "DEPS",
    "managed"     : True,
    "custom_deps" : {
    },
    "safesync_url": "",
  },
];
For iOS add  ;target_os=['ios'];  to your OSX .gclient and run  GYP_DEFINES="OS=ios" gclient sync.
Android
For Android add  ;target_os=['android'];  to your Linux .gclient
solutions = [
  { "name"        : "libyuv",
    "url"         : "https://chromium.googlesource.com/libyuv/libyuv",
    "deps_file"   : "DEPS",
    "managed"     : True,
    "custom_deps" : {
    },
    "safesync_url": "",
  },
];
target_os = ["android", "unix"];
Then run:
export GYP_DEFINES="OS=android"
gclient sync
Caveat: There's an error with Google Play services updates. If you get the error “Your version of the Google Play services library is not up to date”, run the following:
cd chromium/src
./build/android/play_services/update.py download
cd ../..
For Windows the gclient sync must be done from an Administrator command prompt.
The sync will generate native build files for your environment using gyp (Windows: Visual Studio, OSX: XCode, Linux: make). This generation can also be forced manually:  gclient runhooks
To get just the source (not buildable):
git clone https://chromium.googlesource.com/libyuv/libyuv
Building the Library and Unittests
Windows
set GYP_DEFINES=target_arch=ia32
call python gyp_libyuv -fninja -G msvs_version=2013
ninja -j7 -C out\Release
ninja -j7 -C out\Debug

set GYP_DEFINES=target_arch=x64
call python gyp_libyuv -fninja -G msvs_version=2013
ninja -C out\Debug_x64
ninja -C out\Release_x64
Building with clangcl
set GYP_DEFINES=clang=1 target_arch=ia32 libyuv_enable_svn=1
set LLVM_REPO_URL=svn://svn.chromium.org/llvm-project
call python tools\clang\scripts\update.py
call python gyp_libyuv -fninja libyuv_test.gyp
ninja -C out\Debug
ninja -C out\Release
OSX
Clang 64 bit shown. Remove clang=1 for GCC and change x64 to ia32 for 32 bit.
GYP_DEFINES="clang=1 target_arch=x64" ./gyp_libyuv
ninja -j7 -C out/Debug
ninja -j7 -C out/Release

GYP_DEFINES="clang=1 target_arch=ia32" ./gyp_libyuv
ninja -j7 -C out/Debug
ninja -j7 -C out/Release
iOS
Add to .gclient last line:  target_os=['ios'];
armv7
GYP_DEFINES="OS=ios target_arch=armv7 target_subarch=arm32" GYP_CROSSCOMPILE=1 GYP_GENERATOR_FLAGS="output_dir=out_ios" ./gyp_libyuv
ninja -j7 -C out_ios/Debug-iphoneos libyuv_unittest
ninja -j7 -C out_ios/Release-iphoneos libyuv_unittest
arm64
GYP_DEFINES="OS=ios target_arch=arm64 target_subarch=arm64" GYP_CROSSCOMPILE=1 GYP_GENERATOR_FLAGS="output_dir=out_ios" ./gyp_libyuv
ninja -j7 -C out_ios/Debug-iphoneos libyuv_unittest
ninja -j7 -C out_ios/Release-iphoneos libyuv_unittest
both armv7 and arm64 (fat)
GYP_DEFINES="OS=ios target_arch=armv7 target_subarch=both" GYP_CROSSCOMPILE=1 GYP_GENERATOR_FLAGS="output_dir=out_ios" ./gyp_libyuv
ninja -j7 -C out_ios/Debug-iphoneos libyuv_unittest
ninja -j7 -C out_ios/Release-iphoneos libyuv_unittest
simulator
GYP_DEFINES="OS=ios target_arch=ia32 target_subarch=arm32" GYP_CROSSCOMPILE=1 GYP_GENERATOR_FLAGS="output_dir=out_sim" ./gyp_libyuv
ninja -j7 -C out_sim/Debug-iphonesimulator libyuv_unittest
ninja -j7 -C out_sim/Release-iphonesimulator libyuv_unittest
Android
Add to .gclient last line:  target_os=['android'];
armv7
GYP_DEFINES="OS=android" GYP_CROSSCOMPILE=1 ./gyp_libyuv
ninja -j7 -C out/Debug yuv_unittest_apk
ninja -j7 -C out/Release yuv_unittest_apk
arm64
GYP_DEFINES="OS=android target_arch=arm64 target_subarch=arm64" GYP_CROSSCOMPILE=1 ./gyp_libyuv
ninja -j7 -C out/Debug yuv_unittest_apk
ninja -j7 -C out/Release yuv_unittest_apk
ia32
GYP_DEFINES="OS=android target_arch=ia32" GYP_CROSSCOMPILE=1 ./gyp_libyuv
ninja -j7 -C out/Debug yuv_unittest_apk
ninja -j7 -C out/Release yuv_unittest_apk

GYP_DEFINES="OS=android target_arch=ia32 android_full_debug=1" GYP_CROSSCOMPILE=1 ./gyp_libyuv
ninja -j7 -C out/Debug yuv_unittest_apk
mipsel
GYP_DEFINES="OS=android target_arch=mipsel" GYP_CROSSCOMPILE=1 ./gyp_libyuv
ninja -j7 -C out/Debug yuv_unittest_apk
ninja -j7 -C out/Release yuv_unittest_apk
arm32 disassembly:
third_party/android_tools/ndk/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-objdump -d out/Release/obj/source/libyuv.row_neon.o
arm64 disassembly:
third_party/android_tools/ndk/toolchains/aarch64-linux-android-4.9/prebuilt/linux-x86_64/bin/aarch64-linux-android-objdump -d out/Release/obj/source/libyuv.row_neon64.o
Running tests:
util/android/test_runner.py gtest -s libyuv_unittest -t 7200 --verbose --release --gtest_filter=*
Running test as benchmark:
util/android/test_runner.py gtest -s libyuv_unittest -t 7200 --verbose --release --gtest_filter=* -a "--libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1"
Running test with C code:
util/android/test_runner.py gtest -s libyuv_unittest -t 7200 --verbose --release --gtest_filter=* -a "--libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=1 --libyuv_cpu_info=1"
Building with GN
gn gen out/Release "--args=is_debug=false target_cpu=\"x86\""
gn gen out/Debug "--args=is_debug=true target_cpu=\"x86\""
ninja -C out/Release
ninja -C out/Debug
Building Official with GN
gn gen out/Official "--args=is_debug=false is_official_build=true is_chrome_branded=true"
ninja -C out/Official
Linux
GYP_DEFINES="target_arch=x64" ./gyp_libyuv
ninja -j7 -C out/Debug
ninja -j7 -C out/Release

GYP_DEFINES="target_arch=ia32" ./gyp_libyuv
ninja -j7 -C out/Debug
ninja -j7 -C out/Release
CentOS
On 32-bit CentOS, the following workaround allows a sync:
export GYP_DEFINES="host_arch=ia32"
gclient sync
Windows Shared Library
In libyuv.gyp, change ‘static_library’ to ‘shared_library’ and add ‘LIBYUV_BUILDING_SHARED_LIBRARY’ to ‘defines’. Then run:
gclient runhooks
After this command, follow the library build instructions above.
Build targets
ninja -C out/Debug libyuv
ninja -C out/Debug libyuv_unittest
ninja -C out/Debug compare
ninja -C out/Debug convert
ninja -C out/Debug psnr
ninja -C out/Debug cpuid
Building the Library with make
Linux
make -j7 V=1 -f linux.mk
make -j7 V=1 -f linux.mk clean
make -j7 V=1 -f linux.mk CXX=clang++
Building the Library with cmake
Default debug build:
mkdir out
cd out
cmake ..
cmake --build .
Release build/install
mkdir out
cd out
cmake -DCMAKE_INSTALL_PREFIX="/usr/lib" -DCMAKE_BUILD_TYPE="Release" ..
cmake --build . --config Release
sudo cmake --build . --target install --config Release
Windows 8 Phone
Pre-requisite:
  • Install Visual Studio 2012 and Arm to your environment.
Then:
call "c:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_arm\vcvarsx86_arm.bat"
or with Visual Studio 2013:
call "c:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_arm\vcvarsx86_arm.bat"
nmake /f winarm.mk clean
nmake /f winarm.mk
64 bit Windows
set GYP_DEFINES=target_arch=x64
gclient runhooks V=1
ARM Linux
export GYP_DEFINES="target_arch=arm"
export CROSSTOOL=``/arm-none-linux-gnueabi
export CXX=$CROSSTOOL-g++
export CC=$CROSSTOOL-gcc
export AR=$CROSSTOOL-ar
export AS=$CROSSTOOL-as
export RANLIB=$CROSSTOOL-ranlib
gclient runhooks
Running Unittests
Windows
out\Release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter="*"
OSX
out/Release/libyuv_unittest --gtest_filter="*"
Linux
out/Release/libyuv_unittest --gtest_filter="*"
Replace --gtest_filter="*" with the specific unittest to run. The filter may include wildcards, e.g.
out/Release/libyuv_unittest --gtest_filter=libyuvTest.I420ToARGB_Opt
CPU Emulator tools
Intel SDE (Software Development Emulator)
Then run:
c:\intelsde\sde -hsw -- out\release\libyuv_unittest.exe --gtest_filter=*
Memory tools
Running Dr Memory memcheck for Windows
set GYP_DEFINES=build_for_tool=drmemory target_arch=ia32
call python gyp_libyuv -fninja -G msvs_version=2013
ninja -C out\Debug
drmemory out\Debug\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*
Running UBSan
Sanitizers available: TSan, MSan, ASan, UBSan, LSan
GYP_DEFINES='ubsan=1' gclient runhooks
ninja -C out/Release
Running Valgrind memcheck
solutions = [
  { "name"        : "libyuv",
    "url"         : "https://chromium.googlesource.com/libyuv/libyuv",
    "deps_file"   : "DEPS",
    "managed"     : True,
    "custom_deps" : {
      "libyuv/chromium/src/third_party/valgrind": "https://chromium.googlesource.com/chromium/deps/valgrind/binaries",
    },
    "safesync_url": "",
  },
]
Then run:
GYP_DEFINES="clang=0 target_arch=x64 build_for_tool=memcheck" python gyp_libyuv
ninja -C out/Debug
valgrind out/Debug/libyuv_unittest
Running Thread Sanitizer (TSan)
GYP_DEFINES="clang=0 target_arch=x64 build_for_tool=tsan" python gyp_libyuv
ninja -C out/Debug
valgrind out/Debug/libyuv_unittest
Running Address Sanitizer (ASan)
GYP_DEFINES="clang=0 target_arch=x64 build_for_tool=asan" python gyp_libyuv
ninja -C out/Debug
valgrind out/Debug/libyuv_unittest
Benchmarking
The unittests can be used to benchmark.
Windows
set LIBYUV_WIDTH=1280
set LIBYUV_HEIGHT=720
set LIBYUV_REPEAT=999
set LIBYUV_FLAGS=-1
out\Release\libyuv_unittest.exe --gtest_filter=*I420ToARGB_Opt
Linux and Mac
LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=1000 out/Release/libyuv_unittest --gtest_filter=*I420ToARGB_Opt
libyuvTest.I420ToARGB_Opt (547 ms)
This indicates 0.547 ms/frame for 1280 x 720: 547 ms in total over 1000 repeats.
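The per-frame figure is simply the reported total divided by the repeat count. A minimal sketch of that arithmetic follows; ms_per_frame is a hypothetical helper name used for illustration and is not part of libyuv.

```c
#include <assert.h>

/* Hypothetical helper: average time per frame from a unittest timing line.
 * total_ms is the time printed by the test; repeat is the LIBYUV_REPEAT value. */
double ms_per_frame(double total_ms, int repeat) {
  assert(repeat > 0);
  return total_ms / (double)repeat;
}
```

With the numbers above, ms_per_frame(547.0, 1000) is about 0.547.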
Making a change
gclient sync
git checkout -b mycl -t origin/master
git pull
git add -u
git commit -m "my change"
git cl lint
git cl try
git cl upload -r a-reviewer@chromium.org -s
git cl land


II. Filtering

Introduction

This document discusses the current state of filtering in libyuv. The emphasis is on maximum performance while avoiding memory exceptions, with a minimal amount of code and complexity. See Future Work at the end.

LibYuv Filter Subsampling

There are 2 challenges with subsampling:

  • centering of samples, which involves clamping on edges
  • clipping a source region

Centering depends on scale factor and filter mode.

Down Sampling

If scaling down, the stepping rate is always src_width / dst_width.

dx = src_width / dst_width;
For example, when scaling from 1280x720 to 640x360, the step through the source is 2.0, stepping over 2 pixels of source for each pixel of destination.
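The scale snippet later in this document works in 16.16 fixed point, where a step of 2.0 is stored as 0x20000. A minimal sketch of that representation follows, assuming a plain 64-bit division; step_16_16 is a hypothetical name for illustration, not a libyuv function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: src_width / dst_width as a 16.16 fixed-point step.
 * The upper 16 bits hold the whole pixels, the lower 16 bits the fraction. */
int step_16_16(int src_width, int dst_width) {
  assert(src_width > 0 && dst_width > 0);
  return (int)(((int64_t)src_width << 16) / dst_width);
}
```

For 1280 to 640 this gives 0x20000: an integer part (value >> 16) of 2 and a fraction (value & 0xffff) of 0.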

Centering depends on the filter mode.

Point downsampling takes the middle pixel.

x = dx >> 1;
For odd scale factors (e.g. 3x down) this is exactly the middle. For even scale factors, this rounds up and takes the pixel to the right of center. e.g. scale of 4x down will take pixel 2.
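Both cases can be checked with integer arithmetic. The sketch below assumes a whole-number scale factor held in 16.16 fixed point; point_start_pixel is a hypothetical helper name.

```c
#include <assert.h>

/* Hypothetical helper: first source pixel taken by point downsampling
 * for a whole-number scale factor, using x = dx >> 1. */
int point_start_pixel(int scale_factor) {
  assert(scale_factor >= 1 && scale_factor < 32768);
  int dx = scale_factor << 16; /* step in 16.16 fixed point */
  int x = dx >> 1;             /* half a step into the source */
  return x >> 16;              /* integer part is the pixel index */
}
```

A factor of 3 yields pixel 1, the exact middle of pixels 0..2, while a factor of 4 yields pixel 2, the pixel just right of center in 0..3.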

The bilinear filter uses the 2x2 pixels in the middle.

x = dx / 2 - 0.5;
For odd scale factors (e.g. 3x down) this is exactly the middle, and point sampling is used. For even scale factors, this evenly filters the middle 2x2 pixels. e.g. 4x down will filter pixels 1,2 at 50% in both directions.
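The starting coordinate can be split into a pixel index and a blend fraction to see both cases. This is a sketch in 16.16 fixed point for whole-number scale factors; bilinear_start_16_16 is a hypothetical name.

```c
#include <assert.h>

/* Hypothetical helper: x = dx / 2 - 0.5 in 16.16 fixed point. */
int bilinear_start_16_16(int scale_factor) {
  assert(scale_factor >= 1 && scale_factor < 32768);
  int dx = scale_factor << 16; /* step in 16.16 fixed point */
  return (dx >> 1) - 0x8000;   /* 0x8000 is 0.5 in 16.16 */
}
```

For 4x down the result is 0x18000: pixel index 1 (value >> 16) with a fraction of 0x8000 (value & 0xffff), so pixels 1 and 2 are blended 50/50. For 3x down the result is 0x10000: pixel 1 with a zero fraction, which degenerates to point sampling.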

Box filter averages the entire box so sampling starts at 0.

x = 0;
For a scale factor of 2x down, this is equivalent to bilinear.

Up Sampling

Point upsampling uses a stepping rate of src_width / dst_width and a starting coordinate of 0.

x = 0;
dx = src_width / dst_width;
For example, when scaling from 640x360 to 1280x720, the step through the source is 0.5, stepping half a pixel of source for each pixel of destination. Each pixel is replicated by the scale factor.
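With a step of half a source pixel (640 / 1280), consecutive destination pixels land on the same source pixel, which is how the replication arises. A minimal sketch, assuming 16.16 fixed-point stepping from x = 0; point_upsample_src_index is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: source pixel read for destination pixel dst_x
 * when point upsampling with x = 0 and dx = src_width / dst_width. */
int point_upsample_src_index(int src_width, int dst_width, int dst_x) {
  assert(src_width > 0 && dst_width >= src_width);
  assert(dst_x >= 0 && dst_x < dst_width);
  int dx = (int)(((int64_t)src_width << 16) / dst_width); /* 16.16 step */
  int64_t x = (int64_t)dst_x * dx;                        /* start at 0 */
  return (int)(x >> 16);                                  /* integer part */
}
```

Going from 640 to 1280 wide, destination pixels 0 and 1 both read source pixel 0, pixels 2 and 3 read source pixel 1, and so on.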

Bilinear filter stretches such that the first pixel of source maps to the first pixel of destination, and the last pixel of source maps to the last pixel of destination.

x = 0;
dx = (src_width - 1) / (dst_width - 1);
This method is not technically correct, and will likely change in the future.

It is inconsistent with the bilinear down sampler. The same method could be used for down sampling, and then it would be more reversible, but that would prevent specialized 2x down sampling.

Although centered, the image is slightly magnified.
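The endpoint mapping and the slight magnification can both be seen numerically. The sketch below uses floating point for readability rather than the fixed-point form used in the library; bilinear_up_src_pos is a hypothetical name.

```c
#include <assert.h>

/* Hypothetical helper: source position sampled for destination pixel dst_x,
 * using x = 0 and dx = (src_width - 1) / (dst_width - 1). */
double bilinear_up_src_pos(int src_width, int dst_width, int dst_x) {
  assert(src_width > 0 && dst_width > 1);
  assert(dst_x >= 0 && dst_x < dst_width);
  return (double)dst_x * (double)(src_width - 1) / (double)(dst_width - 1);
}
```

For 640 to 1280, destination pixel 0 maps to source 0 and destination pixel 1279 maps to source 639. The step is 639 / 1279, slightly under 0.5, so each source pixel covers slightly more than two destination pixels.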

The filtering was changed in early 2013. Previously it used:

x = 0;
dx = src_width / dst_width;
This is the correct scale factor, but it shifted the image left and extruded the last pixel. The reason for the change was to remove the extruding code from the low-level row functions, allowing 3 functions to share the same row functions: ARGBScale, I420Scale, and ARGBInterpolate. That one function was then ported to many CPU variations: SSE2, SSSE3, AVX2, Neon, and an ‘Any’ version for any number of pixels and alignment. The function is also specialized for 0, 25, 50, and 75%.

The above still has the potential to read the last pixel at 100% and last pixel + 1 at 0%, which may cause a memory exception. So the left pixel coordinate stops a fraction short of the last pixel, filtering in the minimum amount of the left pixel and the maximum of the last pixel.

dx = FixedDiv((src_width << 16) - 0x00010001, (dst_width << 16) - 0x00010000);
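The effect of the adjusted step can be checked by looking at where the last destination pixel lands. In this sketch fixed_div_sketch is a hypothetical stand-in that assumes FixedDiv returns num / div in 16.16 fixed point using 64-bit intermediate math; the other helper names are also illustrative, and widths are assumed to be below 32768 so the shifts do not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for FixedDiv: num / div as 16.16 fixed point. */
int fixed_div_sketch(int num, int div) {
  assert(div != 0);
  return (int)(((int64_t)num << 16) / div);
}

/* Bilinear upsampling step, following the formula above. */
int bilinear_upsample_step(int src_width, int dst_width) {
  assert(src_width > 0 && src_width < 32768);
  assert(dst_width > 1 && dst_width < 32768);
  return fixed_div_sketch((src_width << 16) - 0x00010001,
                          (dst_width << 16) - 0x00010000);
}

/* Index of the left source pixel read for the last destination pixel. */
int last_left_pixel(int src_width, int dst_width) {
  int64_t x = (int64_t)(dst_width - 1) *
              bilinear_upsample_step(src_width, dst_width);
  return (int)(x >> 16);
}
```

For 640 to 1280 the step comes out as 32742, just under 0.5 (32768). The last destination pixel then reads source pixels 638 and 639, weighted almost entirely toward 639, and never touches pixel 640.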
Box filter for upsampling switches over to Bilinear.

Scale snippet:

#define CENTERSTART(dx, s) \
  (((dx) < 0) ? -((-(dx) >> 1) + (s)) : (((dx) >> 1) + (s)))
#define FIXEDDIV1(src, dst) \
  FixedDiv(((src) << 16) - 0x00010001, ((dst) << 16) - 0x00010000)

// Compute slope values for stepping.
// All coordinates and steps are 16.16 fixed point.
// FixedDiv, Abs, FilterMode and the kFilter* constants come from libyuv.
void ScaleSlope(int src_width, int src_height,
                int dst_width, int dst_height,
                FilterMode filtering,
                int* x, int* y, int* dx, int* dy) {
  assert(x != NULL);
  assert(y != NULL);
  assert(dx != NULL);
  assert(dy != NULL);
  assert(src_width != 0);
  assert(src_height != 0);
  assert(dst_width > 0);
  assert(dst_height > 0);
  if (filtering == kFilterBox) {
    // Scale step for point sampling duplicates all pixels equally.
    *dx = FixedDiv(Abs(src_width), dst_width);
    *dy = FixedDiv(src_height, dst_height);
    *x = 0;
    *y = 0;
  } else if (filtering == kFilterBilinear) {
    // Scale step for bilinear sampling renders last pixel once for upsample.
    if (dst_width <= Abs(src_width)) {
      *dx = FixedDiv(Abs(src_width), dst_width);
      *x = CENTERSTART(*dx, -32768);  // -32768 = -0.5 to center bilinear.
    } else if (dst_width > 1) {
      *dx = FIXEDDIV1(Abs(src_width), dst_width);
      *x = 0;
    }
    if (dst_height <= src_height) {
      *dy = FixedDiv(src_height, dst_height);
      *y = CENTERSTART(*dy, -32768);  // -32768 = -0.5 to center bilinear.
    } else if (dst_height > 1) {
      *dy = FIXEDDIV1(src_height, dst_height);
      *y = 0;
    }
  } else if (filtering == kFilterLinear) {
    // Scale step for bilinear sampling renders last pixel once for upsample.
    if (dst_width <= Abs(src_width)) {
      *dx = FixedDiv(Abs(src_width), dst_width);
      *x = CENTERSTART(*dx, -32768);  // -32768 = -0.5 to center bilinear.
    } else if (dst_width > 1) {
      *dx = FIXEDDIV1(Abs(src_width), dst_width);
      *x = 0;
    }
    // Vertically, linear filtering point samples from the middle row.
    *dy = FixedDiv(src_height, dst_height);
    *y = *dy >> 1;
  } else {
    // Scale step for point sampling duplicates all pixels equally.
    *dx = FixedDiv(Abs(src_width), dst_width);
    *dy = FixedDiv(src_height, dst_height);
    *x = CENTERSTART(*dx, 0);
    *y = CENTERSTART(*dy, 0);
  }
  // Negative src_width means horizontally mirror.
  if (src_width < 0) {
    *x += (dst_width - 1) * *dx;
    *dx = -*dx;
    src_width = -src_width;
  }
}
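The mirroring step at the end of ScaleSlope moves the start to the right-hand end of the row and negates the step. The sketch below traces that for the point-sampling branch only; mirrored_point_src_index is a hypothetical name, a plain 64-bit division is assumed in place of FixedDiv, and widths are assumed small enough that the 16.16 products fit in an int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: source pixel read for destination pixel dst_x when
 * point sampling a horizontally mirrored row. abs_src_width is the
 * magnitude of the (negative) source width. */
int mirrored_point_src_index(int abs_src_width, int dst_width, int dst_x) {
  assert(abs_src_width > 0 && dst_width > 0);
  assert(dst_x >= 0 && dst_x < dst_width);
  int dx = (int)(((int64_t)abs_src_width << 16) / dst_width); /* 16.16 step */
  int x = dx >> 1;               /* CENTERSTART(dx, 0) for a positive step */
  x += (dst_width - 1) * dx;     /* move the start to the right-hand end */
  dx = -dx;                      /* then step from right to left */
  return (x + dst_x * dx) >> 16; /* integer part of the 16.16 coordinate */
}
```

For a 4-pixel row scaled to 4 pixels, destination pixels 0..3 read source pixels 3, 2, 1, 0. For 8 to 4, they read 7, 5, 3, 1, the reverse of the unmirrored order 1, 3, 5, 7.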
Future Work

Point sampling should ideally be the same as bilinear, but pixel by pixel, round to nearest neighbor. But as is, it is reversible and exactly matches ffmpeg at all scale factors, both up and down. The scale factor is

dx = src_width / dst_width;
The step value is centered for down sample:

x = dx / 2;
Or starts at 0 for upsample.

x = 0;
Bilinear filtering is currently correct for down sampling, but not for upsampling. Upsampling is stretching the first and last pixel of source, to the first and last pixel of destination.

dx = (src_width - 1) / (dst_width - 1);
x = 0;
It should be stretching such that the first pixel is centered in the middle of the scale factor, to match the pixel that would be sampled for down sampling by the same amount. And same on last pixel.

dx = src_width / dst_width;
x = dx / 2 - 0.5;
This would start at -0.5 and go to last pixel + 0.5, sampling 50% from last pixel + 1. Then clamping would be needed. On GPUs there are numerous ways to clamp.

Clamp the coordinate to the edge of the texture, duplicating the first and last pixel.
Blend with a constant color, such as transparent black. Typically best for fonts.
Mirror the UV coordinate, which is similar to clamping. Good for continuous tone images.
Wrap the coordinate, for texture tiling.
Allow the coordinate to index beyond the image, which may be the correct data if sampling a subimage.
Extrapolate the edge based on the previous pixel. pixel -0.5 is computed from slope of pixel 0 and 1.
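Three of the strategies listed above reduce to simple index arithmetic. The sketch below works on integer pixel coordinates with hypothetical function names; the mirror variant repeats the edge pixel, which is only one of several possible conventions.

```c
#include <assert.h>

/* Clamp: coordinates outside the row reuse the first or last pixel. */
int clamp_coord(int x, int width) {
  assert(width > 0);
  if (x < 0) return 0;
  if (x >= width) return width - 1;
  return x;
}

/* Wrap: coordinates repeat with period width, as for texture tiling. */
int wrap_coord(int x, int width) {
  assert(width > 0);
  int m = x % width;
  return (m < 0) ? m + width : m;
}

/* Mirror: coordinates reflect at each edge, so -1 maps to 0 and
 * width maps to width - 1. */
int mirror_coord(int x, int width) {
  assert(width > 0);
  int period = 2 * width;
  int m = x % period;
  if (m < 0) m += period;
  return (m < width) ? m : period - 1 - m;
}
```

For a 4-pixel row, coordinate -1 becomes 0 when clamped, 3 when wrapped, and 0 when mirrored; coordinate 5 becomes 3, 1, and 2 respectively.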
Some of these are computationally expensive, even for a GPU, which is one reason textures are sometimes limited to power of 2 sizes. We do care about the clipping case, where allowing coordinates to become negative and index pixels before the image is the correct data. But normally for simple scaling, we want to clamp to the edge pixel. For example, if bilinear scaling from 3x3 to 30x30, we’d essentially want 10 pixels of each of the original 3 pixels. But we want the original pixels to land in the middle of each 10 pixels, at offsets 5, 15 and 25. There would be filtering between 5 and 15 between the original pixels 0 and 1, and filtering between 15 and 25 from original pixels 1 and 2. The first 5 pixels are clamped to pixel 0 and the last 5 pixels are clamped to pixel 2. The easiest way to implement this is to copy the original 3 pixels to a buffer and duplicate the first and last pixels, so that 0,1,2 becomes 0, 0,1,2, 2. Then implement filtering without clamping. We call this source extruding. It’s only necessary on up sampling, since the down sampler will always have valid surrounding pixels. Extruding is practical when the image is
