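QR Code Recognition on Android
What is a QR code
ZXing is one of the most widely used open-source barcode libraries on Android, so this article uses it to take a closer look at how QR code recognition works. Before that, it helps to know how a 2D code is put together. Take the most common kind, the QR Code (Quick Response code), as an example:
[Figure: structure of a QR code]
QR codes are generated according to a common encoding standard, and the figure above shows the basic parts of a symbol. The finder patterns, alignment patterns and timing patterns help locate the symbol, while the format information carries the error correction level, the mask type and similar details. Recognition means using this information to decode the region that holds the data and error correction codewords. The decoding algorithms themselves are fairly complex. This article focuses on walking through the recognition and parsing flow and does not dig into the algorithms.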
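To make the role of the format information a little more concrete, here is a minimal sketch of how its 15 bits split into an error correction level and a mask pattern. The class and method names are made up for illustration and are not ZXing API. The sketch assumes the bits were read without errors, so it skips the BCH correction a real decoder performs on them.

```java
// Minimal sketch: unmask the 15-bit QR format information and split out its fields.
// FormatInfoSketch and its methods are illustrative names, not part of ZXing.
final class FormatInfoSketch {
    // Fixed pattern that the QR standard XORs onto the format bits.
    static final int FORMAT_MASK = 0x5412;

    // Bits 14..13 of the unmasked value: error correction level indicator.
    static int ecLevelBits(int rawFormatBits) {
        return ((rawFormatBits ^ FORMAT_MASK) >> 13) & 0x03;
    }

    // Bits 12..10 of the unmasked value: data mask pattern (0..7).
    static int maskPattern(int rawFormatBits) {
        return ((rawFormatBits ^ FORMAT_MASK) >> 10) & 0x07;
    }

    // Indicator values defined by the standard: 01 = L, 00 = M, 11 = Q, 10 = H.
    static char ecLevelName(int ecBits) {
        switch (ecBits) {
            case 0: return 'M';
            case 1: return 'L';
            case 2: return 'H';
            default: return 'Q';
        }
    }
}
```

The remaining 10 bits are error correction bits for the format information itself, which this sketch ignores.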
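Getting camera preview frames
With a rough picture of what a QR code is, we can happily (read: painfully) start reading the ZXing source. After cloning the code from GitHub, an Android project only needs the sources under the android and core packages. The project structure after building looks like this:
[Figure: project structure]
The core classes are CaptureActivity, CaptureActivityHandler, DecodeThread, DecodeHandler, QRCodeReader, PlanarYUVLuminanceSource, HybridBinarizer, BitMatrix, Detector and Decoder.
The camera is initialized in CaptureActivity, the entry point of the project: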
private void initCamera(SurfaceHolder surfaceHolder) {
...
cameraManager.openDriver(surfaceHolder);
// Creating the handler starts the preview, which can also throw a RuntimeException.
if (handler == null) {
handler = new CaptureActivityHandler(this, decodeFormats, decodeHints, characterSet, cameraManager);
}
...
The CaptureActivityHandler constructor creates and starts a worker thread, DecodeThread. As shown later, DecodeThread builds a message handler, DecodeHandler, which then waits for each captured frame.
CaptureActivityHandler(CaptureActivity activity,
Collection<BarcodeFormat> decodeFormats,
Map<DecodeHintType,?> baseHints,
String characterSet,
CameraManager cameraManager) {
this.activity = activity;
decodeThread = new DecodeThread(activity, decodeFormats, baseHints, characterSet,
new ViewfinderResultPointCallback(activity.getViewfinderView()));
decodeThread.start();
state = State.SUCCESS;
// Start ourselves capturing previews and decoding.
this.cameraManager = cameraManager;
// Start the scanning flow
cameraManager.startPreview();
restartPreviewAndDecode();
}
private void restartPreviewAndDecode() {
...
cameraManager.requestPreviewFrame(decodeThread.getHandler(), R.id.decode);
...
}
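The camera starts scanning and a preview frame callback is registered on it. The handler passed in here is the DecodeHandler and the message id is R.id.decode. Once the camera has a frame, it sends a message that DecodeHandler processes.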
public synchronized void requestPreviewFrame(Handler handler, int message) {
OpenCamera theCamera = camera;
if (theCamera != null && previewing) {
// handler is the DecodeHandler, message is R.id.decode
previewCallback.setHandler(handler, message);
theCamera.getCamera().setOneShotPreviewCallback(previewCallback);
}
}
Below is the callback in PreviewCallback; thePreviewHandler is the DecodeHandler:
@Override
public void onPreviewFrame(byte[] data, Camera camera) {
Point cameraResolution = configManager.getCameraResolution();
Handler thePreviewHandler = previewHandler;
if (cameraResolution != null && thePreviewHandler != null) {
Message message = thePreviewHandler.obtainMessage(previewMessage, cameraResolution.x,
cameraResolution.y, data);
message.sendToTarget();
previewHandler = null;
} else {
Log.d(TAG, "Got preview callback, but no handler or resolution available");
}
}
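The key detail in the callback above is that each request delivers exactly one frame: the handler is cleared after use, and the decoder has to ask again for the next frame. A minimal plain-Java sketch of that one-shot pattern follows. It uses a `Consumer<byte[]>` in place of Android's Handler and Message, and the class name is hypothetical.

```java
import java.util.function.Consumer;

// Minimal sketch of a one-shot frame callback: each registered handler
// receives at most one frame and must be registered again for the next one.
// OneShotFrameCallback is an illustrative name, not a ZXing class.
final class OneShotFrameCallback {
    private Consumer<byte[]> frameHandler;

    // Corresponds to requesting a preview frame: remember who wants the next frame.
    synchronized void setHandler(Consumer<byte[]> handler) {
        frameHandler = handler;
    }

    // Called for every frame; frames that nobody asked for are dropped.
    synchronized void onPreviewFrame(byte[] data) {
        Consumer<byte[]> handler = frameHandler;
        if (handler != null) {
            frameHandler = null; // clear first, so the handler may re-register itself
            handler.accept(data);
        }
    }
}
```

Dropping frames while a decode is still in progress keeps the decoder from falling behind the camera.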
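Parsing the preview frame
Each preview frame is delivered to DecodeHandler, which starts parsing it. Before parsing, the data source is wrapped. To a computer, a two-dimensional image is usually represented as a matrix in which every pixel is one element. The PlanarYUVLuminanceSource class abstracts the camera's YUV data into this matrix form for the later parsing steps.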
private void decode(byte[] data, int width, int height) {
...
Result rawResult = null;
PlanarYUVLuminanceSource source = activity.getCameraManager().buildLuminanceSource(data, width, height);
if (source != null) {
BinaryBitmap bitmap = new BinaryBitmap(new HybridBinarizer(source));
try {
// Find the matching decoder
rawResult = multiFormatReader.decodeWithState(bitmap);
} catch (ReaderException re) {
// continue
} finally {
multiFormatReader.reset();
}
...
}
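As a rough illustration of what "wrapping the YUV data as a matrix" means, the sketch below copies the luminance bytes of a crop rectangle out of a camera frame. It assumes an NV21-style layout in which the first width × height bytes are the Y (luminance) plane, one byte per pixel. The class and method names are hypothetical and this is not the PlanarYUVLuminanceSource implementation.

```java
// Minimal sketch: view the Y plane of a YUV frame as a row-major luminance matrix.
// Assumes the frame starts with dataWidth * dataHeight luminance bytes (e.g. NV21).
final class LuminanceCropSketch {
    // Copies the luminance values inside the crop rectangle into a new
    // row-major array of size width * height.
    static byte[] cropLuminance(byte[] yuv, int dataWidth, int dataHeight,
                                int left, int top, int width, int height) {
        if (left < 0 || top < 0 || width <= 0 || height <= 0
                || left + width > dataWidth || top + height > dataHeight) {
            throw new IllegalArgumentException("Crop rectangle does not fit within image data.");
        }
        if (yuv.length < dataWidth * dataHeight) {
            throw new IllegalArgumentException("Frame is too short to hold the Y plane.");
        }
        byte[] matrix = new byte[width * height];
        for (int row = 0; row < height; row++) {
            int sourceOffset = (top + row) * dataWidth + left;
            System.arraycopy(yuv, sourceOffset, matrix, row * width, width);
        }
        return matrix;
    }

    // Luminance at column x, row y as an unsigned value in 0..255.
    static int luminanceAt(byte[] matrix, int width, int x, int y) {
        return matrix[y * width + x] & 0xFF;
    }
}
```

Only the Y plane matters here because barcode decoding works on brightness and the chroma bytes that follow it are ignored.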
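multiFormatReader.decodeWithState(bitmap) tries the readers for the various barcode formats in turn until one of them can decode the image. For QR codes that reader is QRCodeReader, and this is where the core decoding steps begin: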
@Override
public final Result decode(BinaryBitmap image, Map<DecodeHintType,?> hints)
throws NotFoundException, ChecksumException, FormatException {
DecoderResult decoderResult;
ResultPoint[] points;
...
// 1. Binarize the YUV data; getBlackMatrix() ends up returning a 0/1 matrix
DetectorResult detectorResult = new Detector(image.getBlackMatrix()).detect(hints);
// 2. Decode according to the QR code generation standard
decoderResult = decoder.decode(detectorResult.getBits(), hints);
points = detectorResult.getPoints();
...
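Binarization
Look at image.getBlackMatrix at comment 1 in the code above. It actually ends up in HybridBinarizer.getBlackMatrix. As the name suggests, HybridBinarizer is a binarizer: after binarization, a grayscale or color image becomes a black-and-white image. Binarization simplifies the representation of the image, makes shapes and outlines clearer, and reduces noise, so the key parts are easier for a computer to extract. For QR code recognition, the black-and-white image also maps directly onto the black and white modules of the code. The process uses an algorithm to find a threshold. A pixel whose luminance is below the threshold is treated as black, and its position is recorded in a BitMatrix. The BitMatrix accessors below show how the matrix is stored: x is the column, y is the row, and a return value of true means the point is black.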
/**
* <p>Gets the requested bit, where true means black.</p>
* Returns whether the given pixel is black.
* @param x The horizontal component (i.e. which column)
* @param y The vertical component (i.e. which row)
* @return value of given bit in matrix
*/
public boolean get(int x, int y) {
int offset = y * rowSize + (x / 32);
return ((bits[offset] >>> (x & 0x1f)) & 1) != 0;
}
/**
* <p>Sets the given bit to true.</p>
* Marks the given pixel as black (true).
* @param x The horizontal component (i.e. which column)
* @param y The vertical component (i.e. which row)
*/
public void set(int x, int y) {
int offset = y * rowSize + (x / 32);
bits[offset] |= 1 << (x & 0x1f);
}
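The bit packing above and the thresholding idea can be combined in a small self-contained sketch. Note the simplification: the sketch uses a single global threshold (the mean luminance), which is an assumption made for brevity. HybridBinarizer computes local thresholds per block of the image, so this is not its algorithm. The class name is hypothetical.

```java
// Minimal sketch: pack black/white pixels into 32-bit words and fill them
// with a simple global-mean threshold (NOT the HybridBinarizer algorithm).
final class BinarizeSketch {
    final int width;
    final int height;
    final int rowSize; // number of 32-bit words per row
    final int[] bits;

    BinarizeSketch(int width, int height) {
        this.width = width;
        this.height = height;
        this.rowSize = (width + 31) / 32;
        this.bits = new int[rowSize * height];
    }

    // x is the column, y is the row; true means black.
    boolean get(int x, int y) {
        int offset = y * rowSize + (x / 32);
        return ((bits[offset] >>> (x & 0x1f)) & 1) != 0;
    }

    void set(int x, int y) {
        int offset = y * rowSize + (x / 32);
        bits[offset] |= 1 << (x & 0x1f);
    }

    // luminance holds row-major values in 0..255; pixels darker than the mean become black.
    static BinarizeSketch binarize(int[] luminance, int width, int height) {
        long sum = 0;
        for (int value : luminance) {
            sum += value;
        }
        int threshold = (int) (sum / luminance.length);
        BinarizeSketch matrix = new BinarizeSketch(width, height);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                if (luminance[y * width + x] < threshold) {
                    matrix.set(x, y);
                }
            }
        }
        return matrix;
    }
}
```

A global threshold breaks down under shadows or uneven lighting, which is the usual motivation for a local scheme.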
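Locating the QR code region
After the captured image has been binarized, the next step is to locate the QR code within it. Look again at comment 1 in the code above, this time at the second half: Detector.detect first locates the three finder patterns and the alignment pattern, and from them the position of the QR code in the image. Scanned horizontally, the black/white/black/white/black runs of a finder pattern have the ratio 1:1:3:1:1, and this ratio is used to find the approximate positions of the finder patterns.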
public final DetectorResult detect(Map<DecodeHintType,?> hints) throws NotFoundException, FormatException {
resultPointCallback = hints == null ? null :
(ResultPointCallback) hints.get(DecodeHintType.NEED_RESULT_POINT_CALLBACK);
FinderPatternFinder finder = new FinderPatternFinder(image, resultPointCallback);
// Find the three finder patterns
FinderPatternInfo info = finder.find(hints);
// From the finder patterns, locate the alignment pattern and the data region
return processFinderPatternInfo(info);
}
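The 1:1:3:1:1 test is easy to sketch on a single row of the binarized image. The code below is a simplified illustration, not ZXing's FinderPatternFinder: it scans one row only, allows each run to deviate by up to half a module, and returns the center column of the first match. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: look for black/white/black/white/black runs in the
// ratio 1:1:3:1:1 along one row (true = black pixel).
final class FinderRatioSketch {
    // runs holds five consecutive run lengths, starting with a black run.
    static boolean matchesFinderRatio(int[] runs) {
        int total = 0;
        for (int run : runs) {
            if (run == 0) {
                return false;
            }
            total += run;
        }
        if (total < 7) {
            return false; // a finder pattern is 7 modules wide
        }
        float moduleSize = total / 7.0f;
        float maxVariance = moduleSize / 2.0f; // tolerance of half a module
        return Math.abs(moduleSize - runs[0]) < maxVariance
                && Math.abs(moduleSize - runs[1]) < maxVariance
                && Math.abs(3.0f * moduleSize - runs[2]) < 3.0f * maxVariance
                && Math.abs(moduleSize - runs[3]) < maxVariance
                && Math.abs(moduleSize - runs[4]) < maxVariance;
    }

    // Returns the center column of the first matching pattern, or -1 if none is found.
    static int findFinderCenter(boolean[] row) {
        List<int[]> runs = new ArrayList<>(); // each entry: {start, length, isBlack}
        int i = 0;
        while (i < row.length) {
            int j = i;
            while (j < row.length && row[j] == row[i]) {
                j++;
            }
            runs.add(new int[] {i, j - i, row[i] ? 1 : 0});
            i = j;
        }
        for (int k = 0; k + 4 < runs.size(); k++) {
            if (runs.get(k)[2] != 1) {
                continue; // a candidate must start with a black run
            }
            int[] window = new int[5];
            for (int n = 0; n < 5; n++) {
                window[n] = runs.get(k + n)[1];
            }
            if (matchesFinderRatio(window)) {
                int start = runs.get(k)[0];
                int end = runs.get(k + 4)[0] + runs.get(k + 4)[1];
                return (start + end) / 2;
            }
        }
        return -1;
    }
}
```

A real detector has to confirm each candidate by checking the same ratio vertically through the candidate center, since a single row can match by accident.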
Decoding
With the 0/1 matrix in hand, decoding reverses the steps of the QR code generation standard.
private DecoderResult decode(BitMatrixParser parser, Map<DecodeHintType,?> hints)
throws FormatException, ChecksumException {
// Read the version information
Version version = parser.readVersion();
// Read the error correction level
ErrorCorrectionLevel ecLevel = parser.readFormatInformation().getErrorCorrectionLevel();
// Read codewords (the data region after the mask has been removed)
byte[] codewords = parser.readCodewords();
// Separate into data blocks
DataBlock[] dataBlocks = DataBlock.getDataBlocks(codewords, version, ecLevel);
// Count total number of data bytes
int totalBytes = 0;
for (DataBlock dataBlock : dataBlocks) {
totalBytes += dataBlock.getNumDataCodewords();
}
byte[] resultBytes = new byte[totalBytes];
int resultOffset = 0;
// Error-correct and copy data blocks together into a stream of bytes
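// During encoding the data codewords were split into blocks and each block was
// error-coded separately, so each block is error-corrected on its own here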
for (DataBlock dataBlock : dataBlocks) {
byte[] codewordBytes = dataBlock.getCodewords();
int numDataCodewords = dataBlock.getNumDataCodewords();
correctErrors(codewordBytes, numDataCodewords);
for (int i = 0; i < numDataCodewords; i++) {
resultBytes[resultOffset++] = codewordBytes[i];
}
}
// Decode the contents of that stream of bytes
return DecodedBitStreamParser.decode(resultBytes, version, ecLevel, hints);
}
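Reading the codewords depends on removing the data mask first. As a sketch of that step, the code below lists the eight mask conditions from the QR standard (i is the row, j is the column) and flips the affected modules. The class name and the `isFunction` array, which marks modules that belong to finder, timing, format and similar patterns, are assumptions made for this illustration and are not ZXing's data structures.

```java
// Minimal sketch: remove a QR data mask by flipping every data module for
// which the mask condition holds. Applying the same mask twice restores the input.
final class DataMaskSketch {
    // Mask conditions from the QR standard; i = row, j = column.
    static boolean isMasked(int pattern, int i, int j) {
        switch (pattern) {
            case 0: return (i + j) % 2 == 0;
            case 1: return i % 2 == 0;
            case 2: return j % 3 == 0;
            case 3: return (i + j) % 3 == 0;
            case 4: return (i / 2 + j / 3) % 2 == 0;
            case 5: return (i * j) % 2 + (i * j) % 3 == 0;
            case 6: return ((i * j) % 2 + (i * j) % 3) % 2 == 0;
            case 7: return ((i * j) % 3 + (i + j) % 2) % 2 == 0;
            default: throw new IllegalArgumentException("mask pattern must be 0..7");
        }
    }

    // modules[i][j] is true for black; function-pattern modules are never masked.
    static void unmask(boolean[][] modules, boolean[][] isFunction, int pattern) {
        for (int i = 0; i < modules.length; i++) {
            for (int j = 0; j < modules[i].length; j++) {
                if (!isFunction[i][j] && isMasked(pattern, i, j)) {
                    modules[i][j] = !modules[i][j];
                }
            }
        }
    }
}
```

The mask pattern number comes from the format information, which is why the format information has to be read before the codewords.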
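Once auxiliary information such as the version and error correction level is available, the data region is decoded. The main encoding modes a QR code supports are numeric, alphanumeric, byte and Kanji. A symbol can also mix several modes, and there are some special-purpose modes.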
static DecoderResult decode(byte[] bytes,
Version version,
ErrorCorrectionLevel ecLevel,
Map<DecodeHintType,?> hints) throws FormatException {
BitSource bits = new BitSource(bytes);
StringBuilder result = new StringBuilder(50);
List<byte[]> byteSegments = new ArrayList<>(1);
int symbolSequence = -1;
int parityData = -1;
try {
CharacterSetECI currentCharacterSetECI = null;
boolean fc1InEffect = false;
Mode mode;
do {
// While still another segment to read...
if (bits.available() < 4) {
// OK, assume we're done. Really, a TERMINATOR mode should have been recorded here
mode = Mode.TERMINATOR;
} else {
mode = Mode.forBits(bits.readBits(4)); // mode is encoded by 4 bits
}
switch (mode) {
case TERMINATOR:
break;
case FNC1_FIRST_POSITION:
case FNC1_SECOND_POSITION:
// We do little with FNC1 except alter the parsed result a bit according to the spec
fc1InEffect = true;
break;
case STRUCTURED_APPEND: // structured append: one message split across several symbols
if (bits.available() < 16) {
throw FormatException.getFormatInstance();
}
// sequence number and parity is added later to the result metadata
// Read next 8 bits (symbol sequence #) and 8 bits (parity data), then continue
symbolSequence = bits.readBits(8);
parityData = bits.readBits(8);
break;
case ECI: // Extended Channel Interpretation: selects a specific character set
// Count doesn't apply to ECI
int value = parseECIValue(bits);
currentCharacterSetECI = CharacterSetECI.getCharacterSetECIByValue(value);
if (currentCharacterSetECI == null) {
throw FormatException.getFormatInstance();
}
break;
case HANZI:
// First handle Hanzi mode which does not start with character count
// Chinese mode contains a sub set indicator right after mode indicator
int subset = bits.readBits(4);
int countHanzi = bits.readBits(mode.getCharacterCountBits(version));
if (subset == GB2312_SUBSET) {
decodeHanziSegment(bits, result, countHanzi);
}
break;
default:
// "Normal" QR code modes:
// How many characters will follow, encoded in this mode?
int count = bits.readBits(mode.getCharacterCountBits(version));
switch (mode) {
case NUMERIC: // digits
decodeNumericSegment(bits, result, count);
break;
case ALPHANUMERIC: // digits, uppercase letters and a few symbols
decodeAlphanumericSegment(bits, result, count, fc1InEffect);
break;
case BYTE: // raw bytes; Chinese text, for example, usually ends up in this mode
decodeByteSegment(bits, result, count, currentCharacterSetECI, byteSegments, hints);
break;
case KANJI: // Japanese Kanji
decodeKanjiSegment(bits, result, count);
break;
default:
throw FormatException.getFormatInstance();
}
break;
}
} while (mode != Mode.TERMINATOR);
} catch (IllegalArgumentException iae) {
// from readBits() calls
throw FormatException.getFormatInstance();
}
return new DecoderResult(bytes,
result.toString(),
byteSegments.isEmpty() ? null : byteSegments,
ecLevel == null ? null : ecLevel.toString(),
symbolSequence,
parityData);
}
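To show what a segment looks like at the bit level, here is a small sketch that decodes numeric and alphanumeric segments from a string of '0' and '1' characters. It is a simplification under stated assumptions: it only handles those two modes plus the terminator, and it hardcodes the character count lengths for versions 1 to 9 (10 bits for numeric, 9 for alphanumeric). The class is hypothetical and much simpler than DecodedBitStreamParser.

```java
// Minimal sketch: decode NUMERIC (0001) and ALPHANUMERIC (0010) segments
// from a bit string, assuming a version 1-9 symbol.
final class SegmentDecodeSketch {
    private static final String ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:";
    private final String bits;
    private int pos;

    private SegmentDecodeSketch(String bits) {
        this.bits = bits;
    }

    private int available() {
        return bits.length() - pos;
    }

    private int read(int n) {
        if (n > available()) {
            throw new IllegalArgumentException("not enough bits");
        }
        int value = Integer.parseInt(bits.substring(pos, pos + n), 2);
        pos += n;
        return value;
    }

    static String decode(String bitString) {
        SegmentDecodeSketch in = new SegmentDecodeSketch(bitString);
        StringBuilder out = new StringBuilder();
        while (in.available() >= 4) {
            int mode = in.read(4); // the mode indicator is 4 bits
            if (mode == 0) {
                break; // terminator
            } else if (mode == 1) {
                in.numeric(out, in.read(10));
            } else if (mode == 2) {
                in.alphanumeric(out, in.read(9));
            } else {
                throw new IllegalArgumentException("mode not covered by this sketch: " + mode);
            }
        }
        return out.toString();
    }

    // Three digits per 10 bits, then 7 bits for two leftover digits or 4 bits for one.
    private void numeric(StringBuilder out, int count) {
        while (count > 0) {
            int digits = Math.min(count, 3);
            int value = read(digits == 3 ? 10 : digits == 2 ? 7 : 4);
            int limit = digits == 3 ? 1000 : digits == 2 ? 100 : 10;
            if (value >= limit) {
                throw new IllegalArgumentException("invalid numeric group");
            }
            for (int divisor = limit / 10; divisor >= 1; divisor /= 10) {
                out.append((char) ('0' + (value / divisor) % 10));
            }
            count -= digits;
        }
    }

    // Two characters per 11 bits (first * 45 + second), 6 bits for a leftover character.
    private void alphanumeric(StringBuilder out, int count) {
        while (count > 1) {
            int value = read(11);
            if (value >= 45 * 45) {
                throw new IllegalArgumentException("invalid alphanumeric pair");
            }
            out.append(ALPHANUMERIC.charAt(value / 45)).append(ALPHANUMERIC.charAt(value % 45));
            count -= 2;
        }
        if (count == 1) {
            int value = read(6);
            if (value >= 45) {
                throw new IllegalArgumentException("invalid alphanumeric character");
            }
            out.append(ALPHANUMERIC.charAt(value));
        }
    }
}
```

For example, the bit string `0001 0000001000 0000001100 0101011001 1000011 0000` (written here with spaces for readability, which must be removed before calling `decode`) is a numeric segment of 8 digits and decodes to `01234567`.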
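Finally a DecoderResult is returned, and at last we see the familiar text string. With this result in hand, a URL for example, the app can open the system browser or handle it in its own way.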
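As one possible way to decide whether the decoded text should go to a browser, the sketch below checks for an http or https URL using only java.net.URI. This is an illustrative helper with a hypothetical name and does not show how the ZXing Android client classifies results.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Minimal sketch: decide whether decoded text looks like a web URL.
final class ResultTextSketch {
    static boolean looksLikeWebUrl(String text) {
        if (text == null) {
            return false;
        }
        try {
            URI uri = new URI(text.trim());
            String scheme = uri.getScheme();
            boolean webScheme = "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
            return webScheme && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false; // not a well-formed URI, treat it as plain text
        }
    }
}
```

Text that fails the check can be shown to the user as plain text instead of being launched.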