Exploring Bitcoin: Segregated Witness
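Segregated Witness (SegWit for short) was one of the most important upgrades in Bitcoin's history, touching both the consensus rules and the network protocol. It activated on August 24, 2017, at block height 481,824. Before SegWit, validating a transaction relied on two kinds of data. One is the transaction state, which, put simply, is who pays whom and how much. The other is the witness data, which proves that the transaction is authentic and authorized. Once a transaction is fixed, its state cannot change, but the witness data can, because of how the signature algorithm is designed; in other words, more than one valid witness can exist for the same transaction. Because the witness used to be part of the data hashed into the transaction ID, an attacker could change the transaction ID simply by modifying the witness data. This is known as a malleability attack, and it is a real security risk. The Mt. Gox hack is said to have originated from this flaw.
SegWit moves the witness data out of the base transaction data, so the transaction ID depends only on the transaction state. Once a transaction has been created, nobody can change its ID any more, which resolves the malleability attack. A further benefit is that effective block capacity grows without a hard fork, and the change paves the way for the Lightning Network.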
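For more background on SegWit, see the posts "Segregated Witness" (CSDN) and "What Is Segregated Witness" (Zhihu); readers comfortable with English can go straight to the description on GitHub. As usual, this article goes directly to the source code to see how it is implemented.
So how is Segregated Witness implemented?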
1. How the witness is segregated, and where
Let's start with the transaction input, CTxIn. Here is part of its code (src/primitives/transaction.h):
class CTxIn
{
public:
    COutPoint prevout;
    CScript scriptSig;
    uint32_t nSequence;
    CScriptWitness scriptWitness; // only serialized as part of the whole transaction
    template <typename Stream, typename Operation>
    inline void SerializationOp(Stream& s, Operation ser_action) {
        READWRITE(prevout);
        READWRITE(scriptSig);
        READWRITE(nSequence);
    }
    //...
};
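Put simply, Segregated Witness moves the main content that used to live in scriptSig into scriptWitness. Note that the serialization code above does not serialize scriptWitness; it is written only when the whole transaction is serialized. For a native SegWit input, scriptSig accordingly becomes an empty script. This is the "segregation". A side benefit is lower fees: the raw serialized size does not shrink, but witness bytes are discounted when the transaction's weight (virtual size) is computed. Also note that the content of scriptWitness has been processed further and is no longer a script; see the references listed above for details.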
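To make the fee effect concrete, here is a minimal sketch of the weight rule from BIP141, where weight = base size * 3 + total size and the virtual size is the weight divided by 4, rounded up. The struct and function names (TxSizes, TxWeight, TxVirtualSize) are made up for this illustration and are not Bitcoin Core identifiers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for this sketch: two sizes of one transaction, in bytes.
struct TxSizes {
    std::int64_t base_size;   // serialized without marker, flag and witness
    std::int64_t total_size;  // serialized with marker, flag and witness
};

// BIP141 weight: every non-witness byte counts 4 units, every witness byte 1.
std::int64_t TxWeight(const TxSizes& s) {
    assert(s.base_size > 0 && s.total_size >= s.base_size);
    return s.base_size * 3 + s.total_size;
}

// Virtual size: weight / 4, rounded up.
std::int64_t TxVirtualSize(const TxSizes& s) {
    return (TxWeight(s) + 3) / 4;
}
```

For a transaction without witness data both sizes are equal, so the virtual size equals the raw size; every byte that moves into the witness counts only a quarter as much.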
So when is scriptWitness produced? In CreateTransaction, at the very end, when the signature is generated. Below is part of ProduceSignature from src/script/sign.cpp, showing only the SegWit-related portion:
bool ProduceSignature(const SigningProvider& provider, const BaseSignatureCreator& creator,
                      const CScript& fromPubKey, SignatureData& sigdata)
{
    //...
    if (solved && whichType == TX_WITNESS_V0_KEYHASH)
    {
        CScript witnessscript; // the P2PKH-style script that is actually signed
        witnessscript << OP_DUP << OP_HASH160 << ToByteVector(result[0]) << OP_EQUALVERIFY << OP_CHECKSIG;
        txnouttype subType;
        solved = solved && SignStep(provider, creator, witnessscript, result, subType,
                                    SigVersion::WITNESS_V0, sigdata);
        sigdata.scriptWitness.stack = result; // fill in scriptWitness
        sigdata.witness = true;
        result.clear(); // note that result is cleared here
    }
    else if (solved && whichType == TX_WITNESS_V0_SCRIPTHASH)
    {
        CScript witnessscript(result[0].begin(), result[0].end());
        sigdata.witness_script = witnessscript; // the witness script, the counterpart of a redeem script
        txnouttype subType;
        solved = solved && SignStep(provider, creator, witnessscript, result, subType, SigVersion::WITNESS_V0, sigdata)
                 && subType != TX_SCRIPTHASH && subType != TX_WITNESS_V0_SCRIPTHASH && subType != TX_WITNESS_V0_KEYHASH;
        result.push_back(std::vector<unsigned char>(witnessscript.begin(), witnessscript.end()));
        sigdata.scriptWitness.stack = result; // fill in scriptWitness
        sigdata.witness = true;
        result.clear(); // note that result is cleared here
    } else if (solved && whichType == TX_WITNESS_UNKNOWN) {
        sigdata.witness = true;
    }
    sigdata.scriptSig = PushAll(result); // result was cleared above, so scriptSig ends up empty
    //...
    return sigdata.complete;
}
As you can see, when SegWit is used the transaction signature is stored in scriptWitness rather than scriptSig. That is where the "segregation" comes from.
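Note that scriptWitness stores its data in a stack. When serialized, each input's witness starts with a var_int giving the number of stack items, and each item is in turn prefixed with a var_int giving its length. If an input has no witness, its witness is a single 0x00 byte.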
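The layout of one input's witness can be sketched as follows. This is a standalone illustration, not Bitcoin Core code: Bytes, WriteCompactSize and SerializeWitnessStack are names invented here, and the var_int encoding follows Bitcoin's usual compact-size rules.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Append n as `width` little-endian bytes.
void WriteLE(Bytes& out, std::uint64_t n, int width) {
    for (int i = 0; i < width; ++i) {
        out.push_back(static_cast<std::uint8_t>((n >> (8 * i)) & 0xff));
    }
}

// Append a compact-size integer (var_int): 1, 3, 5 or 9 bytes depending on n.
void WriteCompactSize(Bytes& out, std::uint64_t n) {
    if (n < 253) {
        out.push_back(static_cast<std::uint8_t>(n));
    } else if (n <= 0xFFFFu) {
        out.push_back(253);
        WriteLE(out, n, 2);
    } else if (n <= 0xFFFFFFFFu) {
        out.push_back(254);
        WriteLE(out, n, 4);
    } else {
        out.push_back(255);
        WriteLE(out, n, 8);
    }
}

// One input's witness: item count, then each item as length + raw bytes.
Bytes SerializeWitnessStack(const std::vector<Bytes>& stack) {
    Bytes out;
    WriteCompactSize(out, stack.size());
    for (const Bytes& item : stack) {
        WriteCompactSize(out, item.size());
        out.insert(out.end(), item.begin(), item.end());
    }
    return out;
}
```

An empty stack serializes to the single byte 0x00, which matches the "no witness" case described above.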
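2. Transaction ID
A transaction's txid is the double SHA256 hash of the following serialization:
[nVersion][txins][txouts][nLockTime]
With SegWit the definition of txid stays the same, but a new wtxid is added, computed over this serialization:
[nVersion][marker][flag][txins][txouts][witness][nLockTime]
Below is the relevant source from src/primitives/transaction.h (and .cpp), lightly tidied for readability: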
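class CTransaction
{
    //...
private:
    // Both hashes are computed when the transaction is constructed; they are kept in memory only and never written to disk.
    // Note that a CTransaction never changes its data; the mutable variant is CMutableTransaction.
    const uint256 hash;
    const uint256 m_witness_hash;
    uint256 ComputeHash() const { // computes the txid; note the no-witness flag
        return SerializeHash(*this, SER_GETHASH, SERIALIZE_TRANSACTION_NO_WITNESS);
    }
    uint256 ComputeWitnessHash() const { // computes the wtxid; a third argument of 0 means witness data is included
        if (!HasWitness()) return hash; // no witness data: just return hash
        return SerializeHash(*this, SER_GETHASH, 0);
    }
    //...
};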
SerializeHash streams the transaction data into a hasher, and the function that ultimately does the work is SerializeTransaction:
template<typename Stream, typename TxType>
inline void SerializeTransaction(const TxType& tx, Stream& s) {
    // whether witness data is allowed depends on the flag passed by the Compute* caller
    const bool fAllowWitness = !(s.GetVersion() & SERIALIZE_TRANSACTION_NO_WITNESS);
    s << tx.nVersion;
    unsigned char flags = 0;
    if (fAllowWitness) {
        if (tx.HasWitness()) { // witness allowed, and the transaction really has witness data
            flags |= 1;
        }
    }
    if (flags) {
        std::vector<CTxIn> vinDummy;
        s << vinDummy; // writing an empty vector writes a single 0, which is the marker
        s << flags; // the flag, always 1
    }
    s << tx.vin;
    s << tx.vout;
    if (flags & 1) { // with witness: write each input's witness data in turn
        for (size_t i = 0; i < tx.vin.size(); i++) {
            s << tx.vin[i].scriptWitness.stack;
        }
    }
    s << tx.nLockTime;
}
Here is the matching UnserializeTransaction function:
template<typename Stream, typename TxType>
inline void UnserializeTransaction(TxType& tx, Stream& s) {
    const bool fAllowWitness = !(s.GetVersion() & SERIALIZE_TRANSACTION_NO_WITNESS);
    s >> tx.nVersion;
    unsigned char flags = 0;
    tx.vin.clear();
    tx.vout.clear();
    s >> tx.vin; // read vin first to tell whether witness data is present; without witness this is the real vin
    if (tx.vin.size() == 0 && fAllowWitness) { // really empty and witness allowed: what we just read was the marker
        s >> flags; // read the flag, currently always 1
        if (flags != 0) { // then read the inputs and outputs
            s >> tx.vin;
            s >> tx.vout;
        }
    } else {
        s >> tx.vout; // vin was already read above, so only vout is left
    }
    if ((flags & 1) && fAllowWitness) {
        flags ^= 1;
        for (size_t i = 0; i < tx.vin.size(); i++) {
            s >> tx.vin[i].scriptWitness.stack; // read each input's witness data in turn
        }
    }
    if (flags) {
        // unknown flag bits remain (perhaps written by a future version): throw
        throw std::ios_base::failure("Unknown transaction optional data");
    }
    s >> tx.nLockTime;
}
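The marker trick used by the parser above can be tried out on raw bytes. The sketch below only looks at the bytes right after the 4-byte nVersion and mirrors the decisions of UnserializeTransaction when witness data is allowed; TxEncoding and ClassifyEncoding are hypothetical names, and the sketch does not parse the rest of the transaction.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

enum class TxEncoding { Legacy, Witness, UnknownFlag, Truncated };

// Classify a raw transaction by the bytes that follow the 4-byte nVersion.
TxEncoding ClassifyEncoding(const Bytes& raw) {
    if (raw.size() < 5) return TxEncoding::Truncated;
    // Byte 4 is the input count in the legacy format; a real transaction has
    // at least one input, so 0x00 here is read as the SegWit marker.
    if (raw[4] != 0x00) return TxEncoding::Legacy;
    if (raw.size() < 6) return TxEncoding::Truncated;
    const std::uint8_t flag = raw[5];
    if (flag == 0x00) return TxEncoding::Legacy;   // no inputs and no outputs
    if (flag == 0x01) return TxEncoding::Witness;  // the only flag defined so far
    return TxEncoding::UnknownFlag;                // the real parser throws here
}
```

Because a valid legacy transaction never has zero inputs, the 0x00 marker cannot be confused with a real input count, which is what makes the new format unambiguous.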
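3. Coinbase Commitment
As we know, transactions are committed to by the Merkle root, which is written into the block header so that they cannot be tampered with. After SegWit, the witness data must be made tamper-proof as well. How does Bitcoin do that?
First, all wtxids are packed into a witness version of the Merkle tree; see BlockWitnessMerkleRoot in src/consensus/merkle.cpp: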
uint256 BlockWitnessMerkleRoot(const CBlock& block, bool* mutated)
{
    std::vector<uint256> leaves;
    leaves.resize(block.vtx.size());
    leaves[0].SetNull(); // the coinbase's witness hash is 0.
    for (size_t s = 1; s < block.vtx.size(); s++) {
        leaves[s] = block.vtx[s]->GetWitnessHash();
    }
    return ComputeMerkleRoot(std::move(leaves), mutated);
}
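The shape of that tree can be shown with a small sketch. Standard C++ ships no SHA256, so the pair-combining function is passed in as a parameter; in Bitcoin it is the double SHA256 of the two concatenated child hashes. WitnessLeaves, MerkleRoot and ToyCombine are names invented for this illustration, and the toy combiner exists only to make the structure visible.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Replace the coinbase leaf (index 0) with a null value, as
// BlockWitnessMerkleRoot does with SetNull().
template <typename Hash>
std::vector<Hash> WitnessLeaves(std::vector<Hash> wtxids, const Hash& null_hash) {
    if (!wtxids.empty()) wtxids[0] = null_hash;
    return wtxids;
}

// Pairwise reduction; an odd level is padded by repeating its last entry.
template <typename Hash, typename Combine>
Hash MerkleRoot(std::vector<Hash> level, Combine combine) {
    if (level.empty()) return Hash();
    while (level.size() > 1) {
        if (level.size() % 2 != 0) {
            Hash last = level.back();
            level.push_back(last);
        }
        assert(level.size() % 2 == 0);
        std::vector<Hash> next;
        next.reserve(level.size() / 2);
        for (std::size_t i = 0; i < level.size(); i += 2) {
            next.push_back(combine(level[i], level[i + 1]));
        }
        level = std::move(next);
    }
    return level.front();
}

// Toy stand-in for the real hash: shows which leaves were paired.
std::string ToyCombine(const std::string& a, const std::string& b) {
    return "(" + a + "," + b + ")";
}
```

With the leaves "cb", "a", "b" and "0" as the null value, the toy root comes out as ((0,a),(b,b)): the coinbase slot is zeroed and the last leaf is paired with itself.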
Then, when a block is assembled and the coinbase transaction is created, a Coinbase Commitment is generated. Below is an excerpt of CreateNewBlock from src/miner.cpp:
CMutableTransaction coinbaseTx;
coinbaseTx.vin.resize(1);
coinbaseTx.vin[0].prevout.SetNull();
coinbaseTx.vout.resize(1);
coinbaseTx.vout[0].scriptPubKey = scriptPubKeyIn;
coinbaseTx.vout[0].nValue = nFees + GetBlockSubsidy(nHeight, chainparams.GetConsensus());
coinbaseTx.vin[0].scriptSig = CScript() << nHeight << OP_0;
pblock->vtx[0] = MakeTransactionRef(std::move(coinbaseTx));
pblocktemplate->vchCoinbaseCommitment = GenerateCoinbaseCommitment(*pblock, pindexPrev, chainparams.GetConsensus());
pblocktemplate->vTxFees[0] = -nFees;
The second-to-last line above calls GenerateCoinbaseCommitment, which is defined in src/validation.cpp. Its source looks like this:
std::vector<unsigned char> GenerateCoinbaseCommitment(CBlock& block, const CBlockIndex* pindexPrev,
                                                      const Consensus::Params& consensusParams)
{
    std::vector<unsigned char> commitment;
    int commitpos = GetWitnessCommitmentIndex(block); // look for the commitment among the coinbase outputs; -1 if not found
    std::vector<unsigned char> ret(32, 0x00);
    if (consensusParams.vDeployments[Consensus::DEPLOYMENT_SEGWIT].nTimeout != 0) {
        if (commitpos == -1) { // not found: create the commitment, starting with the witness Merkle root
            uint256 witnessroot = BlockWitnessMerkleRoot(block, nullptr);
            CHash256().Write(witnessroot.begin(), 32).Write(ret.data(), 32).Finalize(witnessroot.begin());
            CTxOut out; // build a coinbase output
            out.nValue = 0; // its amount is 0
            out.scriptPubKey.resize(38); // scriptPubKey is 38 bytes long; the first 6 are fixed: 0x6a24aa21a9ed
            out.scriptPubKey[0] = OP_RETURN; // 0x6a
            out.scriptPubKey[1] = 0x24; // 36, the number of bytes that follow
            out.scriptPubKey[2] = 0xaa; // 0xaa21a9ed, the fixed commitment header
            out.scriptPubKey[3] = 0x21;
            out.scriptPubKey[4] = 0xa9;
            out.scriptPubKey[5] = 0xed;
            memcpy(&out.scriptPubKey[6], witnessroot.begin(), 32); // copy in the hash of the witness Merkle root and the 32-byte reserved value
            commitment = std::vector<unsigned char>(out.scriptPubKey.begin(), out.scriptPubKey.end());
            CMutableTransaction tx(*block.vtx[0]);
            tx.vout.push_back(out); // add this output to the coinbase transaction
            block.vtx[0] = MakeTransactionRef(std::move(tx)); // write it back into the block
        }
    }
    UpdateUncommittedBlockStructures(block, pindexPrev, consensusParams); // update the block's other structures
    return commitment;
}
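The 38-byte layout is easy to reproduce in isolation. The sketch below takes the 32-byte commitment hash as given (in the function above it is the double SHA256 of the witness Merkle root followed by the 32-byte reserved value) and only assembles and recognizes the script. BuildCommitmentScript, IsCommitmentScript and FindCommitmentIndex are illustrative names. That the last matching output wins is my reading of GetWitnessCommitmentIndex and should be treated as an assumption, since that function is not shown in this article.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
using Hash32 = std::array<std::uint8_t, 32>;

// OP_RETURN, push of 36 bytes, then the fixed header 0xaa21a9ed.
static const std::array<std::uint8_t, 6> kCommitmentHeader = {
    {0x6a, 0x24, 0xaa, 0x21, 0xa9, 0xed}};

// Assemble the 38-byte scriptPubKey from an already computed commitment hash.
Bytes BuildCommitmentScript(const Hash32& commitment_hash) {
    Bytes script(kCommitmentHeader.begin(), kCommitmentHeader.end());
    script.insert(script.end(), commitment_hash.begin(), commitment_hash.end());
    assert(script.size() == 38);
    return script;
}

// A script counts as a commitment if it is at least 38 bytes long and
// starts with the 6 fixed bytes.
bool IsCommitmentScript(const Bytes& script) {
    return script.size() >= 38 &&
           std::equal(kCommitmentHeader.begin(), kCommitmentHeader.end(), script.begin());
}

// Index of the commitment among the coinbase output scripts, or -1.
// Assumption: if several outputs match, the last one is used.
int FindCommitmentIndex(const std::vector<Bytes>& output_scripts) {
    int pos = -1;
    for (std::size_t i = 0; i < output_scripts.size(); ++i) {
        if (IsCommitmentScript(output_scripts[i])) pos = static_cast<int>(i);
    }
    return pos;
}
```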
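After the output is added to the coinbase transaction, its input changes accordingly. That is the job of UpdateUncommittedBlockStructures, which is called at the end of the function above: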
void UpdateUncommittedBlockStructures(CBlock& block, const CBlockIndex* pindexPrev,
                                      const Consensus::Params& consensusParams)
{
    int commitpos = GetWitnessCommitmentIndex(block);
    static const std::vector<unsigned char> nonce(32, 0x00);
    if (commitpos != -1 && IsWitnessEnabled(pindexPrev, consensusParams) && !block.vtx[0]->HasWitness()) {
        CMutableTransaction tx(*block.vtx[0]); // modify the coinbase transaction
        tx.vin[0].scriptWitness.stack.resize(1); // add one item to the input's empty witness stack
        tx.vin[0].scriptWitness.stack[0] = nonce;
        block.vtx[0] = MakeTransactionRef(std::move(tx)); // write it back into the block
    }
}
Having gone to all that trouble to write the commitment, it must of course be checked, or it would be pointless. That code lives in ContextualCheckBlock; here is part of it:
bool fHaveWitness = false;
if (VersionBitsState(pindexPrev, consensusParams, Consensus::DEPLOYMENT_SEGWIT, versionbitscache)
    == ThresholdState::ACTIVE) {
    int commitpos = GetWitnessCommitmentIndex(block);
    if (commitpos != -1) {
        bool malleated = false;
        uint256 hashWitness = BlockWitnessMerkleRoot(block, &malleated);
        if (block.vtx[0]->vin[0].scriptWitness.stack.size() != 1
            || block.vtx[0]->vin[0].scriptWitness.stack[0].size() != 32) {
            return state.DoS(100, false, REJECT_INVALID, "bad-witness-nonce-size", true,
                             strprintf("%s : invalid witness reserved value size", __func__));
        }
        CHash256().Write(hashWitness.begin(), 32)
                  .Write(&block.vtx[0]->vin[0].scriptWitness.stack[0][0], 32)
                  .Finalize(hashWitness.begin());
        if (memcmp(hashWitness.begin(), &block.vtx[0]->vout[commitpos].scriptPubKey[6], 32)) {
            return state.DoS(100, false, REJECT_INVALID, "bad-witness-merkle-match", true,
                             strprintf("%s : witness merkle commitment mismatch", __func__));
        }
        fHaveWitness = true;
    }
}
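4. Transaction signature hash algorithm
SegWit also changes the hash algorithm used for transaction signatures. The original algorithm had two shortcomings. First, the amount of hashing grows quadratically as the number of sigops in a transaction increases. Second, the hash does not cover the input amount, which can be a problem for cold wallets, since an offline signer cannot confirm from the signed data alone how much is being spent.
For a detailed explanation of the new algorithm, see the original text on GitHub. Below is part of SignatureHash from src/script/interpreter.cpp: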
template <class T>
uint256 SignatureHash(const CScript& scriptCode, const T& txTo, unsigned int nIn, int nHashType,
                      const CAmount& amount, SigVersion sigversion, const PrecomputedTransactionData* cache)
{
    //...
    if (sigversion == SigVersion::WITNESS_V0) {
        uint256 hashPrevouts, hashSequence, hashOutputs;
        const bool cacheready = cache && cache->ready;
        if (!(nHashType & SIGHASH_ANYONECANPAY)) {
            hashPrevouts = cacheready ? cache->hashPrevouts : GetPrevoutHash(txTo);
        }
        if (!(nHashType & SIGHASH_ANYONECANPAY) && (nHashType & 0x1f) != SIGHASH_SINGLE
            && (nHashType & 0x1f) != SIGHASH_NONE) {
            hashSequence = cacheready ? cache->hashSequence : GetSequenceHash(txTo);
        }
        if ((nHashType & 0x1f) != SIGHASH_SINGLE && (nHashType & 0x1f) != SIGHASH_NONE) {
            hashOutputs = cacheready ? cache->hashOutputs : GetOutputsHash(txTo);
        } else if ((nHashType & 0x1f) == SIGHASH_SINGLE && nIn < txTo.vout.size()) {
            CHashWriter ss(SER_GETHASH, 0);
            ss << txTo.vout[nIn];
            hashOutputs = ss.GetHash();
        }
        // the inputs are ready; the actual hashing follows, and its cost is clearly lower
        CHashWriter ss(SER_GETHASH, 0);
        ss << txTo.nVersion; // version
        ss << hashPrevouts;
        ss << hashSequence;
        ss << txTo.vin[nIn].prevout;
        ss << scriptCode;
        ss << amount; // the amount is included here
        ss << txTo.vin[nIn].nSequence;
        ss << hashOutputs;
        ss << txTo.nLockTime;
        ss << nHashType;
        return ss.GetHash();
    }
    //...
}
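The three conditionals at the top of that branch decide which of hashPrevouts, hashSequence and hashOutputs are real hashes and which stay zero. The sketch below restates just that decision logic without any hashing, so it can be run and inspected. The constant values are Bitcoin's usual sighash type values; the names Bip143Fields, OutputsCommitment and DescribeSighash are invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Sighash type values; the low 5 bits select the base type.
enum : unsigned {
    kSighashAll = 1,
    kSighashNone = 2,
    kSighashSingle = 3,
    kSighashAnyoneCanPay = 0x80
};

enum class OutputsCommitment { AllOutputs, MatchingOutputOnly, Zero };

struct Bip143Fields {
    bool commits_all_prevouts;   // hashPrevouts is a real hash, not zero
    bool commits_all_sequences;  // hashSequence is a real hash, not zero
    OutputsCommitment outputs;   // what hashOutputs covers
};

// Mirror of the three conditionals in SignatureHash for WITNESS_V0.
Bip143Fields DescribeSighash(unsigned hash_type, std::size_t input_index,
                             std::size_t output_count) {
    const bool anyone_can_pay = (hash_type & kSighashAnyoneCanPay) != 0;
    const unsigned base = hash_type & 0x1f;
    const bool single = (base == kSighashSingle);
    const bool none = (base == kSighashNone);

    Bip143Fields f;
    f.commits_all_prevouts = !anyone_can_pay;
    f.commits_all_sequences = !anyone_can_pay && !single && !none;
    if (!single && !none) {
        f.outputs = OutputsCommitment::AllOutputs;
    } else if (single && input_index < output_count) {
        f.outputs = OutputsCommitment::MatchingOutputOnly;
    } else {
        f.outputs = OutputsCommitment::Zero;
    }
    return f;
}
```

Since hashPrevouts, hashSequence and hashOutputs for the common case can be computed once per transaction and reused for every input, the per-input work stays roughly constant, which is where the saving over the old algorithm comes from.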
5. Script verification
At the end of transaction creation, the signature script is verified. For the SegWit-related part, first look at part of VerifyScript in src/script/interpreter.cpp:
bool VerifyScript(const CScript& scriptSig, const CScript& scriptPubKey, const CScriptWitness* witness,
                  unsigned int flags, const BaseSignatureChecker& checker, ScriptError* serror)
{
    //...
    int witnessversion;
    std::vector<unsigned char> witnessprogram;
    if (flags & SCRIPT_VERIFY_WITNESS) {
        if (scriptPubKey.IsWitnessProgram(witnessversion, witnessprogram)) {
            hadWitness = true;
            if (scriptSig.size() != 0) {
                return set_error(serror, SCRIPT_ERR_WITNESS_MALLEATED);
            }
            if (!VerifyWitnessProgram(*witness, witnessversion, witnessprogram, flags, checker, serror)) {
                return false;
            }
            stack.resize(1);
        }
    }
    //...
}
As you can see, it calls VerifyWitnessProgram to do the verification. Its source looks like this:
static bool VerifyWitnessProgram(const CScriptWitness& witness, int witversion,
                                 const std::vector<unsigned char>& program, unsigned int flags,
                                 const BaseSignatureChecker& checker, ScriptError* serror)
{
    std::vector<std::vector<unsigned char> > stack;
    CScript scriptPubKey;
    if (witversion == 0) {
        if (program.size() == WITNESS_V0_SCRIPTHASH_SIZE) {
            // 32-byte program: P2WSH. The witness is stack + witnessScript, and the single SHA256 of witnessScript is the 32-byte program
            if (witness.stack.size() == 0) {
                return set_error(serror, SCRIPT_ERR_WITNESS_PROGRAM_WITNESS_EMPTY);
            }
            scriptPubKey = CScript(witness.stack.back().begin(), witness.stack.back().end());
            stack = std::vector<std::vector<unsigned char> >(witness.stack.begin(), witness.stack.end() - 1);
            uint256 hashScriptPubKey;
            CSHA256().Write(&scriptPubKey[0], scriptPubKey.size()).Finalize(hashScriptPubKey.begin());
            if (memcmp(hashScriptPubKey.begin(), program.data(), 32)) {
                return set_error(serror, SCRIPT_ERR_WITNESS_PROGRAM_MISMATCH);
            }
        } else if (program.size() == WITNESS_V0_KEYHASH_SIZE) {
            // 20-byte program: P2WPKH. The witness is sig + pubkey, and the HASH160 of pubkey is the 20-byte program
            if (witness.stack.size() != 2) {
                return set_error(serror, SCRIPT_ERR_WITNESS_PROGRAM_MISMATCH);
            }
            scriptPubKey << OP_DUP << OP_HASH160 << program << OP_EQUALVERIFY << OP_CHECKSIG;
            stack = witness.stack;
        } else {
            return set_error(serror, SCRIPT_ERR_WITNESS_PROGRAM_WRONG_LENGTH);
        }
    } else if (flags & SCRIPT_VERIFY_DISCOURAGE_UPGRADABLE_WITNESS_PROGRAM) {
        return set_error(serror, SCRIPT_ERR_DISCOURAGE_UPGRADABLE_WITNESS_PROGRAM);
    } else { // higher witness versions are left for future soft forks
        return set_success(serror);
    }
    // no stack element may exceed the maximum element size
    for (unsigned int i = 0; i < stack.size(); i++) {
        if (stack.at(i).size() > MAX_SCRIPT_ELEMENT_SIZE)
            return set_error(serror, SCRIPT_ERR_PUSH_SIZE);
    }
    // run the script and check that the result is TRUE
    if (!EvalScript(stack, scriptPubKey, flags, checker, SigVersion::WITNESS_V0, serror)) {
        return false;
    }
    // exactly one item, TRUE, must remain on the stack
    if (stack.size() != 1)
        return set_error(serror, SCRIPT_ERR_CLEANSTACK);
    if (!CastToBool(stack.back()))
        return set_error(serror, SCRIPT_ERR_EVAL_FALSE);
    return true;
}
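Both functions depend on first recognizing a witness program in the scriptPubKey. As a rough standalone sketch of that step, assuming the rules from BIP141 (a script of 4 to 42 bytes consisting of a version opcode, OP_0 or OP_1 to OP_16, followed by one direct push of the remaining bytes), the classification could look like this. WitnessKind and ClassifyScriptPubKey are names made up for this example and are not the signature of CScript::IsWitnessProgram.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

enum class WitnessKind { NotWitness, P2WPKH, P2WSH, V0WrongLength, FutureVersion };

// Recognize a witness program and split it into version and program bytes.
WitnessKind ClassifyScriptPubKey(const Bytes& script, int& version, Bytes& program) {
    if (script.size() < 4 || script.size() > 42) return WitnessKind::NotWitness;
    const std::uint8_t op = script[0];
    const bool is_op_0 = (op == 0x00);
    const bool is_op_1_to_16 = (op >= 0x51 && op <= 0x60);
    if (!is_op_0 && !is_op_1_to_16) return WitnessKind::NotWitness;
    // The second byte must be a direct push covering the rest of the script.
    if (static_cast<std::size_t>(script[1]) + 2 != script.size()) {
        return WitnessKind::NotWitness;
    }
    version = is_op_0 ? 0 : op - 0x50;
    program.assign(script.begin() + 2, script.end());
    if (version != 0) return WitnessKind::FutureVersion;
    if (program.size() == 20) return WitnessKind::P2WPKH;  // HASH160 of a pubkey
    if (program.size() == 32) return WitnessKind::P2WSH;   // SHA256 of a witness script
    return WitnessKind::V0WrongLength;
}
```

The four non-trivial results line up with the branches of VerifyWitnessProgram: the two version 0 cases, the wrong-length error, and the "future soft fork" path for higher versions.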
Thanks for reading. If anything here is wrong, corrections from more experienced readers are welcome.