Translation | Dandelion: A Proposal to Strengthen Bitcoin Privacy
Original link:
GitHub: Dandelion: Privacy-Preserving Transaction Propagation
Related news:
Bitcoin Developers Publish BIP For 'Dandelion' Privacy Project
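This is my own attempt at a translation, so mistakes and omissions are inevitable; corrections are welcome.
[Images: author and developer information; main table of contents]
Abstract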
Dandelion is a new transaction broadcasting mechanism that reduces the risk of eavesdroppers linking transactions to the source IP.
Dandelion transaction propagation proceeds in two phases: first the “stem” phase, and then the “fluff” phase. During the stem phase, each node relays the transaction to a *single* peer. After a random number of hops along the stem, the transaction enters the fluff phase, which behaves just like ordinary flooding/diffusion. Even when an attacker can identify the location of the fluff phase, it is much more difficult to identify the source of the stem.
(Translator's note: in the Chinese version I rendered “peer” as 对等节点, literally “peer node”; I do not know whether a better translation exists.)
Illustration:
[Figure: illustration of the propagation method]
Motivation
Bitcoin transaction propagation does not hide the source of a transaction very well, especially against a “supernode” eavesdropper that forms a large number of outgoing connections to reachable nodes on the network [1,2,3]. From the point of view of a supernode, the peer that relays the transaction *first* is the most likely to be the true source, or at least a close neighbor of the source. Various application-level behaviors of Bitcoin Core enable eavesdroppers to infer the peer-to-peer connection topology, which can further help identify the source [2,3].
The Dandelion protocol obscures the source by propagating each transaction along the stem (away from the source), before initiating the diffusion process at a safe distance. Dandelion was introduced in [4] and presented at ACM Sigmetrics 2017. This BIP discusses a more practical and robust variant of Dandelion called Dandelion++ (research article forthcoming [5]). For the sake of this document, we ignore the ++ distinction and just call the proposal Dandelion overall.
Dandelion exhibits two key properties: first, its anonymity properties are near-optimal among propagation mechanisms that do not obfuscate transactions (e.g., using encryption). The intuition is that under Dandelion, transactions from different nodes generate statistically similar propagation patterns; hence, adversarial nodes will be unable to use those propagation patterns to reliably infer the source IP. The relevant analysis and proofs are included in [4]. Second, Dandelion does not significantly increase transaction propagation latency. Dandelion's latency overhead (compared to the status quo) is equal to the time spent in stem phase. Although [4] does not measure this latency explicitly, our simulations suggest that the average overhead will be on the order of seconds, and we are currently running experiments to empirically measure this latency distribution.
Specification
The Dandelion protocol is based on three mechanisms:
1. *Stem/fluff propagation.* Dandelion transactions begin in “stem mode,” during which each node relays the transaction to a single randomly-chosen peer. With some fixed probability, the transaction transitions to “fluff” mode, after which it is relayed according to ordinary Bitcoin flooding/diffusion.
2. *Mempool Embargo.* During the stem phase, each stem node (Alice) stores the transaction in an “embargoed” state. During the embargo period, Alice behaves as though she has not seen the transaction. That is, Alice will not include the embargoed transaction when responding to MEMPOOL requests, and will not respond to GETDATA requests from another node (Bob) unless Alice previously sent an INV to Bob. The embargo period ends as soon as Alice receives an INV advertising the transaction as being in fluff mode.
Note on MEMPOOL (Mastering Bitcoin, Chapter 6):
Almost every node on the bitcoin network maintains a temporary list of unconfirmed transactions called the memory pool, mempool, or transaction pool. Nodes use this pool to keep track of transactions that are known to the network but are not yet included in the blockchain.
Note on GETDATA and INV (Mastering Bitcoin, Chapter 6):
The peer that has the longer blockchain has more blocks than the other node and can identify which blocks the other node needs in order to "catch up." It will identify the first 500 blocks to share and transmit their hashes using an inv (inventory) message. The node missing these blocks will then retrieve them, by issuing a series of getdata messages requesting the full block data and identifying the requested blocks using the hashes from the inv message.
This process of comparing the local blockchain with the peers and retrieving any missing blocks happens any time a node goes offline for any period of time. Whether a node has been offline for a few minutes and is missing a few blocks, or a month and is missing a few thousand blocks, it starts by sending getblocks, gets an inv response, and starts downloading the missing blocks. Figure 6-6 shows the inventory and block propagation protocol.
3. *Robust propagation.* Privacy enhancements should not put transactions at risk of not propagating. To protect against failures (either malicious or accidental) where a stem node fails to relay a transaction (thereby precluding the fluff phase), each node starts a random timer upon receiving a transaction in stem phase. If the node does not receive any INV messages for that transaction before the timer expires, then the embargo ends and the node diffuses the transaction normally.
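The embargo and timer rules in mechanisms 2 and 3 can be sketched as a small bookkeeping class. This is only a toy model, not the reference implementation: the class and method names are hypothetical, and the uniform 30 to 60 second timer is an assumption based on the delay range mentioned later in the Considerations section.

```python
import random


class StemNode:
    """Toy embargo bookkeeping for one node (all names are hypothetical)."""

    def __init__(self, min_timeout=30.0, max_timeout=60.0, rng=None):
        self.rng = rng if rng is not None else random.Random()
        self.min_timeout = min_timeout
        self.max_timeout = max_timeout
        self.embargoed = {}    # txid -> time at which the embargo timer expires
        self.mempool = set()   # txids visible to every peer
        self.inv_sent = set()  # (peer, txid) pairs this node announced itself

    def receive_stem_tx(self, txid, now):
        """Store a stem transaction as embargoed and start its random timer."""
        if txid not in self.mempool and txid not in self.embargoed:
            # Assumption: the timer is drawn uniformly from the configured range.
            delay = self.rng.uniform(self.min_timeout, self.max_timeout)
            self.embargoed[txid] = now + delay

    def record_inv_sent(self, peer, txid):
        """Remember that this node sent an INV for txid to peer."""
        self.inv_sent.add((peer, txid))

    def receive_fluff_inv(self, txid):
        """A fluff-mode INV ends the embargo immediately."""
        if self.embargoed.pop(txid, None) is not None:
            self.mempool.add(txid)

    def tick(self, now):
        """Expire timers and return the txids that must now be diffused."""
        expired = sorted(t for t, deadline in self.embargoed.items() if deadline <= now)
        for txid in expired:
            del self.embargoed[txid]
            self.mempool.add(txid)
        return expired

    def mempool_reply(self):
        """Answer a MEMPOOL request without revealing embargoed transactions."""
        return sorted(self.mempool)

    def answers_getdata(self, peer, txid):
        """Serve GETDATA for embargoed txids only to peers we sent an INV to."""
        if txid in self.mempool:
            return True
        return txid in self.embargoed and (peer, txid) in self.inv_sent
```

A caller would invoke `tick` periodically; any txid it returns has timed out without a fluff INV and should be relayed by ordinary diffusion.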
Dandelion stem mode transactions are indicated by a new type of inventory item and a new transaction type.
New Dandelion transaction inventory type:
MSG_DANDELION_TX = 5
MSG_WITNESS_DANDELION_TX = MSG_DANDELION_TX | MSG_WITNESS_FLAG
Dandelion transaction message type:
NetMsgType::DANDELIONTX;
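To make the inventory constants concrete, here is a small sketch of the arithmetic. The value `1 << 30` for `MSG_WITNESS_FLAG` is not stated in this document; it is taken from Bitcoin Core's witness inventory types and should be treated as an assumption here. The helper `is_dandelion_inv` is hypothetical.

```python
MSG_DANDELION_TX = 5
# Assumption: witness flag value as defined for witness inventory types in Bitcoin Core.
MSG_WITNESS_FLAG = 1 << 30
MSG_WITNESS_DANDELION_TX = MSG_DANDELION_TX | MSG_WITNESS_FLAG


def is_dandelion_inv(inv_type):
    """Return True for both the plain and the witness Dandelion inventory types."""
    return (inv_type & ~MSG_WITNESS_FLAG) == MSG_DANDELION_TX
```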
After receiving a Dandelion transaction, the node flips a biased coin to determine whether to propagate it in “stem mode”, or to switch to “fluff mode.” The bias is controlled by a parameter exposed to the command line, initially a 90% chance of staying in stem mode (meaning the expected stem length would be 10 hops).
-dandelion=<n>
Configure Dandelion (privacy-preserving transaction propagation) stem
probability percent (default: 90, max 100, 0 to disable)
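The "10 hops" figure follows from the coin flip being a geometric process. The sketch below is illustrative only: the function names are hypothetical, and it assumes one counting convention, namely that every node that flips the coin counts as one hop, including the node that switches to fluff mode. Under that convention the mean is 1 / (1 - p), which is 10 for p = 0.9.

```python
import random


def expected_stem_length(stem_percent):
    """Closed-form mean of the geometric stem length for a given percent."""
    if not 0 <= stem_percent < 100:
        raise ValueError("stem_percent must be in [0, 100) for a finite stem")
    return 1.0 / (1.0 - stem_percent / 100.0)


def sample_stem_length(stem_percent, rng):
    """Simulate one stem: count nodes until one of them flips to fluff mode."""
    if not 0 <= stem_percent < 100:
        raise ValueError("stem_percent must be in [0, 100) for a finite stem")
    hops = 1
    while rng.random() < stem_percent / 100.0:
        hops += 1
    return hops
```

Averaging many calls to `sample_stem_length(90, rng)` should land close to `expected_stem_length(90)`.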
We have evaluated a spectrum of possible schemes for selecting which peers receive relayed stem transactions. We call the set of possible relays the “stem set”. On one extreme, a node could relay every stem transaction to the same peer (i.e., the stem set has size 1). On the other extreme, a node could randomly choose a new relay from its existing peers for every transaction. The tradeoffs between these options affect how much “precision” an attacker gets, and depend on how easily the attacker can infer the connection topology of the Dandelion-relay overlay. Our compromise is to maintain a stable, randomly-chosen stem set of *two* peers, and to select one randomly each time we relay a transaction. The stem set is chosen from among the outgoing (or whitelisted) connections, which prevents an adversary from easily inserting itself into the stem graph. Each node re-randomizes its stem set every 10 minutes.
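A rough sketch of the stem-set policy described above: two relays drawn from the outbound peers, one picked at random per transaction, and a reshuffle every 10 minutes. The class and its names are hypothetical, and reshuffling early when a chosen relay has disconnected is my own assumption, not something the BIP specifies.

```python
import random

STEM_SET_SIZE = 2           # from the text: a stem set of two peers
RESHUFFLE_INTERVAL = 600.0  # from the text: re-randomize every 10 minutes


class StemSet:
    """Toy stem-set selection for one node (hypothetical names)."""

    def __init__(self, rng=None):
        self.rng = rng if rng is not None else random.Random()
        self.relays = []
        self.next_shuffle = 0.0

    def choose_relay(self, outbound_peers, now):
        """Return the peer for the next stem relay, or None without peers."""
        # Assumption: a relay that is no longer connected forces a reshuffle.
        stale = any(peer not in outbound_peers for peer in self.relays)
        if now >= self.next_shuffle or stale or not self.relays:
            size = min(STEM_SET_SIZE, len(outbound_peers))
            self.relays = self.rng.sample(list(outbound_peers), size)
            self.next_shuffle = now + RESHUFFLE_INTERVAL
        if not self.relays:
            return None
        return self.rng.choice(self.relays)
```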
Service bits: Support for Dandelion is indicated by a temporary experimental service bit.
NODE_DANDELION = (1 << 24)
We imagine that in the future, the service will be discovered in-band by sending “MSG_DANDELION_TX” after the hand-shake.
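Checking the service bit is a single mask test. The helper name below is hypothetical; only the constant comes from the text.

```python
NODE_DANDELION = 1 << 24


def supports_dandelion(services):
    """Return True if a peer's advertised service bits include NODE_DANDELION."""
    return bool(services & NODE_DANDELION)
```

Note that, per the backward compatibility section, a node is not supposed to use this bit when choosing which peers to connect or relay to.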
Considerations
The main implementation challenges are: (1) identifying a satisfactory tradeoff between Dandelion’s privacy guarantees and its latency/overhead, and (2) ensuring that privacy cannot be degraded through abuse of existing mechanisms. In particular, the implementation should prevent an attacker from identifying stem nodes without interfering too much with the various existing mechanisms for efficient and DoS-resistant propagation.
The privacy afforded by Dandelion depends on three parameters: the stem probability, the number of outbound peers that can serve as Dandelion relays (i.e., the “stem set” size), and the time between re-randomizations of the stem set. These parameters define a tradeoff between privacy and broadcast latency/processing overhead. Lowering the stem probability harms privacy but helps reduce latency by shortening the mean stem length; based on theory, simulations, and experiments, we have chosen a default of 90%. Lowering the stem set size (from a default size of 2 to 1) makes Dandelion’s privacy guarantees fragile to adversaries that can learn the stem set of each node; here we choose robustness over optimal privacy. Reducing the time between each node’s re-randomization of its stem set reduces the chance of an adversary learning the stem sets for each node, at the expense of increased overhead. These tradeoffs are outlined more precisely in a forthcoming article [5].
When receiving a Dandelion inventory item, the request skips the usual “mapAlreadyAskedFor” priority queue, which adds a 2-minute timer before asking the next node. Otherwise, an eavesdropper would be able to probe whether a node has received stem transaction tx by sending INV(tx) and checking whether or not GETDATA(tx) is received in response.
When receiving or asking for a Dandelion stem transaction, we avoid placing that transaction in filterInventoryKnown. This way, transactions can also travel back “up” the stem in the fluff phase.
Like ordinary transactions, Dandelion transactions are only relayed after being successfully accepted to mempool. This ensures that nodes will never be punished for relaying Dandelion transactions, and that existing replace-by-fee and fee-filter behavior is preserved.
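(Translator's note: I did not fully understand two points in this paragraph: why nodes would otherwise be punished for relaying Dandelion transactions, and how replace-by-fee and fee-filter behavior is preserved.)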
If an orphan transaction is received in Dandelion mode, it is added to mapOrphanTransactions, and also marked as stem-mode. If the transaction is later accepted to mempool, then it is relayed as a Dandelion transaction (either stem mode or fluff mode, depending on a coin flip).
If a node receives a child transaction that depends on one or more currently-embargoed Dandelion transactions, then the transaction is also relayed in stem mode, and the embargo timer is set to the maximum of the embargo times of its parents. This helps ensure that parent transactions enter fluff mode before child transactions.
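The rule for child transactions reduces to taking a maximum over the parents' timers. A minimal sketch, assuming embargo timers are kept as a mapping from txid to expiry time (the function and argument names are hypothetical):

```python
def child_embargo_expiry(embargoed, parent_txids):
    """Return the embargo expiry for a child, or None if no parent is embargoed.

    embargoed maps txid -> expiry time of transactions currently under embargo.
    """
    expiries = [embargoed[txid] for txid in parent_txids if txid in embargoed]
    return max(expiries) if expiries else None
```

For example, with parents expiring at 40.0 and 55.0 seconds, the child's embargo runs until 55.0, so it cannot time out into fluff mode before either parent does.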
Transaction propagation latency should be minimally affected by opting-in to this privacy feature; in particular, a transaction should never be prevented from propagating at all because of Dandelion. The random timer guarantees that the embargo mechanism is temporary, and every transaction is relayed according to the ordinary diffusion mechanism after some maximum (random) delay on the order of 30-60 seconds.
Despite the best effort of the embargo mechanism, it is difficult to ensure that an attacker cannot find some other indirect way to probe whether a node has participated in the stem phase for a transaction. For example, an attacker seeing tx1 could create an invalid transaction tx2 that purports to spend an output from tx1, and send it to victim node Alice. If Alice has placed tx1 in mempool, then Alice will reject the attacker and disconnect it; if Alice has not seen tx1, then tx2 would instead go to the orphan cache. Fortunately, this example at least requires the attacker to burn one of its outgoing connections. Here we’re favoring an imperfect solution over something more complicated to implement.
Backward compatibility
Dandelion is intended for gradual deployment and adoption, with privacy gains that increase monotonically with the fraction of adopting nodes. Theoretical analysis shows that at any adoption level, the privacy of Dandelion nodes will be better than the status quo, and non-Dandelion nodes will have privacy no worse than the status quo [5]. To achieve these results, we rely on two implementation decisions:
1. *Fluff mode by default.* Nodes without Dandelion support will continue to relay transactions with independent exponential delays. Hence, if a Dandelion node extends the stem to a non-Dandelion node, it is as if the transaction automatically enters the fluff phase. Thus, the fewer nodes that support Dandelion, the shorter the average stem length.
(Translator's note: I did not understand what “independent exponential delays” refers to here.)
2. *Avoidance of self-reporting.* Nodes do not consider the Dandelion service bit when choosing which nodes to connect to or which nodes to relay to. Otherwise, during the initial gradual adoption period, this could give preference to attackers that signal Dandelion support.
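To get a feel for how partial adoption shortens the stem, here is a toy calculation. It is not taken from the BIP or from [5]; it assumes that each relay independently supports Dandelion with probability `adoption` (plausible if relays are chosen without looking at the service bit), that a non-Dandelion relay fluffs immediately, and that every node handling the transaction up to and including the one that fluffs counts as one hop. Under those assumptions the mean length E satisfies E = 1 + q((1 - f) + fE), where q is the stem probability and f the adoption fraction.

```python
def expected_stem_length_partial(stem_prob, adoption):
    """Mean stem length under the toy partial-adoption model described above."""
    if not (0.0 <= stem_prob <= 1.0 and 0.0 <= adoption <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    if stem_prob * adoption >= 1.0:
        raise ValueError("stem never ends when stem_prob * adoption == 1")
    # Solves E = 1 + q * ((1 - f) + f * E) for E.
    return (1.0 + stem_prob * (1.0 - adoption)) / (1.0 - stem_prob * adoption)
```

With full adoption this reduces to 1 / (1 - q), and it shrinks as adoption falls, matching the qualitative claim that fewer supporting nodes means a shorter average stem.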
Implementation
A reference implementation of Dandelion is available at
https://github.com/gfanti/bitcoin/tree/dandelion
This implementation includes a modification to the wallet software that relays newly-created transactions as Dandelion transactions.
An incomplete testing harness, `p2p-dandelion.py`, is also included.
Analysis
This BIP references some theoretical analysis that is work in progress, e.g., regarding the anonymity guarantees of partially-deployed Dandelion. The following plot shows theoretical results on the expected recall (probability of linking a transaction to an IP address) as a function of the fraction of corrupt nodes.
Here we are assuming an adversarial model where a constant fraction of colluding nodes passively observe all transactions. These "spy" nodes try to infer the source of each transaction using observed timestamps, knowledge of the graph, and knowledge of which node delivered each transaction to each spy node. ‘Version-checking’ means that a node preferentially adds dandelion-compatible peers to its stem set. ‘No-version-checking’ means that each Dandelion node chooses its stem set without regards for dandelion-compatibility (as recommended in this BIP). The plot shows theoretical upper and lower bounds on the expected recall for *Dandelion nodes* that use each strategy, so the true expected recall for version-checking (resp. no-version-checking) is somewhere in the blue (resp. red) region. Non-Dandelion nodes will experience no change in their expected recall. These results suggest that no-version-checking is a better strategy, which also outperforms the simulated status quo (diffusion), even when deployment levels are low. For the simulated performance of diffusion, we used the suboptimal 'first-spy' estimator, where for each transaction, the adversary implicates the first honest node to deliver that transaction to any spy node. This gives a lower bound on the adversary's deanonymization power with respect to diffusion.
(Translator's note: in the plot, version-checking corresponds to the blue region and no-version-checking to the red region.)
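The “first-spy” estimator mentioned above is simple enough to sketch directly. The function name and the shape of the observation tuples are hypothetical; the logic is just “blame the honest node whose delivery to any spy has the earliest timestamp”.

```python
def first_spy_estimate(observations):
    """Guess the source of one transaction with the first-spy rule.

    observations is an iterable of (timestamp, honest_node, spy_node) tuples,
    one per delivery of the transaction from an honest node to a spy node.
    Returns the honest node with the earliest delivery, or None if empty.
    """
    earliest = min(observations, key=lambda obs: obs[0], default=None)
    return None if earliest is None else earliest[1]
```

Under plain diffusion the first relayer is often the source or a close neighbor, which is why this estimator, although suboptimal, already gives the adversary useful deanonymization power.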
Acknowledgements
Gregory Maxwell provided us with invaluable feedback and suggestions that guided our implementation approach.
References
1. An Analysis of Anonymity in Bitcoin Using P2P Network Traffic
http://fc14.ifca.ai/papers/fc14_submission_71.pdf
2. Deanonymisation of clients in Bitcoin P2P network
https://arxiv.org/abs/1405.7418
3. Discovering Bitcoin’s Public Topology and Influential Nodes
https://cs.umd.edu/projects/coinscope/coinscope.pdf
4. (Sigmetrics 2017) Dandelion: Redesigning the Bitcoin Network for Anonymity
https://arxiv.org/abs/1701.04439
5. Dandelion++: TBA
Copyright
This document is placed in the public domain.
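Questions after translating:
One question has stayed with me: no matter how the source is concealed during propagation, as long as there is a public address, can't the origin of a transaction still be determined? Anyone who knows a public address can still trace every transaction associated with it.
So what exactly does Dandelion improve?
I hope someone knowledgeable can answer.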