running averages

Many aspects of bittorrent require maintaining an estimate of some kind of sample. uTP keeps a running estimate of round-trip times for each connection. When streaming torrents, it is useful to keep an estimate of download time per piece (to know what a reasonable timeout is). This post takes a closer look at how to do this and what can go wrong. The traditional algorithm for doing this is described in the TCP s

memory cache optimizations

When optimizing memory access, and memory cache misses in particular, there are surprisingly few tools to help you. valgrind’s cachegrind tool is the closest one I’ve found. It gives you a lot of information on cache misses, but not necessarily in the form you need it. About a week ago I started looking into lowering the memory cache pressure by just making my data structures smaller and have less waste

swarm connectivity

In bittorrent it is important to keep the swarm as evenly connected as possible. Clustering of peers may create bottlenecks for piece distribution and create a skewed market for trading pieces. Keep in mind that local piece availability is used as an approximation for global piece availability in the rarest-first piece picking algorithm. This post is relevant for most peer-to-peer systems, where having well connect

DHT security

One of the vulnerabilities of typical DHTs, in particular the bittorrent DHT, is the fact that participants can choose their own node ID. This enables an attacker to deliberately place themselves at a location in the DHT where they know they will be responsible for storing some specific data. At that point, there are a few naughty things that can be done, for example: lie and say the data doesn’t exist wh

principles of high performance programs

This article is an attempt to sum up a small number of generic rules that appear to be useful rules of thumb when creating high-performing programs. It is structured by first establishing some fundamental causes of performance hits followed by their extensions.

memory latency

A significant source of performance degradation on modern computers is the latency of SDRAM. While the CPU is waiting for a read from memory

asynchronous disk I/O

Since 2010, I’ve been working, on and off, on a branch of libtorrent which uses asynchronous disk I/O, instead of the synchronous disk calls in the disk thread in 0.16.x versions. The aio branch has several performance improvements apart from allowing multiple disk operations outstanding at any given time. For instance: 1. the disk cache allows multiple threads to access it (cache hits are served immediate

windows’ disk cache

A long-standing problem with bittorrent clients on windows is that if you’re seeding or downloading large files, windows may decide to use essentially all your physical RAM for disk cache. The disk cache grows to the point where running processes start having their working set swapped out, significantly slowing down the system as a whole. Both uTorrent and libtorrent-based clients have this proble

bittorrent over SSL

Running bittorrent over SSL could make sense for several applications. Anything you want distributed to a closed group, but large enough to warrant bittorrent would do well being distributed over bittorrent/SSL. Currently closed group distributions either don’t use any peer-to-peer distribution at all, or they use poor-man’s privacy/security. I’m referring to the “private” flag of torr

seeding a million torrents

There are two main architectures of peer-to-peer networks. There’s the peer-centric (limewire style) and content-centric (bittorrent style). In a peer-centric network each participant announces its existence to the network, and other peers looking for content go around asking peers if they have the content. This makes the peer-centric networks scale well with pieces of content. There’s (essentially) no

socket receive buffers

In an attempt to save memory copying, libtorrent attempts to receive payload bytes directly into page aligned, pool allocated disk buffers. These buffers can then be used to DMA directly to disk (either with blocking O_DIRECT files or via AIO operations, if run on a clever kernel). To do this for the bittorrent protocol, the network loop needs to read 5 bytes (4 bytes length-prefix and 1 byte message code), if the
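
The 5-byte read described above can be illustrated with a small sketch. This is not libtorrent's code: the names `parse_header` and `payload_size` are made up for the example, and it assumes the standard bittorrent wire framing, where the 4-byte length prefix is big-endian and counts the message code byte plus the payload.

```python
import struct

HEADER_SIZE = 5  # 4-byte length prefix + 1-byte message code


def parse_header(header: bytes) -> tuple[int, int]:
    """Split the first 5 bytes of a message into (length, code).

    Hypothetical helper for illustration only. It assumes the length
    prefix is big-endian and includes the 1-byte message code.
    """
    if len(header) != HEADER_SIZE:
        raise ValueError("expected exactly 5 header bytes")
    # ">IB" = big-endian unsigned 32-bit length, then one unsigned byte
    length, code = struct.unpack(">IB", header)
    if length == 0:
        # A zero length prefix is a keep-alive, which carries no code
        # byte; the fifth byte would belong to the next message.
        raise ValueError("keep-alive: no message code follows")
    return length, code


def payload_size(length: int) -> int:
    """Bytes remaining after the message code has been consumed."""
    return length - 1


# Example: the header of a message with code 7 and a 16393-byte length
header = struct.pack(">IB", 16393, 7)
length, code = parse_header(header)
remaining = payload_size(length)  # bytes still to be read from the socket
```

Knowing the message code and the remaining size before reading the rest is what lets the caller decide which buffer the following bytes should land in.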