Category: optimization

DHT bootstrap node

With the release of libtorrent-1.1.1, libtorrent finally got its very own default DHT bootstrap node, dht.libtorrent.org:25401. This post gives some background on the work that went into setting it up.

slow start

This post is a result of looking into a slow-start performance issue in uTP. Slow start is a mechanism in TCP employed to discover the capacity of a link, before transitioning into the steady-state regime of additive increase and multiplicative decrease. Slow start is employed on new connections and after time-outs (where the congestion window . . .

libtorrent alert queue

The main mechanism libtorrent uses to report events and errors to the client is alerts. Alerts are messages represented as C++ objects, carrying additional information that depends on the type of message. Clients periodically poll a session object for new alerts. In the next major release of libtorrent, detailed peer logging will be available as . . .

bdecode parsers

I have recently revisited the bdecoder in libtorrent, and ended up implementing a new bdecoder that is two orders of magnitude faster than the original (naive) parser. This is the third decoder in libtorrent’s history, and I would like to cover how its approach to parsing bencoded data has evolved.
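For readers unfamiliar with bencoding, the following is a minimal sketch of a recursive-descent decoder for the four bencoded types (integers `i42e`, strings `4:spam`, lists `l...e` and dictionaries `d...e`). It is an illustration only and is not any of libtorrent's decoders: the names `bvalue`, `parse_value` and `bdecode_sketch` are made up for this example, integer overflow is not checked, and dictionary key ordering is not validated.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical value type for this sketch (not libtorrent's node type).
struct bvalue {
    enum class kind { integer, string, list, dict };
    kind type = kind::integer;
    std::int64_t integer = 0;
    std::string str;
    std::vector<std::string> keys; // dictionary keys, parallel to items
    std::vector<bvalue> items;     // list elements or dictionary values
};

// Parses an optionally negative decimal number ending in `terminator`.
// Overflow is deliberately not handled in this sketch.
std::int64_t parse_int(std::string const& in, std::size_t& pos, char terminator) {
    bool negative = false;
    if (pos < in.size() && in[pos] == '-') { negative = true; ++pos; }
    std::int64_t value = 0;
    std::size_t digits = 0;
    while (pos < in.size() && std::isdigit(static_cast<unsigned char>(in[pos]))) {
        value = value * 10 + (in[pos] - '0');
        ++pos;
        ++digits;
    }
    if (digits == 0 || pos >= in.size() || in[pos] != terminator)
        throw std::runtime_error("malformed integer");
    ++pos; // skip the terminator
    return negative ? -value : value;
}

// Parses "<length>:<bytes>".
std::string parse_string(std::string const& in, std::size_t& pos) {
    std::int64_t const len = parse_int(in, pos, ':');
    if (len < 0 || static_cast<std::size_t>(len) > in.size() - pos)
        throw std::runtime_error("string length out of range");
    std::string s = in.substr(pos, static_cast<std::size_t>(len));
    pos += static_cast<std::size_t>(len);
    return s;
}

bvalue parse_value(std::string const& in, std::size_t& pos, int depth) {
    if (depth > 100) throw std::runtime_error("nesting too deep");
    if (pos >= in.size()) throw std::runtime_error("unexpected end of input");
    bvalue v;
    char const c = in[pos];
    if (c == 'i') {
        ++pos;
        v.type = bvalue::kind::integer;
        v.integer = parse_int(in, pos, 'e');
    } else if (c == 'l' || c == 'd') {
        ++pos;
        v.type = (c == 'l') ? bvalue::kind::list : bvalue::kind::dict;
        while (pos < in.size() && in[pos] != 'e') {
            // dictionary entries are a string key followed by any value
            if (c == 'd') v.keys.push_back(parse_string(in, pos));
            v.items.push_back(parse_value(in, pos, depth + 1));
        }
        if (pos >= in.size()) throw std::runtime_error("unterminated container");
        ++pos; // skip the closing 'e'
    } else if (std::isdigit(static_cast<unsigned char>(c))) {
        v.type = bvalue::kind::string;
        v.str = parse_string(in, pos);
    } else {
        throw std::runtime_error("unexpected character");
    }
    return v;
}

// Decodes one complete bencoded value; trailing bytes are an error.
bvalue bdecode_sketch(std::string const& in) {
    std::size_t pos = 0;
    bvalue v = parse_value(in, pos, 0);
    if (pos != in.size()) throw std::runtime_error("trailing data");
    return v;
}
```

Calling `bdecode_sketch("d3:cow3:moo4:spaml1:ai-3eee")` yields a dictionary with the keys `cow` and `spam`. A tree-building parser like this one allocates for every node, which is one plausible reason a naive decoder can be slow.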

a bittorrent filesystem

One of the main bottlenecks when downloading and seeding content over bittorrent is accessing the disk. This post explores the option to bypass traditional filesystems and use a block device as storage for torrents, in order to improve download performance. bittorrent protocol: BitTorrent downloads conceptually divide up content into pieces, which are downloaded in rarest-first order. . . .

VirtualAlloc pitfall

When allocating blocks in the disk cache, libtorrent uses valloc() to allocate page-aligned 16kiB blocks. On Windows, the natural counterpart to valloc() is VirtualAlloc(). Having these blocks page-aligned may provide performance improvements when reading and writing files that are aligned to the block boundaries. The 16kiB allocation size is derived from the bittorrent protocol . . .

memory cache optimizations

When optimizing memory access, and memory cache misses in particular, there are surprisingly few tools to help you. valgrind’s cachegrind tool is the closest one I’ve found. It gives you a lot of information on cache misses, but not necessarily in the form you need it. About a week ago I started looking into lowering . . .

socket receive buffers

To avoid copying memory, libtorrent attempts to receive payload bytes directly into page-aligned, pool-allocated disk buffers. These buffers can then be used to DMA directly to disk (either with blocking O_DIRECT files or via AIO operations, if run on a clever kernel). To do this for the bittorrent protocol, the . . .