Posts

We recently upgraded to Zimbra 8.6 from 8.0.7. We hit a problem that not even Zimbra support was able to figure out. Perhaps it’s that our Zimbra instance has been upgraded since version 3 (or 4), but it got into a bad state whenever we tried to update the proxy. Here’s the solution. First, the problem: $ zmproxyctl status zmnginxctl is not running $ zmproxyctl restart Stopping nginx...nginx is not running. Starting nginx...nginx: [emerg] invalid port in upstream "mail.
2016-02-29
4 min read
Just like production code, you should assume things are going to go wrong in your tests, and when they do you want good logging to help track down what happened and why. So just like production code, you should use a logging framework within your DSL, use meaningful log levels and think about what info you’d need in the logs if something went wrong (and what you don’t). There are also a few things we’ve found very useful specifically in our test logging.
2016-02-28
3 min read
LMAX Exchange developers are giving two talks at QCon London this year. Sam Adams, our Head of Software, will be discussing the awesome LMAX Continuous Delivery process in his talk “CD at LMAX: Testing into production and back again”. I will be talking about JVM warm-up strategies and how to inspect the machinations of the Hotspot compiler in “Hot code is faster code - addressing JVM warm-up”. If you’re at the conference, please come and say hello.
2016-02-10
1 min read
In this series we are attempting to solve a clock synchronisation problem to a degree of accuracy that satisfies MiFID II regulations, and we’re trying to do it without spending a lot of money. So far we have: talked about the regulations and how we might solve this with Linux software; built a “PTP Bridge” with Puppet; started recording metrics with collectd and InfluxDB, and finished recording metrics; and drawn lots of graphs with Grafana and found contention on our firewall. In this post we will look at introducing a second ASIC-based firewall to route PTP through so that PTP accuracy is not affected by our management firewall getting busy.
2016-02-09
8 min read
Monitoring of various metrics is a large part of ensuring that our systems are behaving in the way that we expect. For low-latency systems in particular, we need to be able to develop an understanding of where in the system any latency spikes are occurring. Ideally, we want to be able to detect and diagnose a problem before it’s noticed by any of our customers. In order to do this, at LMAX Exchange we have developed extensive tracing capabilities that allow us to inspect request latency at many different points in our infrastructure.
2016-01-27
12 min read
A few months ago, I wrote about how we had improved our journalling write latency at LMAX by upgrading our kernel and file-system. As a follow up to some discussion on write techniques, I then explored the difference between a seek/write and positional write strategy. The journey did not end at that point, and we carried on testing to see if we could improve things even further. Our initial upgrade work involved changing the file-system from ext3 to ext4 (reflecting the default choice of the kernel version that we upgraded to).
2015-12-09
7 min read
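The journalling excerpt above mentions seek/write and positional write strategies. As a rough illustration of the difference, and not the code from the post, here is a minimal sketch using the standard `RandomAccessFile` and `FileChannel` APIs. The class name, method names, temp-file name and sample data are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class; not taken from the original post.
public class WriteStrategies {

    // Seek/write: two calls. First move the file pointer, then write at it.
    static void seekThenWrite(RandomAccessFile file, long position, byte[] data) throws IOException {
        file.seek(position);
        file.write(data);
    }

    // Positional write: each call carries the target offset and leaves the
    // channel's own position untouched.
    static void positionalWrite(FileChannel channel, long position, byte[] data) throws IOException {
        ByteBuffer buffer = ByteBuffer.wrap(data);
        long offset = position;
        while (buffer.hasRemaining()) {
            offset += channel.write(buffer, offset);
        }
    }

    // Writes four bytes with each strategy to a temp file and returns the file contents.
    static String demo() {
        try {
            Path path = Files.createTempFile("journal", ".dat");
            try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw")) {
                seekThenWrite(file, 0, "AAAA".getBytes(StandardCharsets.US_ASCII));
                positionalWrite(file.getChannel(), 4, "BBBB".getBytes(StandardCharsets.US_ASCII));
            }
            try {
                return new String(Files.readAllBytes(path), StandardCharsets.US_ASCII);
            } finally {
                Files.delete(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints AAAABBBB
    }
}
```

Both strategies produce the same bytes on disk; the sketch only shows the shape of the two calls and says nothing about their relative latency.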
With Selenium 1, JavaScript alert and confirmation dialogs were intercepted by the Selenium JavaScript library so they never appeared on-screen and were accessed using selenium.isAlertPresent(), selenium.isConfirmationPresent(), selenium.chooseOkOnNextConfirmation() and similar methods. With Selenium 2 aka WebDriver, the dialogs do appear on screen and you access them with webDriver.switchTo().alert() which returns an Alert instance for further interactions. However, when you use WebDriverBackedSelenium to mix Selenium 1 and WebDriver APIs – for example, during migrations from one to the other – the alerts don’t appear on screen and webDriver.
2015-12-08
2 min read
Today LMAX Exchange has released ElementSpecification, a very small library we built to make working with selectors in Selenium/WebDriver tests easier. It has three main aims: make it easier to understand selectors by using a very English-like syntax; avoid common pitfalls when writing selectors that lead to either brittle or intermittent tests; and strongly discourage writing overly complicated selectors. Essentially, we use ElementSpecification anywhere that we would have written CSS or XPath selectors by hand.
2015-12-04
2 min read
We realised we had a problem when our performance testing environment locked up. Our deployment system wasn’t responding to input, and when we looked at it we had a bunch of exceptions: turns out we were running out of file handles. We’d seen similar situations before when the file handle count was unnecessarily low, but that wasn’t the case here. Fortunately, we re-deploy and run a fresh perf test roughly every 45 minutes, so that narrowed the range of possible commits down significantly.
2015-10-21
14 min read
In programming courses one of the first things you’re taught is to avoid “magic literals” – numbers or strings that are hardcoded in the middle of an algorithm. The recommended solution is to extract them into a constant. Sometimes this is great advice, for example: if (amount > 1000) { checkAdditionalAuthorization(); } would be much more readable if we extracted an ADDITIONAL_AUTHORIZATION_THRESHOLD variable – primarily so the magic 1000 gets a name. That’s not a hard and fast rule though.
2015-10-08
2 min read
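To make the magic-literal excerpt above concrete, here is a minimal sketch of the extraction it describes. The constant name and the value 1000 come from the excerpt; the class name, method name and the use of `long` for the amount are assumptions made for illustration.

```java
// Hypothetical example class; only the constant name and its value come from the excerpt.
public class AuthorizationCheck {

    // The magic 1000 now has a name that explains why it matters.
    static final long ADDITIONAL_AUTHORIZATION_THRESHOLD = 1000;

    // Returns true when the amount is strictly above the threshold.
    static boolean requiresAdditionalAuthorization(long amount) {
        return amount > ADDITIONAL_AUTHORIZATION_THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(requiresAdditionalAuthorization(1000)); // false
        System.out.println(requiresAdditionalAuthorization(1001)); // true
    }
}
```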
In 2014 LMAX Exchange traded over $1 trillion. The Sunday Times named us the fastest growing tech company in the UK and we previously won Oracle’s Duke’s Choice Award for the most innovative programming framework. In this talk we reveal how we develop software. We cover all aspects, from our design philosophy and how we practice agile to our continuous integration pipeline, and show how far you can go with automatically testing everything.
2015-10-01
1 min read
Probably the best thing I’ve discovered with my recent playing is Travis CI. I’ve known about it for quite some time, even played with it for simple projects but never with anything with any real complexity. Given this project uses Rails which wants a database for pretty much every test and that database has to be Postgres because I’m using its jsonb support, plus capybara and phantomjs for good measure, this certainly isn’t a simple project to test.
2015-10-01
2 min read
I’ve been playing around with Ruby on Rails recently, partly to play around with Rails and partly to take a run at a web app I’ve been considering (which I’ve open sourced because why not?). It turns out the last time I played with it was back in 2005 and slightly amusingly my thoughts on it haven’t changed all that much. The lack of configuration is still good, but the amount of magic involved makes it hard to understand what’s going on.
2015-09-30
2 min read
For the next instalment of this series on low-latency tuning at LMAX Exchange, I’m going to talk about reducing jitter introduced by the operating system. Our applications typically execute many threads, running within a JVM, which in turn runs atop the Linux operating system. Linux is a general-purpose multi-tasking OS, which can target phones, tablets, laptops, desktops and server-class machines. Due to this broad reach, it can sometimes be necessary to supply some guidance in order to achieve the lowest latency.
2015-09-26
10 min read
When you learn Java or any other programming language you usually start by looking at the basics of the type system and how the arithmetic operations work. You learn how numbers are represented and what types of numbers your programming language offers to you. At the same time, one of the first rules that you learn in Java is that implicit casting only occurs when you are doing a “safe” conversion, otherwise you need to explicitly tell your compiler that you understand what you are doing.
2015-09-04
3 min read
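The rule in the excerpt above about “safe” conversions can be sketched as follows. This is a generic illustration of widening versus narrowing in Java, not code from the post; the class name, method names and sample values are assumptions.

```java
// Hypothetical example class illustrating widening and narrowing conversions.
public class ImplicitConversions {

    // int -> long is a widening conversion, so no cast is needed.
    static long widen(int value) {
        return value;
    }

    // long -> int is a narrowing conversion, so the compiler insists on an
    // explicit cast. The cast keeps only the low 32 bits of the value.
    static int narrow(long value) {
        return (int) value;
    }

    public static void main(String[] args) {
        System.out.println(widen(42));           // 42
        System.out.println(narrow(4294967297L)); // 1, because 4294967297 is 2^32 + 1
    }
}
```

Removing the `(int)` cast in `narrow` makes the method fail to compile, which is the compiler asking you to confirm that you understand what you are doing.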
Given our DSL makes heavy use of aliases, we often have to provide a way to include the real name or ID as part of some string. For example, an audit record for a new account might be: Created account 127322 with username someUser123. But in our acceptance test we’d create the user with: registrationAPI.createUser("someUser"); someUser is just an alias; the DSL creates a unique username to use and the system assigns a unique account ID that the DSL keeps track of for us.
2015-08-27
2 min read
I have been working on a small tool to measure the effects of system jitter within a JVM; it is a very simple app that measures inter-thread latencies. The tool’s primary purpose is to demonstrate the use of Linux performance tools such as perf_events and ftrace in finding causes of latency. Before using this tool for a demonstration, I wanted to make sure that it was going to actually behave in the way I intended. During testing, I seemed to always end up with a max inter-thread latency of around 100us.
2015-08-05
4 min read
My colleague Chris Gollop and I are taking our talk on the road and heading to the innovative Belgium Testing Days Conference taking place in Brussels, 18th-21st May, where we will be giving an updated version of “Testing within an agile environment”. Our talk describes how we view testing at LMAX Exchange and how different concepts, from biology and evolution through to behavioural economics, have influenced the process, meaning we are not only happy to experiment with new ideas but positively encourage it.
2015-05-18
1 min read
For the last few months at LMAX Exchange, we’ve been working on building out our next generation platform. Every few years we refresh our hardware and upgrade the machines that run our systems, and this time we decided to have a look at upgrading the operating system at the same time. When our first generation exchange was built, we were happy with low-millisecond-level mean latencies. After a couple of years of operation, we upgraded to newer hardware, made some significant software changes and ended up with mean end-to-end latencies of around 250 microseconds.
2015-05-15
8 min read
My colleague Sam & I will be talking at JAX Finance next week (28th/29th April). I’ll be doing a talk with Vijay from Azul on our experiences at LMAX Exchange with deploying Zing to production. In the talk, we’ll discuss how to go about making such a change in a safe manner, some of the internals of Zing, and lessons learned along the way. Sam’s talk describes how we achieve high-throughput and low-latency at LMAX Exchange, and the architecture that we’ve developed to become the UK’s fastest growing tech firm.
2015-04-24
1 min read
I’ll be giving a talk called “Programming Bitcoin in Java” at Riga Dev Day on the 22nd of January. The talk will cover: what Bitcoin is; how it works; how to use the bitcoinj open-source library; and the future of blockchain technology. Hope to see you there!
2015-01-20
1 min read
I had high hopes for BTRFS. The brochure was very enticing. Checksums, snapshots, disk management… all good things for someone who fondly remembers the digital advanced file system. Unfortunately the brochure describes something that right now is a construction site. Lately I’ve been getting really disenchanted by it. There are several reasons for that, but the general bugginess and instability is the main reason my enthusiasm is waning. At home I have several filesystems on several hosts that all run BTRFS and filesystem problems are a common occurrence under both light and heavy loads.
2015-01-03
7 min read
Rich Bowen – We’ve Always Done It That Way: Principle 13 in the Toyota Way says that one should make decisions slowly, by consensus, thoroughly considering all options, and then implement those decisions rapidly. We believe a similar thing at the ASF. So to people who have only been around for a short time, it looks like we never change anything. But the truth is that we change things slowly, because what we’re doing works, and we need to be sure that change is warranted, and is a good idea.
2014-12-08
2 min read
Jeffrey Ventrella in The Case for Slow Programming: Venture-backed software development here in the San Francisco Bay area is on a fever-pitch fast-track. Money dynamics puts unnatural demands on a process that would be best left to the natural circadian rhythms of design evolution. Fast is not always better. In fact, slower sometimes actually means faster – when all is said and done. Jeffrey’s right in suggesting that we sometimes need to go slower to go faster; unfortunately, he makes the mistake of believing that committing and releasing in very short cycles is the cause of these problems:
2014-12-06
3 min read
Lawrence Kesteloot has an excellent post Java for Everything. About a year ago, though, I started to form a strange idea: That Java is the right language for all jobs. (I pause here while you vomit in your mouth.) This rests on the argument that what you perceive to be true does not match reality, and that’s never a popular approach, but let me explain anyway. There are two key realisations that are vital to understanding why this argument has merit and the first one is right there in the introduction: what you perceive to be true does not match reality.
2014-12-02
3 min read
You can now download the new coalescingRingBuffer-1.1.3.jar and coalescingRingBuffer-1.1.3-src.zip. Improvements: the size() method now ensures a consistent view of both the producer and consumer positions; see http://psy-lob-saw.blogspot.co.uk/2014/07/concurrent-bugs-size-matters.html for details. A special thanks to Stanimir Simeonoff, Nitsan Wakart and Martin Thompson for discovering and suggesting fixes for this issue!
2014-11-07
1 min read
So, going back and fixing up auth on our few remaining older systems (CentOS 5, not internet-facing), I came across the error below. The solution was beautifully non-obvious, so it goes here in the external memory pack. yum --enablerepo=my-repo-x86_64 list updates Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: a.centos.mirror * epel: another.centos.mirror * extras: and.a.further.centos.mirror * updates: centos-updates.co.uk my-repo-x86_64 | 2.9 kB 00:00 my-repo-x86_64/primary_db | 7.1 kB 00:00 http://myreposerver.internal.domain/my-repo/repodata/7d1016c9fcac64ee6c0fe9b5b\ 58ed1e791dae601b1b0be13ea8af523761fbabd-primary.sqlite.bz2: [Errno -3] \ Error performing checksum Trying other mirror.
2014-09-29
1 min read
When generating a JavaScript file dynamically it’s not uncommon to have to embed an arbitrary string into the resulting code so it can be operated on. For example: function createCode(inputValue) { return "function getValue() { return '" + inputValue + "'; }" } This simplistic version works great for simple strings: createCode("Hello world!"); // Gives: function getValue() { return 'Hello world!'; } But breaks as soon as inputValue contains a special character, e.g.
2014-09-26
1 min read
In “Software is sometimes done”, Rian van der Merwe makes the argument that we need more software that is “done”: I do wonder what would happen if we felt the weight of responsibility a little more when we’re designing software. What if we go into a project as if the design we come up with might not only be done at some point, but might be around for 100 years or more? Would we make it fit into the web environment better, give it a timeless aesthetic, and spend more time considering the consequences of our design decisions?
2014-09-25
3 min read
I’ll be giving some talks over the next few months: 8 Oct: Auckland Software Craftsmanship - 6 Years of test automation. 16 Oct: Auckland JVM Group - Stuff I learned about performance. 5 Nov: QCon San Francisco 2014 - Stuff I learned about performance.
2014-09-23
1 min read