Luke Bigum
Head of Systems
Old Father Time
Previously we’ve talked about how we use Nagios / Icinga for three broad types of monitoring at LMAX Exchange: alerting, metrics, and validation. The difference between our definitions of alerting and validation is a fine one, and it has more to do with the importance of the state of the thing we are checking and the frequency with which we check it. An example of what I consider an “Alert” is whether or not Apache is running on a web server.
2017-06-12
8 min read
Just before New Year 2017 a leap second was inserted into Coordinated Universal Time (UTC). At LMAX Exchange we had some luxury to play with how we handled the leap second. January 1st is a public holiday and there’s no trading, so we were free to do recovery if something didn’t go according to plan. This blog post is an analysis of the results of various time synchronisation clients (NTP and PTP) using different methods to handle the leap second.
2017-01-30
9 min read
A month or two ago I was asked by someone in our Operations team what clock synchronisation is and why we need to do it. I gave them a very basic answer of a few sentences. That got me thinking that I never read an easy explanation when I got started in this area myself, and the terminology can be confusing the first time you come across it. Below is a copy-paste from our internal documentation where I attempt to explain computer clock synchronisation and the reason for it.
2016-10-05
6 min read
In this series we are attempting to solve a clock synchronisation problem to a degree of accuracy in order to satisfy MiFID II regulations, and we’re trying to do it without spending a lot of money. So far we have: talked about the regulations and how we might solve this with Linux software; built a “PTP Bridge” with Puppet; started recording metrics with collectd and InfluxDB, and finished recording metrics; drawn lots of graphs with Grafana and found contention on our firewall; and tried a dedicated firewall for PTP. The start of 2016 opened up a few new avenues for this project.
2016-04-08
20 min read
In this series we are attempting to solve a clock synchronisation problem to a degree of accuracy in order to satisfy MiFID II regulations, and we’re trying to do it without spending a lot of money. So far we have: talked about the regulations and how we might solve this with Linux software; built a “PTP Bridge” with Puppet; started recording metrics with collectd and InfluxDB, and finished recording metrics; and drawn lots of graphs with Grafana and found contention on our firewall. In this post we will look at introducing a second ASIC-based firewall to route PTP through, so that PTP accuracy is not affected by our management firewall getting busy.
2016-02-09
8 min read
I spoilt myself in 2013 for my Birthday and Christmas and bought the beautiful ASUS Zenbook UX301LA. The model I ordered comes with a touch screen WQHD (2560x1440) display, an Intel i7 4558U CPU, 8GiB of RAM and 2 internal SSDs. Needless to say it’s very cool! The laptop comes with OEM Windows 8, which, despite all the bad geek press online, I actually like. Without a touch screen it would be useless, but with one it works quite well.
2014-01-30
5 min read
My company launched their new website recently. When we launched before Christmas we encountered a recurring problem that was more difficult than most to diagnose. The problem itself is very specific to our site so I doubt the exact details will help many people, but maybe the troubleshooting steps involved will prove interesting to someone. I’m not particularly proud of the time it took to track down, nor of our exact thought process (hardly blowing my own horn with this post), but here we go anyway.
2012-01-19
8 min read
Now something more meaty: configuring Solaris 10 to use Fedora Directory Server as an LDAP source of users, groups and authentication. This information is sourced from the FDS Project and Sun documentation. In order for PAM authentication to work, the Solaris 10 server needs to be recently patched. I’m not sure which patch it is specifically, but patch level 138889-07 (from ‘uname -a’) will be enough. Create Fedora Directory Server profile: this step should only need to be done once per FDS cluster; multiple Solaris 10 machines can use the same profile.
2009-05-09
8 min read