EPSRC Centre for Doctoral Training Cloud Computing for Big Data

Thomas

Thomas Cooper was part of the CDT’s first cohort. His PhD research aims to help people find a better way to select the best configuration option in a stream processing system, saving time and energy. This work saw him secure a four-month internship at Twitter HQ in San Francisco.

Securing an internship at Twitter HQ

It was when Twitter’s head of real time computing Karthik Ramasamy stood up at the conference that Thomas’ ears pricked up.

A few of the CDT faculty and cohort were visiting Orange County, California, for the Distributed Event Based Systems Conference in 2016. And the conversation had shifted to an area Thomas was very well aware of.

Karthik was giving a talk about Heron, Twitter’s new stream processing system. He confessed there were a lot of problems they hadn’t ironed out yet. The team hadn’t configured their systems to go as fast as they could, and as a result it remained a very manual process.

Thomas caught up with him later, and mentioned that something he was working on might be able to help them out.

Eighteen months (and a lot of emails, video calls and interviews) later, he was beginning an internship in the Bay Area.

From Physics to Computer Science

It had been a winding road up to that point for Thomas, who originally studied Physics with Space Science at the University of Leicester.

“I was going to be a rocket scientist, until I realised that theoretical physics was actually quite hard. Then I got an MSc in acoustic design and moved into construction, designing buildings and specifying sound tiles for seven years.

“After several years I was bored of doing the same thing again and again. I really wanted to change direction, so I did the MSc Computer Science course at Newcastle and was like a pig in muck. I met Prof Paul Watson and was working with him on the SiDE project, building a robot (equipped with lasers!) to help in a project guiding care home residents. I was enjoying having the freedom to explore something different to my old job.”

Joining the CDT

It was Paul Watson who convinced him to join the first cohort of the CDT.

“I was worried about taking the academic route, but it ended up giving me skills in areas that I’d have had to spend decades in the software industry to learn. If I’d just gone straight into a software engineering job, I’d have strong opinions on tabs vs spaces, but I wouldn’t have come across people with strong opinions about dynamic linear models. I have absorbed so much being around these mathematicians.”

The value of the CDT

This experience helped him at Twitter, where he found himself able to understand the needs (and language) of both the Real Time Computing and Data Science/Machine Learning teams.

“I realised I could understand what both teams were saying in meetings, and suddenly I was the one explaining things to each team. I realised how useful all this [the CDT] had been, and that my experience could be very useful.”

Discovering software engineering

Thomas spent four months developing a prediction system called Caladrius, which interfaced with Twitter’s in-house stream processing system.

“It was like being in an Aladdin’s cave of people doing things I was interested in. I know a lot of companies like to talk about cooperation, but [at Twitter] people would walk into each other’s work areas and just have a chat about really interesting problems. I wanted to listen to all those conversations.

“It helped me realise that software engineering is the thing I want to do with my life, and also that there could really be benefits to more crossover between industry and academia. There are clever people here [in the CDT] that would look at industry problems and say: ‘That’s easy’. And vice versa.”

Providing missing skills

Thomas wants to continue working in the field of stream processing, and says that the work of the CDT in training tomorrow’s experts is something that’s definitely filling a gap.

“It’s definitely providing skills that are often missing. A set of skills that are going to become increasingly useful. The CDT is producing people that are in that sweet spot between data engineers and mathematicians.”

You can read more about Thomas’ work on Caladrius on his blog.