Networked software is often built using a service-oriented architecture, but networked machine learning applications may be easier to manage using a different programming style.
What's new: Andrei Paleyes, Christian Cabrera, and Neil D. Lawrence at the University of Cambridge compared the work required to build a business-oriented machine learning program using service-oriented architecture (SOA) and flow-based programming (FBP).
Key insight: SOA divides a program into services (bundles of functions and memory for, say, navigation, payment processing, and collecting customer ratings in a ride-sharing app) connected to a central hub that passes messages among them. In this arrangement, machine learning applications that draw on large databases generate a high volume of messages, which can require a lot of computation and time spent debugging. FBP, by contrast, conceives a program as a network of functions, or nodes, that exchange data directly with one another. This approach cuts the amount of communication required and makes data paths easier to track, which simplifies building efficient machine learning programs.
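To make the contrast concrete, here is a toy sketch in Python. It is not the authors' code: the `Hub` class, the service and node names, and the fixed driver and wait values are all hypothetical, chosen only to show where messages travel in each style.

```python
# Toy contrast between hub-mediated services and directly wired flow nodes.
# Every name and value here is hypothetical, invented for illustration.


class Hub:
    """Central hub: every call to a service is routed (and counted) here."""

    def __init__(self):
        self.services = {}
        self.messages = 0

    def register(self, name, handler):
        self.services[name] = handler

    def send(self, name, payload):
        self.messages += 1  # one message per routed call
        return self.services[name](payload)


def run_soa(request):
    """SOA style: services never talk directly; the hub relays each step."""
    hub = Hub()
    hub.register("driver", lambda req: {**req, "driver": "d1"})
    hub.register("ride", lambda req: {**req, "wait_min": 5})
    assigned = hub.send("driver", request)
    ride = hub.send("ride", assigned)
    return ride, hub.messages


def allocate_ride(request):
    """Flow node: attach a driver to a ride request."""
    return {**request, "driver": "d1"}


def compute_wait(ride):
    """Flow node: attach a wait time to an allocated ride."""
    return {**ride, "wait_min": 5}


def run_fbp(request):
    """FBP style: each node's output is passed straight to the next node."""
    data = request
    for node in (allocate_ride, compute_wait):
        data = node(data)
    return data
```

Both functions return the same ride record for the same request. The difference is structural: in `run_soa` the data path is implicit in the order of hub calls, while in `run_fbp` it is the explicit sequence of nodes.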
How it works: Over three phases of development, the authors used SOA and FBP to implement taxi-booking applications that took advantage of machine learning. Then they measured the impact of each programming approach on code size, ease of revision, and code complexity.
- In Phase 1, the authors built separate modules that assigned drivers to incoming ride requests, kept track of rides, updated information such as passenger pickup and drop-off times, and measured passenger wait times. SOA called for rider and driver services, while FBP required nodes to handle the interactions among data streams, such as allocating a ride or calculating the wait time.
- In Phase 2, they added the ability to collect simulated ride requests, driver locations, and rider wait times. Using SOA, they built a new service and modified each previous service to collect the data. Using FBP, they added a node to capture these inputs and outputs.
- In Phase 3, they added a machine learning model trained to estimate passenger wait times using the data collected in Phase 2. The changes required in both approaches were similar. Using FBP, they added a node; using SOA, they added a service.
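The FBP side of the three phases can be sketched roughly as follows. This is an illustrative assumption rather than the authors' implementation: `FlowGraph`, the stream names, the request fields, and the averaging estimator (a stand-in for a trained model) are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


class FlowGraph:
    """Minimal flow graph: nodes subscribe to named data streams."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def connect(self, stream, node):
        self.subscribers[stream].append(node)

    def emit(self, stream, item):
        for node in list(self.subscribers[stream]):
            node(item)


graph = FlowGraph()


# Phase 1: nodes that allocate rides and compute wait times.
def allocate_ride(request):
    ride = {
        "rider": request["rider"],
        "driver": "d1",  # hypothetical fixed assignment
        "requested_at": request["t"],
        "picked_up_at": request["t"] + request["eta"],
    }
    graph.emit("rides", ride)


def compute_wait(ride):
    graph.emit("wait_times", ride["picked_up_at"] - ride["requested_at"])


graph.connect("requests", allocate_ride)
graph.connect("rides", compute_wait)

# Phase 2: a collector taps existing streams; Phase 1 nodes are unchanged.
collected = defaultdict(list)


def make_collector(stream):
    def collect(item):
        collected[stream].append(item)

    return collect


for stream in ("requests", "wait_times"):
    graph.connect(stream, make_collector(stream))


# Phase 3: stand-in for a trained model, using only the collected data.
def estimate_wait():
    return mean(collected["wait_times"])


# Simulated ride requests (times in arbitrary units).
graph.emit("requests", {"rider": "r1", "t": 0, "eta": 4})
graph.emit("requests", {"rider": "r2", "t": 10, "eta": 6})
```

In this sketch, data collection is added by subscribing a new node to streams that already exist. In the SOA version described above, each existing service had to be modified to report its data to the new collection service.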
Results: Both approaches showed distinct benefits. FBP produced a better cognitive complexity score (a measure of how difficult code is to understand, where higher is more difficult) in all phases of development. For instance, in Phase 3, FBP scored 1.4 while SOA scored 2.0. On the other hand, the SOA code was easier to revise and less complex in all phases of development. (The authors point out that SOA may have fared better on these measures because it’s more widely used and many libraries exist to reduce code size and complexity. With similar libraries, FBP might catch up.)
Why it matters: FBP provided a better developer experience during data collection, according to the authors’ subjective evaluation. This would allow developers to spend more time optimizing data capture and quality. In addition, reducing the expertise required for data collection could enable machine learning engineers to play a bigger role in that process and improve a model’s performance from the data up.
We’re thinking: Given the ambiguous results, going with the flow might mean sticking with the more familiar SOA approach.