Paypal工程博客
PayPal 尝试用 akka stream 和 akka actor 来处理交易,这给他们带来了一个从更自然的角度来重新审视业务模型的机会,并且,将处理时间缩短了 80% 以及获得了接近 0 的失败率: Learnings from Using a Reactive Platform – Akka/Squbs | PayPal Engineering BlogO网页链接
草原老师推荐,Squbs是产品infoQ有介绍
Learnings from Using a Reactive Platform – Akka/Squbs
By Balaji Srinivasaraghavan December 13, 2017
Introduction
Simple and reliable APIs are important for winning customer trust in
all industries. This is especially true when the API is a customer
facing endpoint that accepts payment instructions. Add to that the
ability to submit 15,000 payments in a single request and reliability
becomes critical.
Let’s talk a bit aboutPayouts– a PayPal product through which users can submit mass payment requests. It’s an asynchronous interaction – a request with multiple payment instructions returns a token. The caller can then use the token to poll for the payment outcomes or listen for webhooks.
As you can imagine, customers like to be sure we got each request
right and want to be kept up to date on the status of the money
movements. The actual request processing involves orchestrating calls to
a slew of services ranging from ones that provide information on
recipients to services that perform the actual money movement.
We recently finished work on improving the features offered by the product and on further improving its reliability. One of the changes we made was to useAkkato build the payment orchestration engine as a data flow pipeline. Discussed below are some learnings from that exercise.
Concurrency – Harder Than It Needs to Be?
Object-oriented design does a great job of capturing the static
aspects of a real world system with entities, their attributes and
relationships. These aspects can be translated directly to Java
artifacts – classes with variables using inheritance and composition.
Real world entities also have a dynamic nature to them – the way in
which they come to life over the lifetime of the application and how
they interact with each other. Java unfortunately does not have a
sufficiently nuanced abstraction for representing such changes in states
of entities over time or their degree of concurrency. Instead, we have a
CPU’s (hardware) model of the world – threads and processors, that has
no equivalent in the real world. This results in an impedance mismatch
that the programmer is left to reconcile.
Some improvements in the building blocks for representing concurrent
aspects came with Java 8. While we didn’t get a better model, we did get
some elegant constructs that reduced the pain of using Java for such
tasks.
Lambdas made multithreaded Java code actually readable. Java 8 introduced the ability to compose futures.
And, we got streams. Streams trace their roots back to a programming
paradigm from the 1960s called data flow programming. The model lets us
compose a series of transformations on a dataset as a pipeline. While we
concentrate on the logical composition of a data processing job, the
framework is tasked with figuring out the physical orchestration of
parallel processing for the job.
Java’s implementation of streams was designed with the expectation
that it would not have to deal with high latency (typically IO). This
makes it unsuitable for building services. But, reactive frameworks such
as Akka provide implementations that account for high latency tasks
such as service invocations within a stream.
Data flow programmingusing streams and theactor modelprovide great conceptual models for reasoning about the dynamic aspects of a real world system. These models represent a paradigm shift once adopted. We can use them to represent concurrency using abstractions that are intuitive. The same model can be represented in Java as well and brought to life by the underlying platform, without developers having to manually code for concurrency using lower level primitives.
Learnings From Using Reactive Streams and the Actor Model
To set the context, the primary design
goal of the system is to reliably accept and process payments. That
means we maintain an accurate record of progress and prioritize features
such as check pointing & auto-recovery over response time and
throughput. That does not mean the solution can be “slow” though. The
difference in emphasis only means we need to get more done in the same
amount of time.
Below are some of the learnings we had over the course of the project as it relates to reactive streams and using the actor model. The ideas come from a variety of sources including other teams who had used Akka before us, the team which maintainsSqubs(Akka with PayPal operationalization) as well as our own experience.
Foundational Patterns for Reuse
Over time, there are a bunch of stream patterns that recur throughout
the application. It helps to create a pattern library as we encounter
them.
Select between two flows based on a predicate
Select one of many flows based on a decider function
Bypass running a flow and short circuit if an error was encountered in a previous stage
Retry an operation based on a decider (see one from Squbshere)
Akka provides some patterns as well –akka-stream-contrib
An example of a flow selector that selects one of many flows based on a decider function would be:
public static Flow selectByIndex(ActorSystem system, Function numericPartitioner, Flow... innerFlows) {
return Flow.fromGraph(GraphDSL.create(builder -> {
// given a data element in the stream, return the index of the flow to be used to process it
UniformFanOutShape top = builder.add(Partition.create(innerFlows.length, msg -> {
int partition = numericPartitioner.apply(msg);
return partition;
}));
// merge the output of the individual flows back to the primary flow
// this is made possible by the use of a homogeneous envelope type across the flow
akka.stream.scaladsl.MergePreferred.MergePreferredShape bottom = builder.add(MergePreferred.create(innerFlows.length - 1));
int index = 0;
for (Flow flow : innerFlows) {
FlowShape tgt = builder.add(flow);
builder.from(top.out(index)).toInlet(tgt.in());
//last flow is the preferred flow
builder.from(tgt.out()).toInlet((index + 1 == innerFlows.length) ? bottom.preferred() : bottom.in(index));
index++;
}
return FlowShape.of(top.in(), bottom.out());
}));
}
While individually most of these helpers are not complex to create,
they save a lot of effort for the team over time. This is especially
true for ones such as a predicate based selector that show up
everywhere. These otherwise would have to be written as graph stages.
With helpers, we can now use flow composition to put them to use and
focus on writing and certifying the business logic alone.
Failure is the Message
It helps a great deal to use a homogenous data type to transfer
information on success and failure down the stream without aborting
processing. To achieve that, we model stream elements as envelopes with
business information and metadata about the status of previous
transformations.
For example, say we could not reach a service as part of a stream
stage. We can embed a Try or an optional exception in the current
message when that happens. A subsequent transformation may choose to not
run and short-circuit on seeing the failure while a stage that updates
the database could choose to participate to record the outcome. A stream
stage’s behavior of not participating in processing if an exception was
recorded should be provisioned via flow composition rather than having
each stage run the same check.
Use Actors when Warranted
Akka streams should serve as the default model for most use cases.
Code modelled as a stream is not only performant but also simple to
create and maintain. But, sometimes actors provide advantages that are
difficult to achieve otherwise.
The primary use case for actors is, of course, to maintain state and
to perform tasks that benefit from supervision. Actors can work with a
single stream or across multiple streams for use cases such as
aggregation, performance or business statistics and even centralized
logging via an event bus. They are also a natural choice when you want
to separate processing of non-critical tasks away from a critical
pipeline. They are also useful for implementing blocking tasks (on a
separate thread pool) especially if we don’t need feedback on the
outcome.
Stream Decomposition Using Actors
There are advantages in decomposing a large stream into smaller
streams that are anchored in chained actors. We used such a model to
reduce the solution complexity while servicing the requirement to
checkpoint and auto-recover. Each of those actors act as a checkpoint
from where processing can resume in case of a failure.
We can model check pointing within the stream as well. But, using
chained actors with smaller and more focused streams made the code
simpler to write and easier to maintain. This is because the branching
introduced by features such as check pointing are orthogonal to business
logic. They compound the complexity already being expressed in the
stream and tend to not be amenable to local optimizations like flow
composition.
If you are designing for high velocity or performance, you are going
to want to use a single stream. But, for most other applications,
leveraging actors and streams together can help simplify the design of
complex streams.
Platform Lock-in and Portability
If we can provide a reasonable level of isolation between the actual
business logic and the code that strings them together and orchestrates
the flow, we can retain a good amount of flexibility in moving to a
different implementation of reactive streams or the actor model.
One area that needs more attention though is Akka HTTP. One of Akka’s
key strengths as a reactive platform is the way in which Akka HTTP is
tied in with Akka Streams. Together they deliver a lot of value for
developers as a simple cohesive package. But, in the context of
portability, it makes sense for us to build a layer of abstraction on
top of the Akka HTTP client. This serves two purposes –
Airgap developers from Akka HTTP to enable changing the HTTP client provider at a later point in time
A layer of indirection enables provisioning a richer set of features that’s comparable to JAX-RS – marshalling/unmarshalling, custom retry. The flip side is we lose advanced features such as client sideHTTP response streaming.
Conclusion
We’re in the midst of a significant transformation in the way we
write Java applications. Functional programming in Java 8 is fairly well
understood by now. Reactive programming, given the massive adoption
Java streams had, has established itself as indispensable. Reactive
streams have been added to the Java 9 SDK.
Broadly, these are 3 class of applications that benefit from adopting a reactive framework.
Backend services – those with multiple downstream dependencies (micro-services) or with a need to orchestrate complex tasks
Engineering teams who want to leverage modern frameworks that
abstract complexity in traditional programing – multithreading,
non-blocking IO
Asynchronous or high velocity event streams – twitter, mobile application GUI
Akka is built for performance and ease of use. That’s a hard balance
to achieve. It’s even harder to achieve on your own. Whether it’s
concurrency or non-blocking IO, the Akka toolkit provides intuitive and
performant models for developers to build upon.
In our case, we saw an 80% reduction in processing time with the new
stack and a near 0% failure rate. Surely, as we turn on more features,
there’s more to learn and improve. But, it’s clear that adopting
reactive streams and the actor model has enabled us to service our
customers better by providing a more reliable and performant API.