The HTTP data source calls an HTTP endpoint. When a resolver uses it, AppSync sends an HTTP request to the configured endpoint with the elements the resolver specifies: method, path, headers, query parameters, and request body. The HTTP response is then available to the response part of the resolver, where you can implement any kind of transformation.
The HTTP data source defines only the endpoint, and it's up to the resolvers to fill in the other parts. Because of this, you need only one data source per endpoint, and multiple resolvers can share it. For example, if you have an API Gateway REST API, you'll have one data source configured with its domain, and the different resolvers can then call its methods and paths.
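To make that split concrete, here is a minimal sketch of a resolver that fills in everything except the endpoint. It follows the request shape that AppSync HTTP resolvers use (method, resourcePath, and params with headers, query, and body), but the context types are local stand-ins and the /users path, the id argument, and the fields query parameter are hypothetical, chosen only for illustration.

```typescript
// Local stand-ins for the AppSync context types so the sketch is self-contained.
type HttpRequest = {
  method: "GET" | "POST" | "PUT" | "PATCH" | "DELETE";
  resourcePath: string;
  params?: {
    headers?: Record<string, string>;
    query?: Record<string, string>;
    body?: string;
  };
};
type RequestContext = { arguments: { id: string } };
type ResponseContext = {
  result: { statusCode: number; headers: Record<string, string>; body: string };
};

// Request side: the data source supplies the endpoint, the resolver supplies the rest.
function request(ctx: RequestContext): HttpRequest {
  return {
    method: "GET",
    // Hypothetical path on the endpoint configured in the data source.
    resourcePath: `/users/${encodeURIComponent(ctx.arguments.id)}`,
    params: {
      headers: { Accept: "application/json" },
      query: { fields: "id,name" }, // hypothetical query parameter
    },
  };
}

// Response side: inspect the status code and transform the body.
function response(ctx: ResponseContext): unknown {
  const { statusCode, body } = ctx.result;
  if (statusCode < 200 || statusCode >= 300) {
    // Assumption: a deployed resolver would report this with util.error
    // from @aws-appsync/utils; a plain Error keeps the sketch standalone.
    throw new Error(`Unexpected status code: ${statusCode}`);
  }
  return JSON.parse(body);
}
```

In a deployed JavaScript resolver both functions would need to be exported. A second resolver attached to the same data source would differ only in what its request function returns.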
A Lambda function can do everything including sending HTTP requests to arbitrary endpoints. So why bother with the HTTP data source when you can just write a function with the same functionality?
I found that the HTTP data source is good for simple things, like fetching data from a single service. When you need something more complex, such as retries, a custom timeout, error handling, rollback, or even just sending multiple calls to different endpoints, a Lambda function is necessary. There is no clear line separating the two approaches, though.
This is similar to the DynamoDB data source: to send a single transaction it's better to use the dedicated data source. But when you need more complex things then a dedicated Lambda function is easier.
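As a rough illustration of the more complex cases above, the following sketch shows the kind of logic that would live in a Lambda function: a per-call timeout, retries, and calls to two different endpoints. All names here (HttpGet, withTimeout, getWithRetry, handler) and both example.com URLs are hypothetical. The HTTP client is passed in as a parameter, so the retry logic does not depend on any particular library.

```typescript
// Hypothetical minimal HTTP client: takes a URL, resolves with the response body.
type HttpGet = (url: string) => Promise<string>;

// Reject if the promise does not settle within `ms` milliseconds.
// Note: this stops waiting but does not cancel the underlying request.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    p.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Try the call once, then up to `retries` more times; rethrow the last error.
async function getWithRetry(
  get: HttpGet,
  url: string,
  retries: number,
  timeoutMs: number,
): Promise<string> {
  let lastError: unknown = new Error("No attempt was made");
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await withTimeout(get(url), timeoutMs);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// Call two different endpoints in parallel and combine the results.
async function handler(get: HttpGet): Promise<{ user: string; orders: string }> {
  const [user, orders] = await Promise.all([
    getWithRetry(get, "https://users.example.com/me", 2, 1000),
    getWithRetry(get, "https://orders.example.com/recent", 2, 1000),
  ]);
  return { user, orders };
}
```

A single HTTP data source resolver can send only one request per invocation, so this kind of fan-out and retry logic is where a function starts to pay off.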
On the Console the configuration is simple: apart from the name of the data source, all you need is the endpoint.
There are some other "hidden" settings that we'll cover later in the Calling AWS services chapter.
You can find the code example for this chapter here.