
HTTP

The HTTP data source calls an HTTP endpoint. When you use it, AppSync sends an HTTP request to the specified endpoint with the configured elements: path, headers, query parameters, and request body. The HTTP response is then available to the response mapping template, so you can implement any transformation there.

The HTTP data source defines only the endpoint; it's up to the resolvers to fill in the other parts. Because of this, you need only one data source per endpoint, and multiple resolvers can share it. For example, if you have an API Gateway REST API, you'll have one data source with its domain, and the different resolvers can call its methods and paths.

The data source and the resolver are combined to define the request

HTTP data source or Lambda function?

A Lambda function can do everything, including sending HTTP requests to arbitrary endpoints. So why bother with the HTTP data source when you can just write a function with the same functionality?

I found that the HTTP data source is good for simple things like fetching data from a single service. But when you need more complex behavior, such as retries, custom timeouts, more involved error handling, rollbacks, or even just calls to multiple endpoints, a Lambda function is necessary. There is no clear line separating the two approaches, though.

This is similar to the DynamoDB data source: to send a single transaction, it's better to use the dedicated data source. But when you need something more complex, a dedicated Lambda function is easier.

Data source settings

On the Console, the setup is simple: apart from the name of the data source, all you need is the endpoint.

Only the endpoint can be set on the Management Console

There are some other "hidden" settings that we'll cover later in the Calling AWS services chapter.

Sending requests

Let's write a simple resolver that returns the latest XKCD comic!

XKCD has a sort of API. At https://xkcd.com/info.0.json there is a JSON file with details about the latest comic. We'll configure AppSync to send a GET request there and return the response JSON.

The schema we'll use:

type Query {
  latestXkcd: AWSJSON!
}

Notice that there are no arguments for this field. This is because no part of the request is dynamic: all AppSync needs to do is to send a specific request to the endpoint.

The data source defines the endpoint domain:

An HTTP data source for the XKCD API

Then the resolver adds the path and the Content-Type header:

{
  "version": "2018-05-29",
  "method": "GET",
  "params": {
    "query": {},
    "headers": {
      "Content-Type": "application/json"
    }
  },
  "resourcePath": "/info.0.json"
}

Finally, the response mapping template returns the body of the HTTP response:

#if ($ctx.error)
  $util.error($ctx.error.message, $ctx.error.type)
#end
#if (
  $ctx.result.statusCode < 200 ||
  $ctx.result.statusCode >= 300
)
  $util.error(
    $ctx.result.body,
    "StatusCode$ctx.result.statusCode"
  )
#end
$ctx.result.body

To test it, send a GraphQL query to the AppSync API:

query MyQuery {
  latestXkcd
}

And the response:

{
  "data": {
    "latestXkcd": "{
      \"month\":\"7\",
      \"num\":2645,
      \"link\":\"\",
      \"year\":\"2022\",
      \"news\":\"\",
      \"safe_title\":\"The Best Camera\",
      \"transcript\":\"\",
      \"alt\":\"The best camera is the one at L2.\",
      \"img\": \"https://imgs.xkcd.com/comics/the_best_camera.png\",
      \"title\":\"The Best Camera\",
      \"day\":\"13\"
    }"
  }
}

Error handling

Notice the two-step error handling in the response mapping template. The $ctx.error is only present if there was a problem with the request itself, such as an invalid format or a missing required parameter. This is rare and usually indicates a programming error or insufficient escaping.

The second part of the error handling checks the response status code. AppSync treats every completed request as a success, but a non-2XX status code usually indicates a failure. This check makes sure an error response is treated as an error.
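The same two-step logic, sketched in Python (the function and its names are illustrative, not an AppSync API):

```python
def handle_http_result(error, status_code, body):
    """Mirrors the response mapping template's two checks: request-level
    errors first, then non-2XX status codes."""
    if error is not None:
        # Equivalent of the #if ($ctx.error) branch: the request itself failed
        raise RuntimeError(error)
    if not (200 <= status_code < 300):
        # AppSync treats any completed request as a success, so non-2XX
        # responses must be turned into errors explicitly
        raise RuntimeError(f"StatusCode{status_code}: {body}")
    return body
```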

Debugging requests

A frustrating part of the HTTP data source is the lack of transparency: AppSync sends an HTTP request to the endpoint, but there is no way to see what is sent on the wire. The only help is to turn on resolver logging, inspect the transformed template, and try to deduce the request from there. But occasionally there are subtle bugs caused by things you can't see there.

The transformed template contains the request details

To see the request itself, we can take advantage of a test endpoint that accepts HTTP requests and logs all sorts of information about them. One such site is Webhook.site, which creates a unique URL for you with a randomly generated path:

Unique debug URL

Each request sent here will be logged, providing all the details.

To send requests to this debug endpoint, we need to change two things: the HTTP data source's endpoint and the request path.

First, change the endpoint to https://webhook.site:

The data source pointing to the debug endpoint

Then modify the resolver to send the request to the path with the unique token:

{
  "version": "2018-05-29",
  "method": "GET",
  "params": {
    "query": {},
    "headers": {
      "Content-Type": "application/json"
    }
  },
  "resourcePath": "/15beef57-b418-45c1-9a79-0f5d960ee76e"
}

Sending a GraphQL request to AppSync then triggers this resolver, making AppSync send an HTTP request. This request is then logged by the debug endpoint:

Request parameters as received by the debug endpoint

You can see that AppSync adds very little extra information: just a few headers for connection handling and a user agent.

Calling a private API

While XKCD has a public API, meaning anybody can send it a request and get back a response, most other APIs are private. Usually, you need to subscribe or at least create an account, and the provider gives you an API key or a token that you need to include in all requests sent to that API.

This is the usual model for APIs that applications call directly. For example, SendGrid uses the Authorization: Bearer <API key> approach, while OpenWeatherMap needs an appid=<API key> query parameter. These are just two examples, but many more APIs work in a similar way.

OAuth

APIs that allow sending requests on behalf of users, such as OAuth providers, use a two-step approach. The first step is similar to the API key: the app uses its client secret when requesting permissions from a user. In the second step, it calls the target API with the access token of that user. Here, the client secret acts as a sort of API key.

Since the HTTP data source supports setting custom headers, paths, and query parameters, it's possible to send the token in the request. But there are a couple of problems with this.

The easiest but insecure solution is to put the token in the resolver. For example, this resolver sends an Authorization header with the token value:

{
  "version": "2018-05-29",
  "method": "POST",
  "params": {
    "headers": {
      "Content-Type": "application/json",
      "Authorization": "<token>"
    }
  },
  "resourcePath": "/"
}

The problem with this approach is the hardcoding. Anyone with read-only access to the AppSync API can read this value and then send requests to the target API, impersonating the application. Worse still, this usually means the token is also embedded in the code repository, allowing everyone with access to the codebase to steal it.

Fortunately, we can do better.

Store the sensitive value in a dedicated service like the SSM Parameter Store or the Secrets Manager. Then, when the resolver needs the token, it sends a signed request to the secret store to get the current value. This way, the token is not hardcoded anywhere in the code, and access to it is managed by IAM. By combining pipeline resolvers with an AWS-signed data source, it's possible to implement this using only AppSync.

Reading a token from a secret store then call the protected API

So, the first step is to read the token from, for example, SSM Parameter Store:

{
  "version": "2018-05-29",
  "method": "POST",
  "params": {
    "headers": {
      "Content-Type": "application/x-amz-json-1.1",
      "X-Amz-Target": "AmazonSSM.GetParameter"
    },
    "body": {
      "Name": "<ssm value name>",
      "WithDecryption": true
    }
  },
  "resourcePath": "/"
}

We'll cover how reading values from SSM and the Secrets Manager works in detail in the Reading secret values chapter.

Extract the value from the response body:

$util.toJson($util.parseJson($ctx.result.body).Parameter.Value)
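To show what this line does, here is the equivalent extraction in Python; the response body follows the shape of the SSM GetParameter API, with illustrative values:

```python
import json

# A GetParameter response body (shape per the SSM API; values are made up)
body = json.dumps({
    "Parameter": {
        "Name": "api-token",
        "Type": "SecureString",
        "Value": "secret-token-value",
        "Version": 1,
    }
})

# Equivalent of $util.parseJson($ctx.result.body).Parameter.Value
token = json.loads(body)["Parameter"]["Value"]
print(token)  # secret-token-value
```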

Then in the next step, use the value:

{
  "version": "2018-05-29",
  "method": "POST",
  "params": {
    "headers": {
      "Content-Type": "application/json",
      "Authorization": "$ctx.prev.result"
    }
  },
  "resourcePath": "/"
}

Here, the $ctx.prev.result contains the value from the previous step, which is the result of the SSM GetParameter operation.

This follows best practices for storing and using secret values in AWS. Access is granted via temporary credentials and IAM policies, and the value is never stored outside the dedicated secret storage.

Sensitive data in logs

All seems secure enough, but there is an AppSync-specific problem here. If you turn on resolver logging, AppSync sends the full response of the HTTP data source to CloudWatch Logs. Unfortunately, this includes the token value in plain text.

The secret value shows up in the logs

Is it a good practice to use this approach then?

It's definitely better than hardcoding, and it's a best practice to treat logs as if they could contain sensitive data anyway.

Calling AWS services

The HTTP data source has a "hidden" setting. On the Management Console, the only thing you can set up is the endpoint. But looking at the API documentation, there is another part:

Authorization config setting for the HTTP data source

While this setting is not available on the Console, all the tools that interact with the AWS API support it. This includes the AWS CLI, the AWS SDKs, the CDK, Terraform, and CloudFormation.

The authorization config makes AppSync sign the requests using the AWS signing algorithm (Signature Version 4). This is how permissions work in AWS, and it's a pattern common to all AWS APIs.

The signing process requires IAM credentials: an Access Key ID, a Secret Access Key, and, in some cases, a Session Token. The signature is calculated using these values and added to the request as headers. The target service (the AWS API) then verifies the signature and checks with IAM whether the caller has permission to call that operation.
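To give a feel for the process, here is a sketch of the key-derivation step of Signature Version 4 in Python. This is only one part of the algorithm; the full process also canonicalizes the request and computes the final HMAC over a string to sign:

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    """Chain of HMACs that turns the Secret Access Key into a per-day,
    per-region, per-service signing key, as SigV4 defines."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")
```

Because the region and the service name are baked into the key, a signature is only valid for that one service in that one region, which is why the data source needs both values.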

The IAM signature process

In practice, signed requests make it possible to manage permissions with IAM. In the case of AppSync, you can assign an IAM role to the data source, add permissions to that role, then use any of the hundreds of AWS APIs. In the background, IAM issues temporary credentials for the role, the HTTP data source uses them to sign the outgoing requests, and finally, the target service checks the IAM permissions assigned to the role.

HTTP request signing in AppSync

To configure the signing process, you'll need to know two things: the region of the target resource and the signing service name. For example, if you want to send requests to an SNS topic in the us-west-2 region, you can configure the data source like this (this example uses Terraform):

resource "aws_appsync_datasource" "sns" {
  api_id           = aws_appsync_graphql_api.appsync.id
  name             = "webhook_signed"
  service_role_arn = aws_iam_role.appsync.arn
  type             = "HTTP"
  http_config {
    endpoint = "https://sns.us-west-2.amazonaws.com"
    authorization_config {
      authorization_type = "AWS_IAM"
      aws_iam_config {
        signing_region = "us-west-2"
        signing_service_name = "sns"
      }
    }
  }
}

How to know the service name?

Sometimes the service name is logical: s3 for S3, sns for SNS, appsync for AppSync.

Sometimes it matches the IAM permission prefix. For example, calling an API Gateway is signed as execute-api. Here, it matches the service prefix for the Invoke permission.

But in other cases, the service name has no connection to anything, and the only way to figure it out is to read the API documentation. The service name for the IoT Jobs data is IotLaserThingJobManagerService.

Sending a request through this data source adds a set of signature headers to it.

The HTTP data source can add an AWS signature to the requests

Notice the service_role_arn attribute of the data source. This is the IAM role AppSync will use when it sends a request to the endpoint. This means you need to add permissions to that role.

Since most AWS APIs support signed calls, this feature opens a lot of possibilities. You can publish to an SNS topic directly from AppSync, invalidate a CloudFront cache, create a Cognito user, and do pretty much everything else you can do with the CLI or the SDKs.

Let's see a few examples!

Sending notifications

A common need is to send notifications when something happens in the API. As this is a central topic in any cloud deployment, AWS has several services for it; each targets a different use-case and offers different configuration options, limits, and guarantees.

In this chapter, we'll cover interacting with three such services: SNS, SQS, and EventBridge.

Publishing to SNS

(Example code)

SNS is a pub-sub notification service: you publish messages to a topic, and AWS delivers them to the subscribers. To make AppSync publish a message, it first needs publish access to the topic:

{
  "Effect": "Allow",
  "Action": "sns:Publish",
  "Resource": "<topic arn>"
}

Then the data source needs to be configured with the SNS service:

data "aws_arn" "sns" {
  arn = aws_sns_topic.topic.arn
}

resource "aws_appsync_datasource" "sns" {
  api_id           = aws_appsync_graphql_api.appsync.id
  name             = "sns"
  service_role_arn = aws_iam_role.appsync.arn
  type             = "HTTP"
  http_config {
    endpoint = "https://sns.${data.aws_arn.sns.region}.amazonaws.com"
    authorization_config {
      authorization_type = "AWS_IAM"
      aws_iam_config {
        signing_region = data.aws_arn.sns.region
        signing_service_name = "sns"
      }
    }
  }
}

Here, the region is extracted from the ARN of the target topic and is used both for the endpoint and the signing region. Then the signing_service_name is sns.

The documentation details the structure of the request.

{
  "version": "2018-05-29",
  "method": "POST",
  "params": {
    "query": {
      "Action": "Publish",
      "Version": "2010-03-31",
      "TopicArn": "$utils.urlEncode("${aws_sns_topic.topic.arn}")",
      "Message": "$utils.urlEncode($ctx.args.message)"
    },
    "body": " ",
    "headers": {
      "Content-Type": "application/xml"
    }
  },
  "resourcePath": "/"
}

The above resolver sends the message defined in the message argument to a topic coming from a Terraform resource. The "body": " " part is interesting: I got SignatureDoesNotMatch errors when the body was missing or empty, even though the documentation does not mention it.
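AppSync assembles the query map above into a URL-encoded query string. Roughly this, sketched in Python with illustrative values (in the real template, the TopicArn comes from Terraform and the Message from $ctx.args.message):

```python
from urllib.parse import quote, urlencode

# Illustrative values standing in for the Terraform interpolation
# and the GraphQL argument
params = {
    "Action": "Publish",
    "Version": "2010-03-31",
    "TopicArn": "arn:aws:sns:us-west-2:123456789012:my-topic",
    "Message": "hello world",
}

# quote_via=quote encodes spaces as %20 (not +), matching what
# $util.urlEncode produces for AWS query strings
query_string = urlencode(params, quote_via=quote)
print(query_string)
```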

Then the response mapping template needs to check for errors, after which it can extract values from the response:

#if ($ctx.error)
  $util.error($ctx.error.message, $ctx.error.type)
#end
#if (
  $ctx.result.statusCode < 200 ||
  $ctx.result.statusCode >= 300
)
  $util.error(
    $ctx.result.body,
    "StatusCode$ctx.result.statusCode"
  )
#end
$util.toJson(
  $util.xml.toMap($ctx.result.body)
    .PublishResponse.PublishResult.MessageId
)
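The $util.xml.toMap call converts the XML response into a map so the MessageId can be read with a property path. The equivalent extraction in Python, using a response body shaped like the SNS Publish response (the MessageId is made up):

```python
import xml.etree.ElementTree as ET

# A PublishResponse body; the shape follows the SNS Publish API,
# the MessageId value is illustrative
body = """<PublishResponse xmlns="http://sns.amazonaws.com/doc/2010-03-31/">
  <PublishResult>
    <MessageId>5e8f6c3a-0001-4c11-b2a0-example</MessageId>
  </PublishResult>
</PublishResponse>"""

# Equivalent of $util.xml.toMap(...).PublishResponse.PublishResult.MessageId
ns = {"sns": "http://sns.amazonaws.com/doc/2010-03-31/"}
message_id = ET.fromstring(body).find(
    "sns:PublishResult/sns:MessageId", ns
).text
print(message_id)
```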

Publishing to SQS

(Example code)

SQS is a queue service similar to SNS, but it also supports FIFO ordering, where messages are guaranteed to be handled in the order they are submitted. The publishing process, though, is the same as for SNS.

First, add the permissions:

{
  "Effect": "Allow",
  "Action": "sqs:SendMessage",
  "Resource": "<queue arn>"
}

Then configure the data source:

data "aws_arn" "sqs" {
  arn = aws_sqs_queue.queue.arn
}

resource "aws_appsync_datasource" "sqs" {
  api_id           = aws_appsync_graphql_api.appsync.id
  name             = "sqs"
  service_role_arn = aws_iam_role.appsync.arn
  type             = "HTTP"
  http_config {
    endpoint = "https://sqs.${data.aws_arn.sqs.region}.amazonaws.com"
    authorization_config {
      authorization_type = "AWS_IAM"
      aws_iam_config {
        signing_region = data.aws_arn.sqs.region
        signing_service_name = "sqs"
      }
    }
  }
}

The region is again extracted from the target ARN, and the signing_service_name is sqs.

Then construct the request according to the API reference and the developer guide:

{
  "version": "2018-05-29",
  "method": "POST",
  "params": {
    "body": "Action=SendMessage&MessageBody=$util.urlEncode($ctx.args.message)&Version=2012-11-05",
    "headers": {
      "Content-Type": "application/x-www-form-urlencoded"
    }
  },
  "resourcePath": "/${data.aws_arn.sqs.account}/${aws_sqs_queue.queue.name}/"
}
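Here the body is a form-encoded string, so the message needs URL encoding just as the template does with $util.urlEncode. A quick Python sketch with an illustrative message:

```python
from urllib.parse import quote

# Illustrative message; in the template it comes from $ctx.args.message
message = "hello & goodbye"

# Equivalent of the template body built with $util.urlEncode:
# special characters in the message must not break the key=value pairs
sqs_body = (
    "Action=SendMessage"
    f"&MessageBody={quote(message, safe='')}"
    "&Version=2012-11-05"
)
print(sqs_body)
```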