
What to do when a Scan by AppSync's DynamoDB resolver doesn't return the data you expect

2020.8

Introduction

Partway through developing with AppSync, I ran into a situation where
data I had added could suddenly no longer be retrieved with a List Query...

When I investigated,
I found a hint in the resolver that is generated automatically along with the GraphQL schema
when a DynamoDB table is registered as an AppSync data source.

In the end, this is a story about cleaning up after my own lack of understanding of DynamoDB, but...
I think it's a problem many people will run into sooner or later when using AppSync.

I thought it might help people struggling with the same problem,
so I decided to write it up as an article.

Data that is registered in DynamoDB may be missing from the results of an AppSync List Query

Right after you start developing with AppSync,
while only a small amount of data is registered, narrowing down the data with the List Query filter
returns the intended data without any problems.



However, one day, even though the data is registered in DynamoDB,
certain List Queries with a filter stop returning the intended data...


I confirmed that one item with the channel_id 42713673-eb21-43ef-bdbc-f460989fb505 is registered in DynamoDB, so why can't even that one item be retrieved with AppSync's List Query...??

Checking the resolver that is generated automatically when the DynamoDB data source is added

If you tick the automatic GraphQL generation check box when registering the DynamoDB data source,
AppSync List Queries are generated automatically and DynamoDB resolvers are attached to them.

To investigate the cause, let's take a look at what the DynamoDB resolver behind AppSync is actually doing.


Looking at the resolver, it appears to use Scan, narrow down the results with a filter, and return the number of items specified by limit.

Checking the DynamoDB Scan specification

Next, let's check the Scan specification in the official DynamoDB documentation.
An explanation of Scan appears right at the top of the page.

The Scan operation in Amazon DynamoDB reads every item in a table or secondary index. By default, the Scan operation returns data attributes for every item in a table or index.

According to this, Scan reads every item in the table by default.

Limiting the number of items in the result set

As I read further through the documentation,
the section "Limiting the number of items in the result set" caught my eye.

Now let's say we want to add a filter expression to Scan. In this case, DynamoDB applies a filter expression to the 6 returned items and discards the items that don't match. The final Scan results include 6 or fewer items, depending on how many items are filtered.

According to this, when a filter is used with Scan, the filter is applied to the items that were already read, and the items that don't match are discarded.


When I checked AppSync's DynamoDB resolver, the Scan limit defaulted to 20.

In other words, if we infer what DynamoDB Scan actually did
for the List Query that failed to return the expected data,
it read 20 items, none of them had the target channel_id,
and so an empty result was returned.

I had expected it to behave like a WHERE clause in SQL on an RDBMS,
searching for items with the target channel_id and returning up to 20 of them,
but that assumption was wrong from the start...
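
The behavior inferred above can be reproduced with a small toy model. This is only a sketch in plain Python, not DynamoDB itself: the table contents, the `scan_page` helper, and the integer start position used in place of a real pagination key are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Item = Dict[str, str]


def scan_page(
    table: List[Item],
    limit: int,
    item_filter: Callable[[Item], bool],
    start: int = 0,
) -> Tuple[List[Item], Optional[int]]:
    """Toy model of one Scan call: read up to `limit` items, THEN filter them."""
    evaluated = table[start:start + limit]  # the limit caps how many items are read
    matched = [item for item in evaluated if item_filter(item)]  # the filter runs afterwards
    next_start = start + limit if start + limit < len(table) else None
    return matched, next_start


def is_target(item: Item) -> bool:
    return item["channel_id"] == "target"


# Hypothetical table: 30 items, and only the 26th one belongs to the channel we want.
table = [{"id": str(i), "channel_id": "other"} for i in range(30)]
table[25]["channel_id"] = "target"

# First call with limit=20: the matching item lies outside the 20 items that were read.
first_page, next_start = scan_page(table, limit=20, item_filter=is_target)
print(first_page, next_start)

# Continuing from where the first call stopped finally reaches the matching item.
second_page, _ = scan_page(table, limit=20, item_filter=is_target, start=next_start)
print(second_page)
```

The first call returns an empty list together with a position to continue from, which matches the empty List Query result described above.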

Read consistency for Scan

Reading further, I also found a section on read consistency for Scan.

The Scan operation performs eventually consistent reads by default. This means that Scan results may not reflect changes made by recently completed PutItem or UpdateItem operations. For more information, see Read Consistency.

If strongly consistent reads are required as of the time the Scan begins, set the ConsistentRead parameter to true in the Scan request. This ensures that all write operations completed before the Scan began are included in the Scan response.

According to this, the kind of consistency an RDBMS guarantees is not provided by default,
so the data returned may vary depending on when the Scan is executed.
This needs to be taken into account when retrieving data with Scan.
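
As a rough illustration of the ConsistentRead parameter mentioned in the documentation, the sketch below builds the parameters for a Scan request as a plain dictionary. The `build_scan_params` helper and the table name `Messages` are hypothetical, and nothing here calls AWS.

```python
from typing import Any, Dict


def build_scan_params(table_name: str, limit: int, strongly_consistent: bool) -> Dict[str, Any]:
    """Build keyword arguments for a DynamoDB Scan request (hypothetical helper)."""
    return {
        "TableName": table_name,
        "Limit": limit,
        # When ConsistentRead is omitted or False, Scan is eventually consistent.
        "ConsistentRead": strongly_consistent,
    }


params = build_scan_params("Messages", limit=20, strongly_consistent=True)
print(params)

# With boto3 installed and credentials configured, the parameters could be passed as:
#   boto3.client("dynamodb").scan(**params)
```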

Countermeasures

First, since AppSync's limit defaults to 20, specify a larger limit in the GraphQL query, such as 100.
Choose a value large enough that a single query is likely to read enough items for the filter to find the data you want. [1]


Also, if you plan to implement the "load more when scrolled to the bottom of the screen" behavior that is common in web applications, you should implement pagination using nextToken.


If you pass the nextToken returned by one List Query to the next List Query,
you can implement loading that is triggered by scrolling to the bottom of the screen.
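
The nextToken flow can be sketched as a simple loop. This is a minimal model under stated assumptions: `list_messages` is a hypothetical stand-in for the List Query, the table contents are made up, and the token here is just a stringified position, whereas a real AppSync nextToken is an opaque string.

```python
from typing import Dict, List, Optional, Tuple

Item = Dict[str, str]

# Hypothetical data standing in for the DynamoDB table behind AppSync.
TABLE: List[Item] = [{"id": str(i), "channel_id": "other"} for i in range(50)]
TABLE[25]["channel_id"] = "target"
TABLE[47]["channel_id"] = "target"


def list_messages(
    channel_id: str, limit: int, next_token: Optional[str]
) -> Tuple[List[Item], Optional[str]]:
    """Stand-in for the List Query: read `limit` items, filter them, return a nextToken."""
    start = int(next_token) if next_token else 0
    end = start + limit
    items = [item for item in TABLE[start:end] if item["channel_id"] == channel_id]
    return items, (str(end) if end < len(TABLE) else None)


def fetch_all(channel_id: str, limit: int = 20) -> List[Item]:
    """Keep calling the query with the returned nextToken until no token is returned."""
    results: List[Item] = []
    token: Optional[str] = None
    while True:
        items, token = list_messages(channel_id, limit, token)
        results.extend(items)  # a page can be empty even though more pages remain
        if token is None:
            return results


found = fetch_all("target")
print([item["id"] for item in found])
```

For scroll-triggered loading, the same idea applies one page at a time: keep the latest nextToken in the client state and issue the next query only when the user reaches the bottom of the screen, rather than looping to the end.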

Conclusion

In the end, this article turned out to be more about the DynamoDB specification than about AppSync.

The problem occurred because I had been using AppSync's DynamoDB resolver without really understanding it.
It only surfaced after a fair amount of data had actually been added in the staging environment,
so I never noticed it while debugging on my own...

I also think it is better to work out the DynamoDB design in advance, with maintenance and operation in mind.


The official documentation has a page explaining DynamoDB design best practices, so it is worth reading.

That's all. Please be careful when you use automatically generated DynamoDB resolvers with AppSync.

Reference links

  1. If a large amount of data has been registered, it may be better to remove the limit from the Scan so that up to 1 MB of data is read in a single Scan.
