Tomer Elmalem - GraphQL APIs: REST in Peace - Codemotion Milan 2017

1. GraphQL at Yelp: REST in Peace. Tomer Elmalem, tomer@yelp.com, @zehtomer

2. Yelp’s Mission: connecting people with great local businesses.

3. In the beginning

4. In the beginning

5. Enter GraphQL

6. { business(id: "yelp") { name alias rating } } { "data": { "business": { "name": "Yelp", "alias": "yelp-sf", "rating": 4.9 } } }

7. { business(id: "yelp") { name alias rating reviews { text } } } { "data": { "business": { "name": "Yelp", "alias": "yelp-sf", "rating": 4.9, "reviews": [{ "text": "some review" }] } } }

8. { business(id: "yelp") { name rating reviews { text } hours { ... } } } { "data": { "business": { "name": "Yelp", "rating": 4.9, "reviews": [{ "text": "some review" }], "hours": [{ ... }] } } }

9. { b1: business(id: "yelp") { name rating reviews { text } } b2: business(id: "sforza") { name rating reviews { text } } } { "data": { "b1": { "name": "Yelp", "rating": 4.9, "reviews": [{ "text": "some review" }] }, "b2": { "name": "Sforza Castle", "rating": 5.0, "reviews": [{ "text": "awesome art" }] } } }

10. { business(id: "yelp") { reviews { users { reviews { business { categories } } } } } } { "data": { "business": { "reviews": [{ "users": [{ "reviews": [ { "business": { "categories": ["food"] } }, { "business": { "categories": ["media"] } } ] }] }] } } }

11. Let’s start with some vocab

12. Query: the representation of data you want returned. query { business(id: "yelp") { name rating reviews { text } hours { ... } } }

13. Schema: the representation of your data structure
    class Business(ObjectType):
        name = graphene.String()
        alias = graphene.String()
        reviews = graphene.List(Review)
        def resolve_name(root, ...):
            return "Yelp"
        def resolve_alias(root, ...):
            return "yelp-sf"
        def resolve_reviews(root, ...):
            return [Review(...) for review in reviews]

14. Fields: the attributes available on your schema
    class Business(ObjectType):
        name = graphene.String()
        alias = graphene.String()
        reviews = graphene.List(Review)
        def resolve_name(root, ...):
            return "Yelp"
        def resolve_alias(root, ...):
            return "yelp-sf"
        def resolve_reviews(root, ...):
            return [Review(...) for review in reviews]

15. { "business": { "name": "Yelp", "alias": "yelp-sf", "rating": 4.9 } } { business(id: "yelp") { name alias rating } }

16. Resolvers: functions that retrieve data for a specific field in a schema
    class Business(ObjectType):
        name = graphene.String()
        alias = graphene.String()
        reviews = graphene.List(Review)
        def resolve_name(root, ...):
            return "Yelp"
        def resolve_alias(root, ...):
            return "yelp-sf"
        def resolve_reviews(root, ...):
            return [Review(...) for review in reviews]

17. So how does this all work?

18. Our Setup • Dedicated public API service • Python 3.6 • Graphene (Python GraphQL library) • Pyramid + uWSGI • Complex microservices architecture • No attached database

20. { business(id: "yelp") { name reviews { text } } } POST /v3/graphql

21. The View
    import graphene
    from pyramid.view import view_config
    @view_config(
        route_name='api.graphql',
        renderer='json',
        decorator=verify_api_access,
    )
    def graphql(request):
        schema = graphene.Schema(query=Query)
        locale = request.headers.get('Accept-Language')
        # A fresh set of dataloaders is built for every request
        context = {
            'request': request,
            'client': request.client,
            'dataloaders': DataLoaders(locale),
        }
        return schema.execute(request.body, context_value=context)

23.
    def verify_api_access(wrapped):
        def wrapper(context, request):
            access_token = _validate_authorization_header(request)
            response = _validate_token(
                access_token, path, request.client_addr
            )
            request.client = response
            if response.valid:
                return wrapped(context, request)
            else:
                raise UnauthorizedAccessToken()
        return wrapper

27.
    class Query(graphene.ObjectType):
        business = graphene.Field(
            Business,
            alias=graphene.String(),
        )
        search = graphene.Field(
            Businesses,
            term=graphene.String(),
            location=graphene.String(),
            # ...
        )
        # ...

29.
    class Query(graphene.ObjectType):
        # ...
        @verify_limited_graphql_access('graphql')
        def resolve_business(root, args, context, info):
            alias = args.get('alias')
            internalapi_client = get_internal_api_client()
            business = internalapi_client.business.get_business(
                business_alias=alias
            ).result()
            return context['dataloaders'].businesses.load(business.id)

37. The Schema
    class Business(graphene.ObjectType):
        name = graphene.String()
        reviews = graphene.List(Review)
        def resolve_name(root, context, ...):
            return root.name
        def resolve_reviews(root, context, ...):
            return [Review(...) for review in root.reviews]

39. Dataloaders…?

40. The N+1 Problem

41. The N+1 Problem: the inefficient loading of data by making individual, sequential queries
    cats = load_cats()
    cat_hats = [load_hats_for_cat(cat) for cat in cats]
    # SELECT * FROM cat WHERE ...
    # SELECT * FROM hat WHERE catID = 1
    # SELECT * FROM hat WHERE catID = 2
    # SELECT * FROM hat WHERE catID = ...

42. query { b1: business(id: "yelp") { name } b2: business(id: "moma") { name } b3: business(id: "sushi") { name } b4: business(id: "poke") { name } b5: business(id: "taco") { name } b6: business(id: "pizza") { name } } GET /internalapi/yelp GET /internalapi/moma GET /internalapi/sushi GET /internalapi/poke GET /internalapi/taco GET /internalapi/pizza

44. Dataloaders! • An abstraction layer to load data in your resolvers • Handle batching ids and deferring execution until all of your data has been aggregated
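
A minimal, standard-library sketch of the batching and deferral idea on this slide. It is not the dataloader library used in the talk: `TinyBatchLoader` and `fake_bulk_fetch` are hypothetical names, and `asyncio` stands in here for whatever deferral mechanism the real library uses.

```python
import asyncio


class TinyBatchLoader:
    """Collects keys requested in the same event-loop tick, then loads them in one call."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn
        self._pending = {}       # key -> Future awaiting that key's value
        self._scheduled = False  # True once a dispatch has been queued

    def load(self, key):
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            self._pending[key] = loop.create_future()
        if not self._scheduled:
            # Defer the real fetch until the current batch of load() calls is done
            self._scheduled = True
            loop.call_soon(self._dispatch)
        return self._pending[key]

    def _dispatch(self):
        pending, self._pending = self._pending, {}
        self._scheduled = False
        keys = list(pending)
        values = self._batch_fn(keys)  # a single bulk request for every key
        for key, value in zip(keys, values):
            pending[key].set_result(value)


calls = []


def fake_bulk_fetch(ids):
    # Hypothetical stand-in for a bulk internal API call
    calls.append(list(ids))
    return [{"id": i, "name": i.title()} for i in ids]


async def main():
    loader = TinyBatchLoader(fake_bulk_fetch)
    return await asyncio.gather(*(loader.load(i) for i in ["yelp", "moma", "sushi"]))


results = asyncio.run(main())
```

Three `load()` calls produce one entry in `calls`, which is the behaviour the batched request on the following slides relies on.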

46. query { b1: business(id: "yelp") { name } b2: business(id: "moma") { name } b3: business(id: "sushi") { name } b4: business(id: "poke") { name } b5: business(id: "taco") { name } b6: business(id: "pizza") { name } } GET /internalapi/yelp,moma,sushi,poke,

48.
    class DataLoaders:
        def __init__(self, locale):
            self.businesses = BusinessDataLoader(locale)
            self.coordinates = CoordinatesDataLoader()
            self.hours = HoursDataLoader()
            self.photos = PhotosDataLoader()
            self.events = EventDataLoader()
            self.reviews = ReviewsDataLoader(locale)
            self.venues = VenueDataLoader()

49. Dataloader
    from promise import Promise
    from promise.dataloader import DataLoader
    class BusinessDataLoader(DataLoader):
        def __init__(self, locale, **kwargs):
            super().__init__(**kwargs)
            self._locale = locale
        def batch_load_fn(self, biz_ids):
            # One bulk call covers every id collected for this batch
            businesses = get_businesses_info(biz_ids, self._locale).result()
            biz_id_map = self._biz_map(businesses)
            # Results must come back in the same order as the requested ids
            return Promise.resolve([
                biz_id_map.get(biz_id) for biz_id in biz_ids
            ])
        def _biz_map(self, businesses):
            return {biz.id: biz for biz in businesses}

53. Considerations • Caching • Performance • Complexity • Rate limiting • Security • Error handling

54. Caching • Edge caching is hard • Greater diversity of requests • Many caching strategies don't fit

55. query { business(id: "yelp") { name } }

56. query { business(id: "yelp") { name rating } }

57. query { search(term: "burrito", latitude: 30.000, longitude: 30.000) { ... } }

58. query { search(term: "burrito", latitude: 30.001, longitude: 30.001) { ... } }

59. query { search(term: "Burrito", latitude: 30.000, longitude: 30.000) { ... } }

60. Service Caching • Network caching proxy • Generic caching service • Can wrap any service, applies to everyone

64. What about bulk data?

65. Caching in Bulk • ID-based caching: set up a key-value cache map • Parse and cache individual models; don't cache the entire response as-is

66.
    cached_endpoints:
        user.v2: { ttl: 3600, pattern: "(^/user/v2(?:\\?|\\?.*&)ids=)((?:\\d|%2C)+)(&.*$|$)", bulk_support: true, id_identifier: 'id' }
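
The ID-based bulk caching described above can be sketched roughly as follows. This is an illustration only, not the caching service from the talk: `fetch_with_id_cache` and `fake_user_service` are hypothetical, the cache is a plain dict, and TTL handling is left out.

```python
def fetch_with_id_cache(ids, cache, fetch_bulk, id_identifier="id"):
    """Return models for `ids` in order, calling `fetch_bulk` only for cache misses."""
    found = {i: cache[i] for i in ids if i in cache}
    missing = [i for i in ids if i not in found]
    if missing:
        for model in fetch_bulk(missing):
            # Cache each model under its own id, not the bulk response as a whole
            key = model[id_identifier]
            cache[key] = model
            found[key] = model
    return [found.get(i) for i in ids]


backend_calls = []


def fake_user_service(ids):
    # Hypothetical stand-in for a bulk endpoint such as /user/v2?ids=...
    backend_calls.append(list(ids))
    return [{"id": i, "name": f"user-{i}"} for i in ids]


cache = {}
first = fetch_with_id_cache([1, 2], cache, fake_user_service)
second = fetch_with_id_cache([2, 3], cache, fake_user_service)
```

Because models are cached individually, the second request only reaches the backend for id 3, even though its id list differs from the first request's.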

68. Request Budgets

69. Request Budgets X-Ctx-Request-Budget 1000 sleep(0.470) X-Ctx-Request-Budget 530

70. Request Budgets X-Ctx-Request-Budget 1000 sleep(1.470) X-Ctx-Request-Budget -530

71. Complexity

72. { business(id: "yelp") { reviews { users { reviews { business { categories } } } } } } { "data": { "business": { "reviews": [{ "users": [{ "reviews": [ { "business": { "categories": ["food"] } }, { "business": { "categories": ["media"] } } ] }] }] } } }
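
One rough way to put a number on how nested a query like the one above is. This is a hypothetical helper, not something shown in the talk: it counts brace depth in the raw query text and skips braces inside string literals. It would also count braces from input-object arguments, so a real check would more likely walk the parsed query.

```python
def max_selection_depth(query):
    """Return the deepest level of `{ ... }` nesting in a GraphQL query string."""
    depth = 0
    deepest = 0
    in_string = False
    escaped = False
    for ch in query:
        if in_string:
            # Ignore everything inside a string literal, honouring backslash escapes
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "}":
            depth -= 1
    return deepest


nested = (
    '{ business(id: "yelp") { reviews { users { reviews '
    "{ business { categories } } } } } }"
)
shallow = '{ business(id: "yelp") { name } }'

nested_depth = max_selection_depth(nested)
shallow_depth = max_selection_depth(shallow)
```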

75. Rate Limiting

76. Normally GET https://api.yelp.com/v3/search

77. Normally GET https://api.yelp.com/v3/search GET https://api.yelp.com/v3/search GET https://api.yelp.com/v3/search GET https://api.yelp.com/v3/search GET https://api.yelp.com/v3/search GET https://api.yelp.com/v3/search GET https://api.yelp.com/v3/search GET https://api.yelp.com/v3/search GET https://api.yelp.com/v3/search GET https://api.yelp.com/v3/search

78. GraphQL POST https://api.yelp.com/v3/graphql query { business(id: "yelp-san-francisco") { name } }

79. GraphQL POST https://api.yelp.com/v3/graphql query { b1: business(id: "yelp-san-francisco") { name } b2: business(id: "garaje-san-francisco") { name } b3: business(id: "moma-san-francisco") { name } }

80. GraphQL POST https://api.yelp.com/v3/graphql query { search(term: "burrito", location: "sf") { business { name reviews { rating text } } } }

81. Node-based • Count individual nodes returned by the request sent to the API POST https://api.yelp.com/v3/graphql query { search(term: "burrito", location: "sf") { business { name reviews { rating text } } } }

85. Field-based • Count each individual field returned by the request sent to the API POST https://api.yelp.com/v3/graphql query { search(term: "burrito", location: "sf") { business { name id } } }

86. Field-based • Count each individual field returned by the request sent to the API { "data": { "search": { "business": [ { "name": "El Farolito", "id": "el-farolito-san-francisco-2" }, { "name": "La Taqueria", "id": "la-taqueria-san-francisco-2" }, { "name": "Taqueria Guadalajara", "id": "taqueria-guadalajara-san-francisco" }, { "name": "Taqueria Cancún", "id": "taqueria-cancún-san-francisco-5" }, { "name": "Little Taqueria", "id": "little-taqueria-san-francisco" }, { "name": "Pancho Villa Taqueria", "id": "pancho-villa-taqueria-san-francisco" }, { "name": "Tacorea", "id": "tacorea-san-francisco" }, { "name": "El Burrito Express - San Francisco", "id": "el-burrito-express-san-francisco-san-francisco" }, { "name": "El Burrito Express", "id": "el-burrito-express-san-francisco" }, ... ] } }
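
To make the difference between the two counting schemes concrete, here is a sketch that walks a response payload. The slides do not show how the counts are actually computed, so the definitions below are assumptions: a "node" is any returned object, including the root, and a "field" is any returned scalar value.

```python
def count_nodes(value):
    """Count every object (dict) in the response data, including the root."""
    if isinstance(value, dict):
        return 1 + sum(count_nodes(v) for v in value.values())
    if isinstance(value, list):
        return sum(count_nodes(v) for v in value)
    return 0


def count_fields(value):
    """Count every scalar value (string, number, boolean, null) in the response data."""
    if isinstance(value, dict):
        return sum(count_fields(v) for v in value.values())
    if isinstance(value, list):
        return sum(count_fields(v) for v in value)
    return 1


data = {
    "search": {
        "business": [
            {"name": "El Farolito", "id": "el-farolito-san-francisco-2"},
            {"name": "La Taqueria", "id": "la-taqueria-san-francisco-2"},
            {"name": "Tacorea", "id": "tacorea-san-francisco"},
        ]
    }
}

node_count = count_nodes(data)
field_count = count_fields(data)
```

Under these definitions, asking for one more field per business raises the field count but leaves the node count unchanged, so the two schemes charge differently for wide versus deep queries.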

88. Securing the API • Bulk endpoints to minimize the number of queries • Network-level caching • Daily rate limiting • Limiting the maximum query size • Per-resolver level authentication • Persisted queries

90.
    class MaxQuerySizeMiddleware:
        MAX_SIZE = 2000
        def __init__(self):
            self.resolvers_executed = 0
        def resolve(self, next, root, info, **args):
            # did we hit the max for this query? nope
            if self.resolvers_executed < self.MAX_SIZE:
                self.resolvers_executed += 1
                return next(root, info, **args)
            # we hit the max for this query
            return None
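
A self-contained sketch of how a resolver cap like the one on this slide behaves. It runs without a GraphQL library: `ResolverCapMiddleware` and `name_resolver` are hypothetical, and the cap is set to 2 so the cutoff is visible.

```python
class ResolverCapMiddleware:
    """Stops running resolvers once a per-query cap has been reached."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.resolvers_executed = 0

    def resolve(self, next_resolver, root, info, **args):
        if self.resolvers_executed < self.max_size:
            self.resolvers_executed += 1
            return next_resolver(root, info, **args)
        # Over the cap: skip the resolver, so the field comes back as null
        return None


def name_resolver(root, info, **args):
    # Hypothetical resolver that reads one field from its parent object
    return root["name"]


middleware = ResolverCapMiddleware(max_size=2)
businesses = [{"name": "Yelp"}, {"name": "MoMA"}, {"name": "Garaje"}]
names = [middleware.resolve(name_resolver, biz, None) for biz in businesses]
```

Since the counter lives on the instance, a middleware like this has to be created once per query; otherwise the count carries over between requests.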

91. Easy* failure handling and retries • GraphQL requests can partially succeed!

92. { business(id: "yelp") { name rating reviews { text } hours { ... } } } { "data": { "business": { "name": "Yelp", "rating": 4.9, "reviews": [{ "text": "some review" }], "hours": null } }, "errors": [ { "description": "could not load hours", "error_code": "HOURS_FAILED" } ] } HTTP 200

93. Easy* failure handling and retries • GraphQL requests can partially succeed! • But… that makes some other failure cases trickier

94. { business(id: "123-fake-street") { name rating reviews { text } hours { ... } } } { "data": null, "errors": [ { "description": "business not found", "error_code": "BUSINESS_NOT_FOUND" } ] } HTTP 200
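
Because both of the responses above arrive with HTTP 200, a client has to inspect the body to tell a partial success from a failure. A minimal sketch, assuming the response shape from these slides (`data`, plus `errors` entries carrying `error_code`); the function name and the three status labels are hypothetical.

```python
def classify_graphql_response(payload):
    """Return (status, data, error_codes) for a decoded GraphQL response body."""
    data = payload.get("data")
    errors = payload.get("errors") or []
    codes = [error.get("error_code") for error in errors]
    if data is None:
        status = "failed"   # nothing usable came back
    elif errors:
        status = "partial"  # some fields resolved, others are null
    else:
        status = "ok"
    return status, data, codes


partial = {
    "data": {"business": {"name": "Yelp", "rating": 4.9, "hours": None}},
    "errors": [{"description": "could not load hours", "error_code": "HOURS_FAILED"}],
}
failed = {
    "data": None,
    "errors": [{"description": "business not found", "error_code": "BUSINESS_NOT_FOUND"}],
}

partial_status, partial_data, partial_codes = classify_graphql_response(partial)
failed_status, failed_data, failed_codes = classify_graphql_response(failed)
```

A caller could then choose to retry only the fields behind the reported error codes in the partial case, and treat the failed case like a conventional error response.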

95. def talk(): return end

96. Building UI Consistent Android Apps Saturday - 11.30 in Room 6 Nicola Corti

97. { questions? { answers } }
