Using Cypher With GraphQL
Adding Custom Logic To Your API With The Neo4j GraphQL Library
One of the most powerful features of the Neo4j GraphQL integrations is the @cypher schema directive, which allows users to add custom logic defined in Cypher to a GraphQL schema. This talk will demonstrate how to take advantage of this feature when designing your GraphQL API, avoid common pitfalls with using Cypher in GraphQL, and explore methods for supercharging your GraphQL schema using Cypher. These best practices will help you take advantage of Graph Data Science and APOC in your GraphQL API.
Links And Resources
- Watch more talks from NODES 2021
- Slides
- Neo4j GraphQL Library Docs
- Neo4j GraphQL Library overview
- Graph Academy Training: Building GraphQL APIs With The Neo4j GraphQL Library
- GRANDstack Starter Project
Hey everyone, thanks for joining us today. I hope the conference is going well for everyone; I know there's a lot of interesting talks and content out there. In this talk we're going to focus on adding custom logic to our GraphQL API using Cypher, so we're going to take a deep dive into using Cypher with the Neo4j GraphQL Library.
My name is Will. I work on the developer relations team at Neo4j. I publish a blog and a newsletter, linked there at lyonwj.com, and that's also my Twitter handle, lyonwj, which is probably the best way to keep up with what I'm working on these days. I also co-host the GraphStuff.FM podcast with my colleague Lju, so if you like the podcast format and you're interested in graphs, definitely check that out.
There are a handful of talks at NODES today focused on GraphQL. Earlier today we heard from Darrell, giving us an overview of the Neo4j GraphQL Library. Then Dan gave us a deep dive on adding authorization to our GraphQL API. And now we're hearing about using Cypher with GraphQL for custom logic. Immediately after this session, Areef [sp?], in the apps and APIs track (so not this track; you have to jump over to a different track immediately after my talk if you want to hear this one), is going to be talking about using the brand new Neo4j integration for Hot Chocolate, which is a .NET GraphQL server. So if you're interested in Neo4j and GraphQL in the .NET ecosystem, definitely check that talk out. And of course all of these are recorded, so if you miss them you can catch the recordings as well.
Great. In this talk, over the next 30 minutes or so, what I want to do is go through adding custom logic features to a GraphQL API for a specific application, and that is a news graph. I've pulled in some data from the New York Times API and loaded it into Neo4j already, so we have a Neo4j database with things like articles, articles that have topics, articles mention
people, we know the author of the article, and we know what geographical region these articles are referring to. So we have this data in Neo4j, and what I want to do is go through our list of business requirements to see how we can add logic to our GraphQL API to expose these features for our client application. Things like: show me the most recent articles; let me search by search term; how are comments going to work, how can I create comments, and what about authorization around that; and how can I show personalized recommendations, so either based on my viewing history or based on articles I'm looking at, what are other articles that I might be interested in? We're going to build out a GraphQL API using the Neo4j GraphQL Library for this data set and look at how we add these features to match our requirements.
Before we get into that, let's talk a little bit about comparing and contrasting Cypher and GraphQL. Cypher is very much a graph database query language. With Cypher we have declarative pattern matching, where we use ASCII-art notation to define the graph patterns that we want to work with. Then, just like any other database query language, we have functions for things like aggregations, math functions, database operations like creating indexes, and data import like pulling in CSV files. But because it's a graph database query language, we have graph-specific operations as well, like the variable-length path operator and functions for working with nodes, relationships, and paths.
Compare that to GraphQL, which is not a database query language. Rather, it's a query language for APIs, but also a runtime for fulfilling these requests. With GraphQL we have a type system that clearly defines the data that's available to the client of the API and how that data is connected. This is the data graph; that's where the "graph" in GraphQL comes from. So while GraphQL is modeling our
application data as a graph, we can really use any backend system to resolve data for our GraphQL API. It's not specific to graph databases. We can even federate data from multiple sources, pulling from multiple databases, other APIs, these sorts of things. The way that we traverse this data graph in GraphQL is by creating a nested structure called a selection set, where we specify at query time how we want to traverse the data graph and exactly which fields we want returned.
Let's look at some examples specific to the news data set that we're working with. One of the requirements was: show me all of the articles that are available. In Cypher we might do something like this. A MATCH clause with the graph pattern that we're looking for is a common way to use Cypher, and here the parentheses around Article represent a node; that's a node pattern. We're saying find all nodes with the label Article and return those, and we get back a bunch of nodes. In GraphQL it would look something like this. The entry point for our GraphQL API is a field on a special type called the Query type, and then we specify in our selection set how we want to traverse the data graph and what fields we want to bring back. In this case we're just bringing back the title and abstract fields on the article.
Now, instead of just giving me all the articles, let's look at just the 10 most recent. In Cypher we have ordering and pagination, so here we're saying order by the publish date, give me just the most recent 10 articles, and that's what we get back. In GraphQL, sorting and this skip/limit pagination is not really built in; it's not part of the GraphQL specification. It's up to the implementer of the GraphQL API to choose how they want to handle that, but in general we can handle this idea of sorting and limiting results with field arguments.
Okay, now let's say give me the most recent 10
articles and also their topics. Remember, we're modeling topics as another node connected to the article, because it's useful to be able to traverse from an article to its topics to other articles that are connected to those same topics. To do that in Cypher is a little more complicated, because we first have to find the first 10 articles, ordered by date, and then we have another MATCH clause. In this case it's an OPTIONAL MATCH, because not all articles are connected to topics, but for the articles that are, we want to traverse the graph. So we define this graph pattern, and you can see we're drawing this ASCII-art notation again for the article node, bound to the variable a that we can refer to later, where we're saying follow this outgoing HAS_TOPIC relationship (you can see we're kind of drawing an arrow there) to find the topic nodes. In GraphQL we do this by adding on to our selection set. Our selection set now becomes a nested structure where we've added topics, and for every topic we want to bring back the name field.
Now what if our graph pattern is a little more complex? For every topic we also want to know the other articles that are connected to those topics. To do that in Cypher we just add on to our graph pattern. In addition to following the outgoing HAS_TOPIC relationship, we're also going to follow the incoming HAS_TOPIC relationship to find other articles and return those, and we get a little subgraph here. In GraphQL we just keep adding on to our nested selection set. Under the topics selection we've now added articles, which says traverse from the topics to other articles and return the title of those other articles.
So far we've been able to represent what we're looking for in both Cypher and GraphQL. What about more complex graph operations? Let's say, can we find the shortest path from Vladimir Putin to the topic Extortion and Blackmail? In Cypher it looks
like this. There's a couple of interesting things going on here. One is this concept of the variable-length path operation, and that's right here in the middle: this asterisk and then the ..9 in the brackets where we're defining our relationship pattern. What this is saying is find a pattern that connects the Person node with the name Vladimir Putin to the Topic node Extortion and Blackmail, but follow an arbitrary number of relationships. So that's a variable-length path, and the ..9 means go up to 9 hops deep. Then the shortestPath function will execute a bidirectional breadth-first search to find the shortest path by which these two nodes are connected. It ends up being not a very long path, just through one article about the recent hacking of the Colonial Pipeline, and that's what we get back. Now in GraphQL we don't really have anything built in to represent this concept of the shortest path or variable-length traversal. We could build a GraphQL API that incorporates this functionality, and we would define that on the backend, but it's not something that's built into GraphQL.
Let's look at another example. This is one of our requirements: show me recommended articles. If you're looking at an article, what are other articles you might be interested in? In Cypher we have lots of different ways that we could do this. In this case we're looking for articles that share either an author, a topic, or a geographical region. So if I'm reading an article that's about, I don't know, the San Francisco Bay Area, maybe I'm interested in other articles about that same geographic area. You can see here the article we're looking at initially on the left, and the recommended articles on the right, which connect through either the author node or some of the topic nodes. In GraphQL, again, we don't really have a built-in way to express this. And again, we could add this sort of logic in the backend of our GraphQL API, but it's not something that's built in
to the language. So hopefully that's helpful for thinking about the strengths of GraphQL and Cypher: when we want to use one and when we want to use the other, the things that we get with Cypher that maybe aren't available in GraphQL, and how GraphQL works for querying data from the API client.
Let's think about how we can leverage GraphQL and Cypher together. Let's say now we're ready to build a React application for our news graph, and we have our data in a Neo4j Aura cluster. Well, we don't want to have all the clients querying our database directly. We don't want to expose the database to the world, we don't want arbitrary queries to be executed against our database, and we need to think about how we want to handle application users and application authorization. For that reason we build an API layer that sits between the database and the client. In this case we want to use GraphQL, and maybe we deploy that as an AWS Lambda function. Now our client, our React application, is speaking GraphQL to our GraphQL API, and our GraphQL API is speaking Cypher to our Neo4j Aura cluster.
Okay, so that begs the question: how do we build this GraphQL API? An initial, call it naive, approach might be to just take some Cypher and use the Neo4j driver in the resolver function. Resolver functions are how we actually resolve these GraphQL requests, so we just use the driver to execute the Cypher query and return the results. That works, and in fact that's the approach we took for the activity feeds that you see on the Neo4j community site. This introduces some problems, though. It's fine if we have just a single query that we want to run and expose the data, which is the case in the activity feeds, but oftentimes in an application we have much more complex interactions. This can introduce what's called the N+1 query problem, where we end up making multiple requests to the backend, and we have to think about batching and
caching. A more sophisticated approach might be something like this, which is a demo that Michael Simons from the Neo4j SDN team put together. It uses the Spring Data Neo4j integration along with the Netflix DGS GraphQL implementation to build a GraphQL API, leveraging some of the query generation from Spring Data but also some of the built-in batching and caching functionality in Netflix DGS using the DataLoader pattern.
But now what if we want a way to get started much more quickly, generating the API for us and not really having to think about how to write that data fetching logic? That's where the Neo4j GraphQL Library comes in. I'm going to skip over a lot of the functionality of the Neo4j GraphQL Library, since there were a couple of other talks from Dan and Darrell earlier today, so definitely check those out if you missed them. But basically, with the Neo4j GraphQL Library we take GraphQL type definitions and use them to drive the data model in Neo4j, the database. Here we're pulling in our dependencies, creating some GraphQL type definitions, and creating a connection to Neo4j using the Neo4j driver. We don't have to implement any resolvers. We can just spin up an Apollo Server, and we have all of our CRUD operations generated for us, so we can create data, we can query it, and we can use pagination, filtering, all of these kinds of things.
What's going on at query time is that we pass an arbitrary GraphQL query (GraphQL operation, I should say, since this can include mutations as well) to the GraphQL API that we've built using the Neo4j GraphQL Library, and it has the logic for generating a database query from that arbitrary GraphQL request, using Cypher in this case. We also project out just the fields that are requested in the GraphQL query.
Okay, so we talked about CRUD operations. That then begs the question: how do we add custom logic to our GraphQL API? This is where what I think is one of
the most powerful features in the Neo4j GraphQL Library comes in, and that is the @cypher GraphQL schema directive. Schema directives are GraphQL's built-in extension mechanism that says, hey, some custom logic needs to happen here on the server. This directive is implemented in the Neo4j GraphQL Library and allows us to add Cypher statements to our GraphQL schema. Here we're defining a computed field on the Topic type: we're adding an article count field whose value is mapped to this Cypher statement. We're looking at the number of articles connected to this topic; that's the article count. Now when I include article count in the selection set, the Cypher query that I've attached in the schema runs as a sort of subquery in the single database query that is generated. So we're still able to generate a single database query, addressing that N+1 query problem, but we're doing that now with some custom user-defined logic.
We saw how to do this for a computed scalar field; that's the article count. If we add article count to our selection set, that Cypher query runs and we get back the results. We can also use @cypher directives on node and object fields, or object array fields. Here we're adding that recommendation query we saw earlier, where we're looking for articles with overlapping authors or topics to show similar articles you might be interested in, and we're returning Article nodes as recommendations. So when I add the similar field to my selection set (in this case we're just grabbing the title), we can see, for the articles we're looking at, the similar articles you might be interested in.
Oftentimes it's useful to be able to pass in field arguments in our GraphQL query at query time. In this case we can specify the number of similar articles, the number of recommended articles, to return. Here we're setting a default value of three, and then we
can reference that in our Cypher statement, because field arguments are passed as Cypher parameters to our Cypher statement. So here we're saying, okay, only show me the first two recommended articles, and that's what we get back.
We can also use the @cypher directive on custom query fields. One feature that's really powerful for search in Neo4j is the full-text index functionality. We can create a full-text index and then use Lucene query syntax, kind of like what you would see in Elasticsearch, to do, say, fuzzy matching. Here we're creating a full-text index that we're calling article index on the Article node, bringing in the title and abstract properties. Another nice thing is that I can combine multiple node labels and multiple properties into a single index to search, which is quite nice. Then we're creating a custom query field called article search, and we're querying the full-text index that we created, passing in the search string that's provided at query time. We add this tilde at the end, which is Lucene query syntax for fuzzy matching, so if we have some slight misspelling we'll still return results. If a user is searching for news articles about Montana but maybe they misspell it, we'll still find results, because we're using the fuzzy matching functionality of our full-text index.
Now, in the keynote presentation, I think it was Cameron from Meredith who was talking about using APOC to enable scalability for their use case. APOC is the super powerful standard library that's available with Neo4j and extends Cypher with lots of different functionality. Let's see how we can use APOC in a @cypher directive, because after all, we have access to Cypher procedures through the @cypher directive. In this case we'll add some federated data to our GraphQL API using the Google Knowledge Graph API. The Google Knowledge Graph has an API, and we have people in our news graph, so let's search the Google Knowledge Graph to bring back their biographies, so we have that data
available. Here we're adding a description field on the Person type and attaching a Cypher statement that's doing a couple of things with APOC. First, we've stored our key for the Google Knowledge Graph API in a configuration file, so we don't have to expose that secret. Then we're using apoc.load.json to call out to the Google Knowledge Graph API, searching for results for whatever person we're resolving, and then we grab the detailed description field from the result. So now we have this description field available in our GraphQL API, and alongside our data from Neo4j we get back data from the Google Knowledge Graph. In this article, the one about Amazon buying MGM, we have a couple of people mentioned. Barbara Broccoli: now we have some information about who that is, and some information about who Jeff Bezos is. So that's really useful. There's lots of other things we can do with APOC, and all this functionality is available to us through Cypher.
We also saw in the keynote earlier today the power of machine learning and graph algorithms with Graph Data Science. Again, because this functionality is exposed through Cypher procedures, we can make use of it in our GraphQL API using the @cypher schema directive. Let's use the Jaccard similarity function to improve our article recommendation query. Jaccard similarity is a set comparison operation where we get a score that shows us how similar two sets are, and in this case the sets are going to be the topics connected to articles. We'll change that similar Cypher query on the Article type to first find all the topics for the article that we're resolving, and then use the GDS Jaccard similarity function to find the other Article nodes that are most similar according to Jaccard and return those. Now when we look at similar articles, this first one here is a story about a children's author who died, and the similar articles, we can see, are about
libraries and books, so that sounds pretty good. The other article is, again, that Amazon buying MGM article, and we get back other articles about movies, so that looks pretty good as well.
Okay, so we've talked about the world of the possible: what can we do with the @cypher directive for adding custom logic? Let's talk a little bit about things that we should do and maybe things that we shouldn't do, so best practices and takeaways for using the @cypher directive. There's a handful of observations that I'll go through here, so let's jump into those.
The first is regarding schema design and the distinction between the @relationship directive and the @cypher directive. We know that topics are connected to articles, so we could add a topics field on the Article type that looks like this, using a Cypher query to go out and find the connected topics. We could do that, and it would work, but we shouldn't. Instead, what we should do is use the @relationship directive to define that relationship. The reason is that this allows the Cypher generation process in the Neo4j GraphQL Library to express those graph traversals directly and allows the Cypher execution engine to optimize those queries. Traversing from one node to another is something a graph database is really, really good at, and if we abstract that away into a Cypher query that runs as a subquery, we're not going to get that same performance.
Another takeaway here is to leverage the auto-generated filters. This is a really powerful feature, and there are a lot of filters exposed. Here's an example: we've written a custom query field, geo search, that takes a latitude and longitude and then goes looking for articles connected to a geographic region within one kilometer. We could do this, but we don't need to. Instead, just by defining a Point type in our schema, so we have a
location on the Geo node that is a Point type, just by adding that in our schema and saying, hey, this is a Point type in the database, we have this distance filter available to us. So we can actually do this as part of the generated logic without having to write any @cypher directives.
The other point I want to make here is that we can also auto-generate data using some directives in our schema, rather than writing @cypher directives. Here's an example: we've created a custom mutation field for creating comments that takes a user ID, comment text, and an article ID, and then we've defined in Cypher how to create that comment and connect the comment node to the user and the article. We're leveraging the timestamp function and the randomUUID function, because when we create this data we don't want the clients to have to pass a new ID for the comment or the current time; we want that to be generated on the server. We can do that in Cypher, but we don't need to. Instead, we can add the @id directive on the comment's id field, which says, okay, when this is created, I want to auto-generate a UUID for the comment. Similarly, for the created field, which is a DateTime, I give that the @timestamp directive, and that means I want to auto-generate a timestamp when this is created or updated. I can configure whether I want it to be set only when it's created or also when it's updated.
The next point has to do with nested mutations. If we look at this custom mutation field that we created, we are using some custom Cypher that we wrote to create the comment node and then create two relationships, so this is creating a little piece of a subgraph. We talked about using the timestamp and randomUUID functions, but we don't even need to create that mutation field, because we can leverage the nested mutation functionality that is generated for us from the type definitions. Here we're using the create comments
mutation, which takes an input text, so the text for the comment. Then, with the nested mutation functionality, for the author we can specify whether we want to connect to an existing author node and look that up by ID, or whether we want to create the author node. Here we're just connecting, and doing something similar for the article. So in this case we don't even need to define that custom mutation; we have that available to us as part of the auto-generated API. And here you can see the auto-generated UUID and the timestamp that was generated, because we used the @id and @timestamp directives.
The final example I have has to do with authorization. Earlier, Dan gave us a deep dive into the authorization functionality available in the Neo4j GraphQL Library, and we can leverage some of that in a @cypher directive field. The Neo4j GraphQL Library uses JSON Web Tokens (JWT) for authentication and authorization. The token is verified, and then the payload of that token is passed in to our Cypher query under this auth Cypher parameter. So here we're saying, okay, only show comments that the currently authenticated user wrote. This new query field, my comments, is going to show me only my comments. I can do that, and again, this works, but I don't need to. Instead, if I wanted to restrict the comments to only those authored by the currently authenticated user, I could add an @auth directive with a where rule that says filter where the username matches jwt.sub. That's the subject of the token, the currently authenticated user making this request.
So I guess the takeaway there is, one, make sure that when you're using the @cypher directive, it's not for something that can be specified using @relationship directives as you're defining your schema. And then, make sure that the feature you're implementing with a custom Cypher field isn't already available as part of the generated GraphQL API, because there are a lot of powerful things there.
So that is all I
have to talk about. I will leave you with a few resources. These slides are available on dev.neo4j.com (the Cypher GraphQL short link). Also, the Neo4j GraphQL Library docs and overview page have a lot more information. There's a full-day GraphAcademy training that goes through a lot of this functionality in an interactive way, and you get a certificate when you're done, so that might be something to check out as well. And then we didn't really talk about what you do with this GraphQL API once you've created it: how do you integrate it with your full-stack application? This is where GRANDstack (GraphQL, React, Apollo, and Neo4j Database) comes in. There's a GRANDstack starter project that you can clone from GitHub, or you can run the npx create-grandstack-app command, which will also provision a new API for you.
Cool. Thanks so much for joining us today, and again, if you have any questions, be sure to drop those in the chat and we'll get to those in the Q&A panel. Thanks, cheers.
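The slides are not reproduced in this transcript, so as a rough reference, the side-by-side queries from the Cypher and GraphQL comparison early in the talk might look something like the sketch below. The `published` and `name` property names, and the generated `articles` field with its `options` argument, are assumptions based on the spoken description and on how the Neo4j GraphQL Library exposed sorting and limits around the time of the talk; the real slides may differ.

```javascript
// Cypher: the 10 most recent articles plus any topics they have.
// The `published` property name is an assumption, not taken from the talk.
const recentArticlesCypher = `
MATCH (a:Article)
WITH a ORDER BY a.published DESC LIMIT 10
OPTIONAL MATCH (a)-[:HAS_TOPIC]->(t:Topic)
RETURN a, collect(t) AS topics
`;

// GraphQL: the same request expressed as a nested selection set.
// Sorting and limiting are field arguments chosen by the API implementer;
// the `options` shape here is assumed from the library's generated API.
const recentArticlesGraphQL = `
{
  articles(options: { sort: [{ published: DESC }], limit: 10 }) {
    title
    abstract
    topics {
      name
    }
  }
}
`;

// Cypher: shortest path using a variable-length pattern of up to 9 hops.
// GraphQL has no built-in equivalent for this kind of traversal.
const shortestPathCypher = `
MATCH p = shortestPath(
  (:Person {name: "Vladimir Putin"})-[*..9]-(:Topic {name: "Extortion and Blackmail"})
)
RETURN p
`;

console.log(recentArticlesCypher, recentArticlesGraphQL, shortestPathCypher);
```

The first two strings ask for the same data; the third has no direct GraphQL counterpart, which is the gap the @cypher directive is meant to fill.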
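The fuzzy full-text search described in the talk might be wired up along the following lines. The index and field names (`articleIndex`, `articleSearch`, `searchString`) are guesses at the identifiers behind the spoken "article index" and "article search", and the index is created with the procedure syntax from Neo4j 4.x, which was current at the time; later Neo4j versions use a `CREATE FULLTEXT INDEX` statement instead.

```javascript
// One-time setup: a full-text index over Article.title and Article.abstract.
const createArticleIndexCypher = `
CALL db.index.fulltext.createNodeIndex("articleIndex", ["Article"], ["title", "abstract"])
`;

// Custom Query field backed by the index. Appending "~" to the search
// string asks Lucene for fuzzy matching, so small misspellings still match.
const searchTypeDefs = `
  type Article {
    title: String!
    abstract: String
  }

  type Query {
    articleSearch(searchString: String!): [Article]
      @cypher(
        statement: """
        CALL db.index.fulltext.queryNodes("articleIndex", $searchString + "~")
        YIELD node
        RETURN node
        """
      )
  }
`;

// Example client query with a deliberately misspelled search term.
const searchQuery = `
{
  articleSearch(searchString: "Montanna") {
    title
    abstract
  }
}
`;

console.log(createArticleIndexCypher, searchTypeDefs, searchQuery);
```

Because the tilde is appended inside the Cypher statement, clients pass a plain search term and never need to know Lucene syntax.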
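The last two best practices, generating IDs and timestamps with directives and restricting results with an authorization rule instead of custom Cypher, could look roughly like this. It is a sketch under assumptions: the `WROTE` and `HAS_COMMENT` relationship types, the `username` field, and the exact shape of the `where` rule are illustrative, and the `@auth` directive shown is the one from the library generation discussed in the talk (later major versions replaced it with a different authorization directive).

```javascript
// What the talk says works but is not needed: a custom query field that
// reads the verified JWT payload from the "auth" Cypher parameter.
const customCypherTypeDefs = `
  type Query {
    myComments: [Comment]
      @cypher(
        statement: """
        MATCH (u:User {username: $auth.jwt.sub})-[:WROTE]->(c:Comment)
        RETURN c
        """
      )
  }
`;

// The directive-based alternative: no custom Cypher at all.
const commentTypeDefs = `
  type User {
    username: String!
    comments: [Comment!]! @relationship(type: "WROTE", direction: OUT)
  }

  type Article {
    title: String!
    comments: [Comment!]! @relationship(type: "HAS_COMMENT", direction: OUT)
  }

  type Comment {
    id: ID! @id                                           # server-generated UUID
    text: String!
    created: DateTime! @timestamp(operations: [CREATE])   # set on create only
    author: User! @relationship(type: "WROTE", direction: IN)
    article: Article! @relationship(type: "HAS_COMMENT", direction: IN)
  }

  # Only return comments written by the currently authenticated user.
  extend type Comment
    @auth(rules: [{ operations: [READ], where: { author: { username: "$jwt.sub" } } }])
`;

console.log(customCypherTypeDefs, commentTypeDefs);
```

With the second set of definitions, the generated create mutation for comments can connect the author and article through nested inputs, so the client supplies only the comment text and the nodes to connect to.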