Geospatial Data With GraphQL
Building A Real Estate Search App With GRANDstack: Part 3
Will continues his series on building a real estate search app with GRANDstack. In this week's edition he works with geospatial data and introduces the brand-new tool GraphQL Architect.
Links And Resources
Hi folks, welcome to the Neo4j Twitch stream. My name is Will, and in this session we're going to pick up where we left off with building our real estate search application. This is the third session we've spent on this, so first we'll talk a little bit about what we've accomplished so far, and then we'll get into our goals for today. I do want to apologize, though: I have kind of a wonky setup today. I had some issues with my laptop monitor, so unfortunately most of the time my webcam is going to be off to the side, so sorry for looking off to the side on the stream. Maybe we'll get that fixed next time.

Cool, so let's talk a little bit about what we've accomplished so far. The code is up on GitHub in the willow-grandstack repository; let me drop a link in the chat.

In the first session we built out the basic architecture for our application from the GRANDstack starter: a web front-end and a GraphQL API talking to a Neo4j instance, using the Neo4j GraphQL integration. Then we deployed that to Netlify, both the front-end React static application and the GraphQL API as a serverless function, using Netlify Functions, which ultimately runs on AWS Lambda.

In the second session we looked more into the data model and started trying to get our hands on some real data. We drew out the graph data model we were interested in and started to create our GraphQL schema. We talked through the graph data modeling process, where we take the requirements of our application, identify the entities and how they're connected, and start to think about how we can traverse the graph to answer the questions that fit our requirements. And that's where we left off.

So, a couple of things. If we look in api/src/schema.graphql, this is the fairly simple GraphQL schema that we left off with last time. If you remember, we had some data that we had converted from shapefiles containing information about parcels. We imported that into Neo4j, first converting it into GeoJSON, and then used the inferSchema functionality in the Neo4j GraphQL integration to generate the GraphQL schema, so we could search for properties that are at least a certain number of acres, and so on. We also looked at adding a GraphQL schema directive to add some custom logic to our GraphQL API: a very simple estimated sale price, where we just take the tax-assessed value and add 20%.

If we jump over to the data directory, this is mostly what we worked with last time. We downloaded all of the parcel data for a specific county from the State of Montana websites; that came as a shapefile. We then converted it into GeoJSON and started to import that data into Neo4j using the apoc.load.json procedure. APOC is the great standard library for Neo4j that adds a lot of additional functionality to Cypher, and among many other things it allows us to import data in JSON format.
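Backing up to the schema for a second: that estimated sale price customization from last session is a @cypher schema directive, roughly like the sketch below. The field and property names here (totalValue, estimatedSalePrice) are illustrative; the exact names in the project's schema.graphql may differ.

```graphql
type Property {
  id: ID!
  totalValue: Float
  # Very rough estimate: tax-assessed value plus 20%
  estimatedSalePrice: Float
    @cypher(statement: "RETURN this.totalValue * 1.2")
}
```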
Cool, so that's where we left off. Just as a reminder, what we're going for here, initially, is building out a base layer. This is the Zillow website, which is an example of the kind of real estate search application we're trying to build. We're starting from the position that we already have all of this parcel information: we know things about every property, from the square footage of the lot, to the latitude and longitude and the polygon bounds of the lot, to information about property taxes, and so on. So we're building out this base layer before we start adding listings on top of it. That's what we're working on.

Okay, great. With that, let's jump over to Neo4j Desktop and open up Neo4j Browser. If you missed the last few sessions you can of course watch the videos on Twitch, but if you just clone the GitHub project and follow along with the steps in the data import, that'll bring you up to speed pretty quickly.

Okay, here's Neo4j Browser. Let's take a look at our data model. Looks like I accidentally added some other data; let's go ahead and delete that: MATCH (n) DETACH DELETE n. That will match every node in the database and then delete it. Somehow I'd accidentally added some data on businesses and users and reviews, which we don't want, so sorry about that. Let's make sure we deleted everything. Great.

Okay, so what we were working with last time was something like this: CALL apoc.load.json on the Gallatin GeoJSON file, YIELD value, and then UNWIND over value.features. Let's take a look at our data here, Gallatin.geojson. This is a JSON file; it's called a GeoJSON file because it follows a particular schema and, as you can see here, has some geospatial data in it as well. First, in this JSON object, we have some metadata, and then we have this big features array. For each one of these feature objects I want to create a Property node: a node with the label Property in the database.

So, for every feature in value.features, we're going to do something. What we want to do is create the Property, setting its id to feature.id, and then set all of its properties equal to the feature's properties, which is an object: this properties object here, basically all of the information we have about each parcel except for the geometry. This should be += I think. Okay, let's see if that runs... right, that needs to be YIELD value.

Okay, so this is where we left off last time in terms of data import. Once this data was imported, we went off and built our GraphQL schema on top of it and took a look at querying with GraphQL. What I want to do now, now that we have some of our data, is take a further look at working with this data in the import process and see if we can make it a bit more graphy.
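For reference, the import statement described above looks roughly like this; the filename and the id field follow what was shown on the stream.

```cypher
// Load the GeoJSON file from Neo4j's import directory and create one
// Property node per feature, copying all feature properties except geometry.
CALL apoc.load.json("file:///Gallatin.geojson") YIELD value
UNWIND value.features AS feature
CREATE (p:Property {id: feature.id})
SET p += feature.properties
```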
So far we just have these Property nodes, which is not too exciting, so let's see if we can extract out more of our data model. Based on the data that we have so far, we should at least be able to pull out City nodes, so we can tell which city each property is in. We only pulled down one county, but we should have a handful of cities in that county.

Okay, before we do that, I want to make a few changes to our database. Let's look at a couple of these properties. Do we still have businesses and such in there? Oh, I have some Business, Category, and Review constraints in the database. Actually, that's a bit annoying, so first I'm going to drop those constraints that I had created accidentally: the one asserting Business id is unique, the one on Review id, the one on Category name, and the one on User. Okay, so now if we refresh the schema we have no indexes and no constraints. So what was going on there? I had accidentally created some constraints while working with those other data sets I'd accidentally inserted here, so I just removed them.

If we take a look now at our data model, we have some properties. Let's do a :style reset. Sometimes you'll notice that with the styles configured in Browser, the captions don't display, and doing a :style reset removes any of the configured styles. Maybe I'll use a style that shows the owner name or something like that as the caption; that gives a bit more information about each property.

Okay, so we have these Property nodes, and this is our data model so far. All of these properties have an id property, which came from the shapefile as the parcel ID. What I want to do is start to think about some constraints, a bit of a schema, that I should create for the database. Now, Neo4j is schema-optional, as we like to say, in that it doesn't require us to create a schema for the database up front, but we can create constraints. You saw we had some uniqueness constraints in the database. A uniqueness constraint ensures that, say, if I have an id property, I won't be able to create duplicate nodes in the database, meaning another node with the same value for that id property. It also gives us an index, which is useful because it allows us to do really fast lookups by that property.

So let's go ahead and create a uniqueness constraint on the Property node label, asserting that p.id is unique. Now, I might get an error when I run this, and I do. It tells me, "Okay, there's already some data in the database that is inconsistent with this constraint you're trying to impose." So I have two nodes that have the same value for the id. So this is a good example.
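Here's roughly the constraint statement being run, which is what surfaced the duplicate. The exact syntax depends on your Neo4j version; this is the 3.x/4.0-era form used on the stream, while newer releases use CREATE CONSTRAINT ... FOR (p:Property) REQUIRE p.id IS UNIQUE.

```cypher
// Uniqueness constraint on Property.id (also creates a backing index)
CREATE CONSTRAINT ON (p:Property) ASSERT p.id IS UNIQUE
```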
You see this all the time when you're working with real-world data: it's maybe not clean, and maybe not super consistent, right? I wouldn't really expect to have duplicate parcel entries in my GeoJSON file, but apparently I do. So let's delete everything and see how we can avoid this problem: MATCH (n) DETACH DELETE n, to remove all those Property nodes we just created.

If we look at the query we were running, we were just iterating through this JSON file and, for every feature, creating a node. Instead, what we want to do is MERGE. Merge is like an upsert, a get-or-create: it will check whether a Property node with this id already exists. If it does not already exist, it will create it; if it does, it will just match on it.

It's worth pointing out that we typically want to have created a constraint in the database for this node label and property before we use MERGE. First of all, that gives us the index, so that this lookup is really fast: for every record, it's very quick to say, "Hey, I'm about to create a node with this id; is there another node with the same value?" That's much faster with an index. But the constraint also imposes, at the database integrity level, that we're not going to end up with duplicate properties by id value. This does take a little bit longer than simply going through and creating a new node for each record; there's a little bit of overhead, because we do have to do that check to make sure the node doesn't already exist.

Okay, once this completes, the next thing I want to do is look at how we can extract out the City nodes. If we look at the data we have, and again this is real-world data, so it's not all consistent and clean, we do have a city/state/zip field for some of these, as part of an address. So we should be able to strip off the city and build that into the graph.

Okay, this is still running. I wonder why that is... oh, did I forget to create my constraint? Let's cancel this. Yes, I forgot. Okay, now I have my constraint, and let's run this again. You noticed that it was taking quite a while without our constraint index, and the reason is that each time we go to do the insert, it has to scan all records. But now, with the index, this should be much faster.

So now, once this is done importing, what I want to do is, for each of these properties, strip off the city piece, create a City node, and then connect that to our Property node to start building out our graph model. Okay, this time it took about 19 seconds; that's a bit faster.

Great, so let's see. For each property where the city/state/zip field exists, let's return that, and just look at the first 10 of these. Okay, so what I want to do is split things off at the comma here.
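Before moving on to the city split, here's a minimal sketch of the MERGE-based version of the import described above, which makes the load idempotent. Same filename and id field as before.

```cypher
CALL apoc.load.json("file:///Gallatin.geojson") YIELD value
UNWIND value.features AS feature
// MERGE instead of CREATE: get-or-create on the id, backed by the
// uniqueness constraint and index on :Property(id)
MERGE (p:Property {id: feature.id})
SET p += feature.properties
```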
We have some common string functions in Cypher, so let's do a split. split takes a string and a delimiter and breaks the string up into an array, and we want just the first element of that array. Okay, great, so now for each property we can see the city.

Let's look at a few more, say the first 500. As I scroll through these, it's a good idea in general to check for consistency, and I don't want to assume that all of these are going to be uppercase. So, for example, if I take WITH that split-out city and return the distinct city values, all the different values for city that we find, what we'll see, again because this is real-world data, is that we have some cities in lowercase and then the same name again in uppercase. So what I want to do, when I create those City nodes, is make sure I case them all the same.

But before I start creating cities, let's create another database constraint: on City, assert that c.name is unique. We can verify that it's online; here's our constraint on City. Let me close some of these frames since we don't need them.

So now I have the city, and what I want to do is MERGE on the City where the name is equal to the city value from my property. That will create a new City node if it doesn't exist; if that City node already exists, we'll just match on it. And then I want another MERGE, creating the relationship, if it doesn't exist, saying that the property is in the city.

So: match every property where we have the city/state/zip field, split out the value of the city, merge on the City node, and then merge on the relationship. And I want to be careful with this WITH clause I have here. WITH allows me to do aggregations or other intermediate operations, but anything that I want to bring through and pipe to the rest of my query, to the rest of the lines in my Cypher statement really, I have to specify. I want to bring through this property node p because I reference it later, so I need to be sure to include it in my WITH clause as well.

Okay, let's see what happens. This should now be iterating over each Property node and connecting it to a City if we have one. While this is running: if we look at the other data that we have, I think next time we'll have to scrape some more data, maybe from property record cards or something like that, because we don't really have a whole lot more we can add to our data model right now. We don't really have information about, say, whether there's a house on a given parcel; this seems to be strictly parcel-level data.

Okay, so we've created 15 City nodes and 33,000 relationships. So of our roughly 50,000 Property nodes, only about 33,000 seem to have the city/state/zip value. Again, real-world data is fun to work with. And we can see right away that I forgot to do the casing fix I said I was going to do: we have Bozeman in uppercase and in lowercase.
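Putting those pieces together, the city extraction described above looks roughly like this, before the casing fix. The cityStateZip property key and the IN_CITY relationship type are my guesses at the names used on the stream, and exists() is the 3.x/4.x-era predicate (newer versions prefer IS NOT NULL).

```cypher
MATCH (p:Property)
WHERE exists(p.cityStateZip)
// Take everything before the first comma as the city name,
// and keep p in scope for the MERGEs below
WITH p, split(p.cityStateZip, ",")[0] AS city
MERGE (c:City {name: city})
MERGE (p)-[:IN_CITY]->(c)
```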
Let's fix this. I'm going to match every City node and DETACH DELETE them, which removes the City nodes and their relationships. And then I want a toUpper in here: for all of those cities that are lowercase, let's convert them to uppercase before we merge on them.

Okay, so now that we have city data, we know which city each property is in. This is useful because that was a main piece of our functionality: we want to search for properties in a certain city. That's the common, initial way that we interact with a search application like this. But if you remember when we first looked at this, there was also some aggregate data we were interested in. I can't remember how we found that... let's search "Zillow San Mateo housing market". Yeah, here we go: this Home Values section. So in addition to showing individual listings for sale, we also want some of this more economic, aggregate-type data to help us evaluate the state of the housing market: things like the average home price over time, or the average home price within a city. We want to work with that sort of data as well, so let's see how we would do something like that.

Now, if we do CALL db.schema.visualization, our data looks like this: every Property is in a City. If we select a city like Willow Creek, we'll see the properties in Willow Creek, or some of them. Maybe we want to do an aggregation across all of these cities to tell us, I don't know, the average home value in each one. That might be something interesting to look at for that sort of economic-overview data.

So: for every property that's in a city, let's grab the city name and calculate the average of the total value. Each property has a total value, which is the tax-assessed value; that's not necessarily the market rate, but it's fairly close. And then order by average value, descending. This should tell us, for each of our cities, the average value of all the properties in that city, ordered by the most expensive. For the one county that we pulled down, it looks like West Yellowstone is the most expensive, with an average value of $3 million, followed by Big Sky with an average value of around 800,000, and so on.

And you can see our data still isn't perfect. We have Gallatin Gateway abbreviated as "Gallatin GWY", and up here we have Gallatin Gateway spelled out; here we have West Yellowstone and "W Yellowstone". So we have a bit of work ahead of us if we want to normalize this data a bit better; maybe going by zip code or something would be more accurate. But anyway, that's a first stab at it, and I think it's pretty good.
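In Cypher, the per-city aggregation described above looks something like this. totalValue is my stand-in for whatever the tax-assessed-value property from the shapefile is actually called, and IN_CITY is the assumed relationship type from earlier.

```cypher
MATCH (p:Property)-[:IN_CITY]->(c:City)
RETURN c.name AS city, avg(p.totalValue) AS averageValue
ORDER BY averageValue DESC
```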
Okay, so let's now see how we can work with this piece of the data, which is the geometry. In this case it's a polygon: a list of latitude/longitude coordinates that defines the polygon of each individual parcel. If we look on the map, that's these polygons, right? For each record we have the points that make up that polygon, and what I want to do is import that into Neo4j.

If we look back at our model, we said we kind of need to store this data in two ways. One is just a single point for the location; that should probably be the centroid of the parcel, what's right in the middle, which is probably what makes the most sense. And then the polygon itself we're going to store as an array of point objects. Cool, let's see how to do that.

Before we get into that: if we search this file for "MultiPolygon" in Gallatin.geojson, we see that in addition to Polygon geometries we also have a bunch of MultiPolygon geometries. This is the case I talked about last time, where we have maybe disjoint polygons that make up a parcel, or a polygon within a polygon. What I want to do is exclude these for now. Let's see how many there are: about 400 out of 50,000. So for now we're just going to exclude any of these MultiPolygon parcels. Well, the parcels are already in there; we just won't be adding the geometry to them, to simplify things for us.

And to do that, we're going to use the command-line tool jq. If you haven't used jq, it's a great command-line tool for working with JSON data; I'll paste a link to the docs here. It allows us to do things like search, filter, and transform JSON files, which is really quite nice. What I'm going to do here is: for the features array, select only the entries where the geometry.type value is "Polygon", return that object for every entry in the features array in Gallatin.geojson, and then write that to a new file called filtered.geojson. So this is going to create a new GeoJSON file, which we'll then use to add these polygon geometries.

To grab this file once it's been created, let's jump back into Neo4j Desktop, go to Manage, and open a terminal. That lets me really quickly copy the file right in. Let's make sure it's done writing... yep, it's done.

Great, so now, back in Neo4j Browser, if I do something like CALL apoc.load.json, now on filtered.geojson, YIELD value, and just look at the first 10 of these... oops: "filtered, no such file". Did I not put it in there? Let's see... I copied it into the wrong directory; I wanted to copy it into import. By default, the import directory is the root path for apoc.load.json.

Okay, cool, so now we have just the Polygon features, and value takes on, in this context, each of those feature objects. So what I want to do is match on each of our properties and then set the polygon equal to something. And if we think of the data that we have here... let's look at the first hundred lines of the file.
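As an aside, the jq filter described above would look roughly like this. The exact flags are my guess at what was run; writing one compact feature object per line is an assumption about how apoc.load.json is consuming the file here.

```bash
# Keep only features whose geometry is a simple Polygon, dropping the
# ~400 MultiPolygon parcels, and write them to a new file.
jq -c '.features[] | select(.geometry.type == "Polygon")' \
  Gallatin.geojson > filtered.geojson
```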
So what I have here for the geometry coordinates is this list... a list of lists of coordinates, right? It's this kind of nested structure. Let's just take one of these. In Neo4j, I have the ability to create a point object, so I can say point with latitude equal to this and longitude equal to that. In my case, copying those numbers: minus 111, that's the longitude. This is how I keep longitude and latitude straight: longitude is "long", right, so it goes up and down, and if I see a negative value, that means it should be somewhere in the western hemisphere. So that kind of makes sense, as does our latitude.

Okay, so I can construct point objects this way in Neo4j and store them as properties. But to construct this for each property, matching on the Property where the id is value.id and setting p.polygon to something, what I need to do is iterate over each one of these items in the list, create a point object, and then collect them together into an array and set that as the value. There are probably a few different ways to do it; I chose to go about this using an APOC function called apoc.cypher.runFirstColumn, for running Cypher fragments. If you look at the docs for this, it's a Cypher function that gives us the ability to define a subquery, with sort of lambda-like functionality, returning a single column. In the example in the docs, we're getting all of the labels of the nodes, constructing a match statement using the label of the node and returning the count, and passing each one of those statements into apoc.cypher.runFirstColumn. So what this does is allow us to pass in a Cypher statement and some parameters, and it will evaluate it as a single function call within our larger Cypher statement.

So what we can do is something like this, where what we set the property value to is apoc.cypher.runFirstColumnSingle. Inside it, we want to UNWIND a list of coordinates that we're going to pass in. So that's our latitude-longitude pair... I'm sorry, that's not a longitude-latitude pair, that's a list of longitude-latitude pairs. What we want to return is collect, so collect is going to give us a list, and on each one of these we cast it to a point, where latitude is the second value in the pair and longitude is the first. And then the parameters that we're passing in: we're going to define the value for coords, which is going to be value.geometry.coordinates. And remember, this was kind of a weird list of lists of lists, so we want to grab just the first element of that list.

Okay, so before we run this, let's make sure we have it right: call apoc.load.json on the filtered GeoJSON file, yield value, match the Property by id, set the polygon property equal to apoc.cypher.runFirstColumnSingle, where the Cypher statement is "unwind coords, return collect, cast to a point", latitude is the second value, longitude is the first, passing in value.geometry.coordinates. Cool, that looks good to me. Let's see what happens when we run it.

Okay, so that took about 15 seconds. Let's take a look at some Property nodes at random and see if we have polygon boundaries. Yep, we do. This is some land owned by LB Cattle Co, so some ranch or something, and we now have the polygon that defines the boundary of this parcel. Cool, so that's neat.

We also said that for every property we wanted to set a location property that's just a single point, so let's go ahead and do that. Ideally, this should be the centroid of the polygon; I'm going to cheat a little bit for now and just set that to, I don't know, maybe the first point in the list. So for every property that has a polygon, let's set p.location equal to the first entry in the polygon list. Oh, I made a typo there: it's MATCH on Property WHERE p.polygon exists. Okay, great. Now we've got the polygon, and we've got the location property as a single point.

What can we do with that? Well, one thing that's really neat is the ability to search within a polygon. This example doesn't make a whole lot of sense at this zoom level, but if we were looking at a map of, say, Montana, where some of these parcels might be a lot larger, then when we click within the boundaries of one of these parcels we need to know: okay, what property are we clicking on? What house are we clicking on, right? So for the latitude and longitude value that you've clicked on the map, we need to do a within-polygon search to find which node has a polygon that contains the latitude and longitude where you clicked.

There is a procedure library available for Neo4j called spatial-algorithms that adds some of this functionality for working with polygon geometries. We can do point-in-polygon, we can do distance, we can do the intersection of two polygon geometries. So I'm going to install this real quick by going to the releases tab and copying the link address for the jar file. Since I still have the terminal open, let's go into the plugins directory and bring that jar file in, and we'll need to restart our database to load these procedures. I'll check the logs while this starts up; that's just some errors from when I was making mistakes with those constraints earlier. Okay, it looks like we're back online.

So now if we do CALL dbms.procedures (not db.procedures, there we go), this will tell us all the procedures we have installed. If we scroll down, or filter for these, we see we have some that start with the spatial namespace, which add the spatial functionality. We can also look at functions, and we have things like spatial.algo.intersection, spatial.algo.distance, and so on.
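Spelled out, the statement described above looks roughly like this; the id field and property names follow the stream, so treat it as a sketch rather than the exact statement that was run.

```cypher
CALL apoc.load.json("file:///filtered.geojson") YIELD value
MATCH (p:Property {id: value.id})
// Build a list of point objects from the [longitude, latitude] pairs in the
// first (outer) ring of the polygon's coordinates
SET p.polygon = apoc.cypher.runFirstColumnSingle(
  "UNWIND $coords AS coord
   RETURN collect(point({latitude: coord[1], longitude: coord[0]}))",
  {coords: value.geometry.coordinates[0]}
)
```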
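For reference, setting that stand-in location property from the polygon is the one-liner described above; as noted, it uses the first vertex rather than a true centroid.

```cypher
MATCH (p:Property)
WHERE exists(p.polygon)
// Cheat: use the first polygon vertex as the property's location point
SET p.location = p.polygon[0]
```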
So what these spatial functions allow us to do now is something like this: match a Property, and in the WHERE clause call one of these spatial functions, spatial.algo.withinPolygon, passing in a point with a latitude and longitude, and return p. We still need some values for latitude and longitude here, but what this does, if we pass in a single point, is tell us: okay, are there any nodes with the label Property... oh, and the other piece we need is to specify which property on the node is our polygon. Polygons, in this context, are lists of point objects.

So let's see if we can find one. Let's look at the map. The county we downloaded is Gallatin County, which includes Bozeman. What else is there... Montana State University is in Bozeman, so let's get the latitude and longitude for Montana State University. Here's a neat trick that I learned: if we right-click on something in Google Maps and choose "What's here?", we can get the latitude and longitude, which we can then copy. Okay, this is my longitude, and this is my latitude.

Hmm: "Failed to invoke... null pointer exception" on polygon. We have to add WHERE exists(p.polygon), something like that maybe... there we go. Okay, so it looks like the null pointer exception happened because it was searching over some nodes that didn't have the polygon property. Okay, cool, and so we found the Property node that, in this case, represents Montana State University, just by searching on the coordinates from Google Maps.

Okay, cool. So now that we have this, what I want to do really quickly, since we have seven minutes left, is show how we can use this in our GraphQL API to do things like searching by location, and also how we can include this polygon geometry, using the spatial-algorithms functions we just added. So I'm going to switch over to Neo4j Desktop, and I'm going to open this graph app called GraphQL Architect.

GraphQL Architect is a fairly new graph app. I showed an initial version of it, I think, a few sessions ago. GraphQL Architect allows us to do things like infer a GraphQL schema from our existing database. Looks like it loaded a previously saved schema; I'm going to clear that and try inferSchema again. There we go: now it's got our Property and our City. We can start a local GraphQL server and then switch to the GraphQL view, where we have entry points for City and Property. So we can do things like, say, for each city, find properties and give me the name and the city/state/zip (we have a couple of different fields for the address), and let's just look at the first 10 for each city. This allows me to pretty easily get up and running with GraphQL on top of Neo4j. And by the way, you can install GraphQL Architect from the graphql-architect GitHub repo in the GRANDstack organization.
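Here's roughly the Browser point-in-polygon lookup described above. The coordinates are illustrative values near Montana State University in Bozeman rather than the exact ones copied from Google Maps on the stream.

```cypher
MATCH (p:Property)
WHERE exists(p.polygon)
  AND spatial.algo.withinPolygon(
        point({latitude: 45.667, longitude: -111.05}),
        p.polygon)
RETURN p
```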
So, to install GraphQL Architect in Neo4j Desktop, either open this link in your browser, which will deep-link into Neo4j Desktop and load it, or just copy this URL and paste it into the install box. Let me drop a link to this in the chat. So we have that.

Okay, so I can also do things like... remember, for City we had that average value calculation we were doing. We can say averageValue is going to be a Float, and we can bind a Cypher query to it: for every property in that city, just return the average of the property's total value. Let's save this and restart our GraphQL API. Here in my schema: averageValue, Float... oh, I misspelled "statement". Cool, so now we've added averageValue. So now, for each city, we can bring in the average home value, let's also return the name of the city, and we can do things like order by average home value, descending.

We can also add custom Query fields, so custom root-level entry points. If we wanted something like propertyInPolygon, we can bind it to a Cypher query that uses our spatial.algo.withinPolygon, where we pass in values for latitude and longitude and then call that function to return any properties whose geometry contains the point we're passing in. So just like we did here in Browser, where we found the node that represented Montana State University by passing in a latitude and longitude, we can do that and expose it through GraphQL as well.

Unfortunately, we're out of time and I have a meeting I have to jump off to. So next time we will pick up from here, working with GraphQL Architect to build our GraphQL API and add in this property-in-polygon functionality. I'll pick that up next Thursday at 2:00 p.m. Pacific. And then, I don't know, maybe I'll also look at scraping some property information from property record cards on the state website, or, since we've been neglecting the front-end, the React portion of our application, maybe we'll switch to that.

Okay, cool. Well, thanks for joining today. Hope to see you next time as well. Thanks a lot, cheers.
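For reference, here's a rough sketch of what those schema additions could look like in GraphQL Architect. The field, property, and relationship names are illustrative guesses on top of the inferred schema, while the Cypher inside the @cypher directives follows what was described on the stream.

```graphql
type City {
  name: String!
  properties: [Property] @relation(name: "IN_CITY", direction: IN)
  # Average tax-assessed value of properties in this city
  averageValue: Float
    @cypher(
      statement: "MATCH (p:Property)-[:IN_CITY]->(this) RETURN avg(p.totalValue)"
    )
}

type Query {
  # Custom root-level entry point: properties whose polygon contains a point
  propertyInPolygon(latitude: Float!, longitude: Float!): [Property]
    @cypher(
      statement: """
      MATCH (p:Property)
      WHERE exists(p.polygon)
        AND spatial.algo.withinPolygon(
              point({latitude: $latitude, longitude: $longitude}),
              p.polygon)
      RETURN p
      """
    )
}
```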