Degree Days
Weather Data for Energy Saving
This page is for anyone planning to integrate degree days into their software system. It is moderately technical, but you do not need to be a software developer to read it. It offers a fairly high-level overview of the Degree Days.net API and the various common approaches to integrating with it.
If you already have compatible software that needs API access keys to unlock its degree-day-based functionality, please sign up for an API account here and enter the API keys into your software. You should not need this integration guide unless you are actually building the software yourself.
If you are a software developer you will find technical details and code samples in the language-specific guides listed under "Programming language options" below. But we recommend that you read this page at some point as the higher-level concepts it describes should help you decide on the best approach for your integration. We have advised a lot of companies that have integrated with our API, and the same patterns recur again and again.
You can use any programming language to access the API with XML or JSON, but it is easier to start with one of the languages for which we have a client library or sample code:
Full client libraries and quick-start guides: Java, .NET, Python
Sample code to get you started: JavaScript, Node.js, Office/Excel, PHP, PowerShell, R, Ruby, VBA
If you're using another language, you can access the API directly with XML or JSON. Both the XML and JSON APIs let you do exactly the same things, both are served by the same backend system, and you can use either or both through the same API account. We suggest you favour JSON if your language makes it easy to turn objects into JSON and back, but it's mainly a question of which best suits your platform and experience.
XML and JSON are both robust options, but we do still recommend you favour the client libraries above if you are using Java (or a JVM language like Kotlin, Scala, or Groovy), .NET (e.g. C# or VB.NET), or Python, as they are high-performance, full-featured, dependency-free libraries that are easy to use and will have you fetching data in minutes.
The links above cover the technical details. The rest of this page gives a higher-level language-agnostic overview of the API and the common approaches to integrating with it:
The Degree Days.net API lets you get all the data you can get through the website, but automated, faster, and in much larger quantities.
You send a request to the API specifying what data you want, and you get a response back containing the data you specified.
A request is for one specific location. So, to fetch data for 1,000 locations, your software would make 1,000 requests.
A location can be a weather station ID or a geographic location: a postal/zip code or a longitude/latitude position. The API has a sophisticated system for figuring out which weather station can best represent a specific geographic location over any particular time period. It is more complicated than just picking the closest station. Data quality and data coverage (how much data history is available) varies from station to station, and the API takes all this into account when choosing which weather station to use for any particular request.
Each request takes a certain number of request units. A big request (like fetching 10 years' worth of data) takes more request units than a small request (like fetching data for just the last day, week, or month). API accounts have an hourly rate limit which determines how many request units they can handle in any given hour. When you sign up for an API account you should choose a plan that will accommodate the volume of data you intend to fetch from the API. The Pricing & Sign-Up page has an API plan selector that can help you choose.
A LocationDataRequest is the type of request you use to get degree-day data and/or hourly temperature data. You specify a location (as a weather station ID, zip/postal code, or longitude/latitude position), and the data you want for it (e.g. HDD, CDD, daily, weekly, monthly, average, what base temperatures, covering what period of time etc.), and the API will send back the data you requested in the response.
A LocationDataRequest always takes at least 1 request unit, but it can take many more if you fetch a lot of data in one request. Generalist weather-data APIs often only allow you to fetch data one day at a time, forcing you to make hundreds of thousands of requests to assemble a relatively small amount of data. But the Degree Days.net API will let you fetch many years' worth of data in a single request. You can also fetch multiple sets of data for a location in one request (e.g. HDD and CDD in multiple base temperatures, and hourly temperature data too). The only catch is that big LocationDataRequests take lots of request units. But it is always more efficient to fetch the data you need in as few requests as possible.
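To illustrate the idea of bundling multiple data sets into one request, here is a minimal Python sketch that assembles a single request specification as a plain dictionary. The field names below are illustrative only, not the API's real JSON schema (see the JSON docs for that); the point is simply that one LocationDataRequest can carry HDD in several base temperatures plus CDD, rather than needing one request per data set.

```python
# Illustrative only: these field names are NOT the real Degree Days.net
# JSON schema. The point is that one LocationDataRequest can bundle
# many data sets for a location.

def build_location_data_request(postal_code, country_code, hdd_bases_f, cdd_bases_f):
    """Sketch of a single request asking for HDD and CDD in several
    base temperatures (Fahrenheit) at once."""
    data_specs = {}
    for base in hdd_bases_f:
        data_specs["hdd_%g" % base] = {"calculation": "HDD", "baseTemperatureF": base}
    for base in cdd_bases_f:
        data_specs["cdd_%g" % base] = {"calculation": "CDD", "baseTemperatureF": base}
    return {
        "type": "LocationDataRequest",
        "location": {"postalCode": postal_code, "countryCode": country_code},
        "dataSpecs": data_specs,  # all fetched in ONE request
    }

req = build_location_data_request("10036", "US",
                                  hdd_bases_f=[60, 65, 70],
                                  cdd_bases_f=[65, 75])
print(len(req["dataSpecs"]))  # 5 data sets, but still just 1 request
```

A bigger request like this takes more request units than a minimal one, but fewer overall than five separate requests for the same data.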
A LocationInfoRequest is specified in almost exactly the same way as a LocationDataRequest. You specify the data you want and the location you want it for (typically a geographic location). But you don't get any degree-day data back in the response, just information about the location. Most importantly this will include the ID of the weather station that the API would use to generate the data you specified. So you can then use a LocationDataRequest to fetch the data you want directly from that weather station. Or, if you already have that data in your database, you can get it straight from there.
A LocationInfoRequest only ever takes 1 request unit. So you can use it to efficiently map geographic locations (e.g. a big list of zip/postal codes or longitude/latitude positions) to weather station IDs. This enables you to save time and request units when assembling data for thousands of building locations, many of which would map to the same weather stations.
A RegressionRequest instructs the API to run regression analysis on your energy data. With it you specify a set of energy-usage data and the location of the building it came from (or of a weather station nearby). The API will generate heating and cooling degree days for that location, in lots of base temperatures, and it will test thousands of regressions against your energy data to find the base temperature(s) that give the best statistical fit. You would typically then use the best regression model for further analysis (e.g. to calculate energy savings). We have separate docs on using the API for regression, and we discuss how to integrate regression into your data-fetching strategy a little further below.
The simplest integration option is often to fetch data from the API as and when it's needed. For example, consider:
For systems like these it often makes sense to fetch data from the API on demand. If you request data from the geographic location of the target building (by specifying a postal/zip code or longitude/latitude position), the API will automatically generate the data you requested using the weather station that's best placed to supply it (considering data quality and coverage as well as distance from the target location).
This on-demand approach is particularly likely to make sense if:
When the locations of interest remain fairly constant it can often make sense to build and maintain your own local database of degree days. This is a common pattern for:
Building and maintaining a database of degree days will typically involve:
The initial backfill is a one-off job, so it is unlikely to be a huge problem if it takes a while or is a little inefficient. You should consider the efficiency of the backfill for new locations as they are added over time, but it is unlikely to be a critical factor unless you are likely to add a lot. The most important thing is usually to ensure that regular updates can be performed efficiently.
With this approach you would fetch data by weather station ID (using a LocationDataRequest for each ID) for your initial backfill and for all regular updates.
This approach is ideal if it is weather station IDs that you ultimately want data for. But it is less ideal if your locations of interest are really geographic locations like postal/zip codes or longitude/latitude positions of real-world buildings. Rather than figuring out your own scheme for assigning a weather station ID to each geographic location, it usually makes sense to use the sophisticated system that the API provides specifically for this purpose. Good station selection is more complicated than it seems on the surface. Please see Approaches 2 and 3 below for more.
A simple approach to building and maintaining a degree-day database is to fetch data by geographic location (e.g. postal/zip code or longitude/latitude position) for both the backfill and for regular updates.
The API is very good at handling geographic locations – give it a postal/zip code or longitude/latitude position, and tell it what data you want, and it will choose the best weather station automatically. Data quality varies considerably (both between stations and over time), and some stations have been recording temperatures for longer than others, but the API takes all of this into account, choosing which weather station to use based on data quality and coverage as well as distance from the target location.
The main problem with this approach is that, although it is OK for a relatively small number of locations (e.g. up to 1,000 or so), it gets less and less efficient as the number of locations grows. The more locations you are dealing with, the more likely it is that many of them will share the same weather stations. So you will probably end up fetching the same data multiple times in your initial backfill and in each of your regular updates. This will slow your system down and you may need a higher-level API plan than you would need if you used a more efficient approach.
There can be a consistency issue too: the updates (recent data only) for a geographic location might sometimes use a different weather station to the one used for the initial backfill (longer data history). For example, you could backfill a location with a long data history going back 15 years, and the API might choose a weather station 9 miles away that could provide that long data history. But then for an update you might only be fetching a month's worth of data, so the API might instead choose a newer station that is only 2 miles away. In reality this issue doesn't occur all that often, as the API does tend to favour stations with a good data history, even for requests that only want a month's worth of data. You could also get around it by fetching updates by station ID (always returned with the degree-day data generated for a geographic location), but, if you were doing that, you might as well use proper two-stage data fetching instead (see Approach 3 below).
On the plus side, this simple approach is very easy to design and program. Each building location has its own set of data – this is simple even if there is duplication. You also don't need to worry about what happens if a weather station stops working – the API will just automatically choose another station to replace it. Despite its limitations, this is not a bad way to integrate with the API if you want to get something working quickly and you aren't dealing with thousands of locations.
The more locations you have, the more likely it is that multiple locations will share the same weather station. Fetching data (using LocationDataRequest) can take a lot of request units, so it is inefficient to inadvertently fetch the same data multiple times for nearby locations that share a weather station. If you are dealing with more than a thousand or so locations, or you know that many of your locations are close to each other, you should think about using two-stage data fetching for better scalability:
10,000 building locations will typically map to under 2,000 weather stations, or fewer if the buildings are concentrated in certain countries or regions. You only need to do this mapping once when you backfill (for which performance is unlikely to be critical), and with LocationInfoRequest it will only take one request unit for each building location. You will save request units in your backfill and in each of your updates, as you will only need to make a heavyweight LocationDataRequest for each of your weather stations (e.g. 2,000) rather than for each of your geographic locations (e.g. 10,000).
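The two-stage pattern can be sketched in a few lines of Python. The helper names below (get_station_id, fetch_station_data) are stubs standing in for a LocationInfoRequest and a LocationDataRequest respectively; the stub logic that maps nearby zip codes to the same station is invented purely for illustration.

```python
# Sketch of two-stage data fetching, with stubbed API calls.

def get_station_id(geo_location):
    # Stub for a LocationInfoRequest (1 request unit). Here we just
    # pretend that nearby zip codes share a weather station.
    return "STATION_" + geo_location[:3]

def fetch_station_data(station_id):
    # Stub for a heavyweight LocationDataRequest by station ID.
    return {"station": station_id, "hdd": []}

def backfill(geo_locations):
    # Stage 1: map every geographic location to a weather station ID.
    station_by_location = {loc: get_station_id(loc) for loc in geo_locations}
    # Stage 2: fetch data once per distinct station, not once per location.
    data_by_station = {sid: fetch_station_data(sid)
                       for sid in set(station_by_location.values())}
    return station_by_location, data_by_station

locations = ["10036", "10037", "90210", "90211", "60601"]
mapping, data = backfill(locations)
print(len(mapping), len(data))  # 5 locations, but only 3 station fetches
```

Store the location-to-station mapping alongside the data, so your regular updates only ever need to loop over the distinct station IDs.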
The more locations you have, the greater the advantage of this approach.
A few more tips:
You will find more technical info on LocationInfoRequest in the more technical programming-language-specific documentation on this website. But for now the main thing is just to understand the general approach of two-stage data fetching and determine whether it makes sense for your application.
Regression is at the core of most good energy data analysis, and the API offers advanced regression functionality that makes it easy to do it well.
You will likely want the API to use regression to help you determine the optimal base temperature(s) for each building (or energy meter) in your system. And, once you have determined the base temperature(s) of a building (or energy meter), you will likely want to fetch degree days in those base temperature(s) on an ongoing basis, for ongoing energy monitoring.
If you have multiple buildings that share the same weather station, those buildings are likely to have different base temperatures, so you will probably want to fetch degree days in multiple base temperatures for that weather station. As you add more buildings into your system, the weather stations and base temperatures you are using will naturally grow in number. There are two main ways to handle this:
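One way to handle the bookkeeping (a sketch of ours, not something the API prescribes) is to maintain a per-station set of base temperatures, extended each time a building is added, so that each update can fetch all of a station's base temperatures in a single request. The station IDs below are just example values.

```python
from collections import defaultdict

# Sketch: track which base temperatures each weather station needs,
# so each update fetches all of them in one LocationDataRequest.

base_temps_by_station = defaultdict(set)

def add_building(station_id, base_temp_f):
    base_temps_by_station[station_id].add(base_temp_f)

add_building("KNYC", 65)
add_building("KNYC", 60)  # second building, same station, different base
add_building("KLAX", 70)

for station, bases in sorted(base_temps_by_station.items()):
    # One request per station can carry all of these base temperatures.
    print(station, sorted(bases))
```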
We have designed the API with careful consideration of the edge cases and things that could go wrong, with the aim of shielding you from as much of the underlying complexity as possible. But there are still some particular things you might want to plan for if you are aiming to build a robust system:
At present it is not possible to fetch any data from before the year 2000. We calculate degree days accurately using detailed temperature readings taken throughout each day (typically hourly or more frequently than that), and continuous records from the last century are patchy and difficult to get hold of. This is the only real disadvantage of the accurate calculation method that we use.
Also, many of the stations in our system were set up more recently than 2000. And some stations have had measurement or reporting problems that have caused us to discard some of their earlier data. So the length of data history available varies from weather station to weather station.
If you specify a geographic location (a postal/zip code or longitude/latitude position) and leave the API to choose the weather station automatically, it is always best to specify the longest data history that you will ultimately want, as that may affect which weather station the API selects to satisfy your request. This is the case whether you are using LocationDataRequest to actually fetch the data, or LocationInfoRequest to just get the ID of the weather station that the API would use to satisfy your data specification.
If you request more data than the API can supply for your specified location (whether it's a weather station ID or a geographic location), the API will, by default, return what it can from within the range you requested. Except in the unlikely event of your geographic location having no active stations near it, recent data should always be available ("recent" meaning to within around 10 days of yesterday in the location's local time zone, and usually up to and including yesterday). And the API will never return data with gaps in it. But there are limits on how far back in time you can go, so you might find data missing from the start of your requested range. If you'd rather receive an error than a partial set of data, you can specify a minimum required range in your request.
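If you take the default (partial-data) behaviour, you may want a client-side check that the returned range reaches back far enough for your purposes. The sketch below assumes you can compare the first day of data actually returned against the earliest day you strictly need; alternatively, specify a minimum required range in the request itself and the API will return an error instead of partial data.

```python
import datetime

# Sketch of checking a possibly-partial response against the range
# you actually need.

def covers_minimum(returned_first_day, minimum_first_day):
    """True if the returned data reaches back at least as far as needed."""
    return returned_first_day <= minimum_first_day

# We asked for 15 years of history, the station could only supply ~11,
# but we only strictly need 5, so the partial response is acceptable.
ok = covers_minimum(datetime.date(2013, 6, 1), datetime.date(2019, 1, 1))
print(ok)  # True
```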
Although a weather station might be working well today, there is no guarantee that it will still be working well next month or next year. Unfortunately not even the best ICAO airport stations are completely exempt from reliability problems (though they are rare).
If you're only storing data from a handful of locations, it might not be worth worrying about the possibility of one of your stations going down. But, if you're storing data from hundreds or thousands of locations, it's likely that you'll run into station downtime at some point, so you might want your system to be prepared for it. We've designed our system to make it as easy as possible for you to handle station downtime in a robust manner.
Small patches of downtime are automatically filled with estimated data. But our system will only do this when it has temperature readings on both sides of the gap. So, if a station goes down for a while, its most recent data won't be available until it comes back up again. Bear this in mind if you're fetching updates at the start of each day/week/month: if a station went down towards the end of the last day/week/month, it will need to come back up again before our system can patch the gap with estimated data and supply a value for that last day/week/month.
If you need the latest data for a given location, but the station you have been using does not yet have it, you can put in a request for the missing data from the underlying geographic location of interest. Specify a minimum-required range that includes the latest data (taking care to ensure that the last required day has finished in the location's local time zone) and the API will hopefully find a stand-in station that can supply the data you need. If you store the stand-in data, you could replace it later if/when your original station recovers (if you think it's important to use a consistent source).
A long period of downtime will result in a station being labelled "inactive". This happens if a station doesn't report any usable temperature readings for around 10 days or more (10 being an approximate number that is subject to change).
If you try to request data from an inactive station, you'll get a LocationNotSupported failure (in .NET or Java this will appear as a LocationException, in Python a LocationError). This is an indication that you should find an alternative station to use as a replacement. Typically this would involve you making another request for data from the geographic location that you ultimately want the data for (e.g. the location of the target building) so the API can automatically choose a replacement station for you.
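That fallback can be sketched as below. The fetch_by_station and fetch_by_geo helpers are stubs standing in for LocationDataRequests by station ID and by geographic location, and the exception class is a stand-in for the failure your client library would raise.

```python
# Sketch of falling back to a geographic-location request when a
# stored station has become inactive. All names here are stubs.

class LocationNotSupportedError(Exception):
    pass

def fetch_by_station(station_id):
    if station_id == "DEAD01":  # pretend this station is inactive
        raise LocationNotSupportedError(station_id)
    return {"station": station_id}

def fetch_by_geo(geo_location):
    # The API would pick the best active replacement station automatically.
    return {"station": "NEW_STATION", "for": geo_location}

def fetch_with_fallback(station_id, geo_location):
    try:
        return fetch_by_station(station_id)
    except LocationNotSupportedError:
        # Station is inactive: re-request from the building's own location,
        # then store the replacement station ID for future updates.
        return fetch_by_geo(geo_location)

print(fetch_with_fallback("DEAD01", "10036")["station"])  # NEW_STATION
```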
There can occasionally be some volatility in the most recent data. Sometimes a station's automated reporting system goes down, then comes back up again, leaving a gap in the reported data... As explained above, our system will plug the gap with estimates, but occasionally the station will recover the missing data and report it several days later, enabling our system to calculate the degree days (or hourly temperature data) more accurately. Also, once our system has a couple of good days' worth of post-gap data, it may be able to do a better job of filling the gap with estimates than it could initially when it had only one or two post-gap temperature readings to work with.
This sort of volatility is unusual for the higher-quality weather stations that you are likely to be using (especially if you let our API select stations for you automatically), but you should still expect it to happen occasionally. Such volatility would generally only affect the latest 10 or so days of data, so it's easy to counteract by fetching a little more data than you need each time you update your database. For example, instead of fetching the latest day, fetch the latest 30 days; instead of fetching the latest week, fetch the latest 4 weeks; instead of fetching the latest month, fetch the latest 2 months. Overwrite any previously-stored values with the most-recently fetched values. The vast majority of the time it will make no difference (and when it does the difference will almost always be small), but this is a good approach to maintaining data quality that doesn't usually add much complexity.
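The overwrite step of that strategy is trivial if you key your stored values by date. A minimal sketch, assuming daily values keyed by ISO date strings:

```python
# Sketch of the overlap-update strategy: re-fetch the latest 30 or so
# daily values each day and overwrite anything previously stored, so
# late corrections to recent data are picked up automatically.

def apply_update(stored, fetched):
    """stored and fetched map ISO date strings to degree-day values."""
    stored.update(fetched)  # most recently fetched values win
    return stored

db = {"2024-05-01": 10.2, "2024-05-02": 8.7}
latest = {"2024-05-02": 9.1,   # revised value for a recent day
          "2024-05-03": 7.4}   # newly available day
apply_update(db, latest)
print(db["2024-05-02"])  # 9.1 (revised value overwrote the old one)
```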
To use the API you need API access keys. These exist so you can ensure that your API account is only used by the people and software systems that you've authorized to use it.
If you're making an internal system, or a public-facing web application that is hosted on servers you control, you will probably only need one API account and one set of access keys. You can securely embed them into your application without much risk of them escaping. That's the simple case.
If you are making installable software (a desktop, mobile, or server application that your users install themselves), it would be unwise to embed your API access keys into that application. If those access keys escape, then anyone who gets hold of them will be able to use up the data-generation capacity that you have reserved for your application. And you won't be able to replace the compromised access keys without redistributing a new version of your application.
One good option is to leave it to your customers to get their own API accounts and enter their access keys into their installations of your software to unlock the degree-day based functionality. Just point them to our sign-up page and tell them where to enter their API access keys once they've subscribed. We have designed our API plans specifically to support this model.
This is a particularly good approach if not all of your customers want the degree-day-based functionality – you can leave it as an optional feature that they can unlock if they want it, or ignore if they don't. It's also a good option if you expect some of your customers to make heavy use of the API (e.g. if you're making software for large multi-site organizations).
Another option is to build an intermediate server. The customer's installed application would request data from your intermediate server, which would fetch the data from our API (using access keys that are stored securely on your server only) and pass it back down to the customer's application. This approach is more complicated but it gives you tighter control over the licensing side of things.
Running things through an intermediate server makes sense if you're building an installable application for the consumer market, as, although our low-end API plans are attractively priced for businesses, they're not really priced for consumers. If you're selling a $2.99 iPhone app for residential energy tracking, your non-business customers are unlikely to want to pay for an API subscription from us.
Also, with a widely-used application you will probably find that many of your customers will share the same weather stations (as groups of them will be located in the same neighborhoods). By routing everything through your own servers you can cache data locally and use two-stage data fetching to minimize your API access by fetching data for each station only once, rather than once for each customer that uses that station. With a few thousand customer locations this sort of caching may be more development effort than it's worth for the limited data-reuse it would make possible at that scale, but with hundreds of thousands or millions of customer locations it could certainly be worthwhile, especially for consumer applications where keeping costs down is a priority.
If you're a programmer, we suggest you take a look at the Java quick-start guide, the .NET quick-start guide, the Python quick-start guide, the JavaScript sample code, the Node.js sample code, the Office/Excel sample code, the PHP sample code, the PowerShell sample code, the R sample code, the Ruby sample code, the VBA sample code, the JSON docs, or the XML docs. With the client libraries and sample code you can literally be fetching data from the API within the next few minutes. How best to integrate with the API will probably become a lot clearer once you're familiar with the code itself.
© 2008–2024 BizEE Software – About | Contact | Privacy | Free Website | API | Integration Guide | API FAQ | API Sign-Up